GAURAV KUMAR, Resource Person & Trainer: Conversion of Own Collected Dataset to Benchmark Dataset

Researchers in data science, machine learning, deep learning, and related fields work with many different types of datasets. Traditionally, results are expected to be reported on a benchmark dataset, so that they can be validated independently and the outputs are accepted.

Often, however, researchers collect their own datasets and then use them to implement their algorithms.

To turn a self-collected dataset into a benchmark dataset, the dataset should have the following properties:

The dataset should be focused on a specific type of machine learning task
The dataset should be open, with no restrictions on download by other researchers
The dataset should have sufficient features so that training, testing, and validation can be done
The dataset should be accessible to other researchers and practitioners so that they can validate the outcomes
The dataset should have labels that identify its attributes
The dataset should be clean, free of mismatched entries and missing values
The dataset should not be very large in size
The dataset should be properly documented, with details of its attributes
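
Some of the properties above can be checked automatically before a dataset is published. The following is only a minimal sketch in Python, assuming the dataset is a CSV file with a header row. The function name `check_benchmark_readiness`, the check names, and the `min_rows` and `max_rows` thresholds are hypothetical choices made for illustration, not established standards. Openness and task focus cannot be verified this way and still need a manual review.

```python
import csv
import io


def check_benchmark_readiness(csv_text, label_column, documented_columns,
                              min_rows=30, max_rows=100_000):
    """Run simple checks on a CSV dataset and return {check name: bool}.

    min_rows and max_rows are illustrative thresholds (assumptions),
    to be adjusted for the task at hand.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []

    # A cell counts as missing if it is absent or blank. Rows with extra
    # cells produce a non-string entry and are also flagged as a mismatch.
    no_missing = all(
        isinstance(value, str) and value.strip() != ""
        for row in rows
        for value in row.values()
    )

    return {
        "has_label_column": label_column in columns,
        "no_missing_values": no_missing,
        "enough_rows_for_split": len(rows) >= min_rows,
        "not_too_large": len(rows) <= max_rows,
        "all_columns_documented": set(columns) <= set(documented_columns),
    }


# Small made-up sample: the second record has no value for "weight".
sample = "height,weight,label\n1.70,65,A\n1.82,,B\n"
report = check_benchmark_readiness(
    sample,
    label_column="label",
    documented_columns={"height", "weight", "label"},
    min_rows=2,
)
for name, passed in report.items():
    print(name, "OK" if passed else "FAILED")
```

A failed check only points to something worth inspecting. For example, a dataset with missing values may still be usable once those records are cleaned or removed.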


Saturday, April 18, 2020
