Dataset Preparation
When using built-in models, the user must specify which data to use. This section describes how custom or public datasets can be loaded.
Public Datasets
Daily Temperatures
CIFAR10 dataset
- hyppo.datasets.cifar10.get_data(library=None, data_path='./data', **kwargs)
Load the CIFAR10 dataset. This image dataset consists of 60,000 32x32 colour images across 10 classes and can be used to test image classification models. The function loads the public dataset in the form expected by the library selected with the library argument.
- Parameters:
- Returns:
  data (dict): Training, validation and testing datasets.
Examples
>>> from hyppo.datasets.cifar10 import get_data
>>> get_data(library='pt')
{'dataset': 'cifar10',
 'train': <torch.utils.data.dataset.Subset at 0x11deb3090>,
 'valid': <torch.utils.data.dataset.Subset at 0x11deb3fd0>,
 'test': Dataset CIFAR10
     Number of datapoints: 10000
     Root location: ./data
     Split: Test
     StandardTransform
 Transform: Compose(
                ToTensor()
            )}
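The Subset objects in the output above suggest that the training and validation sets are both drawn from the 50,000-image CIFAR10 training split, while the 10,000-image test split is kept separate. The following is a minimal sketch of such an index split in plain Python. The function name split_indices, the 80/20 ratio and the fixed seed are assumptions made for illustration and are not taken from hyppo itself.

```python
import random


def split_indices(n_samples, valid_fraction=0.2, seed=0):
    """Shuffle range(n_samples) and split it into train and validation index lists.

    Hypothetical helper: the ratio and seed are illustrative defaults,
    not values used by hyppo.
    """
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_valid = int(n_samples * valid_fraction)
    # The first n_valid shuffled indices form the validation set, the rest the training set.
    return indices[n_valid:], indices[:n_valid]


# CIFAR10 ships 50,000 training images and 10,000 test images.
train_idx, valid_idx = split_indices(50_000)
print(len(train_idx), len(valid_idx))
```

Index lists like these are what a subset wrapper needs in order to expose two disjoint views of the same underlying training set.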
Warning
When using the CIFAR10 dataset for image classification, remember that the classification is done over 10 classes. When building the neural network, the output layer should therefore contain 10 neurons, one for each class. The cross-entropy loss is used for training.
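To make the warning concrete, the sketch below computes the cross-entropy loss for a single image from the 10 raw values (logits) of an output layer, using only the Python standard library. The names NUM_CLASSES and cross_entropy and the example logits are illustrative assumptions. This is not hyppo's implementation, and a real model would normally rely on the loss provided by its deep learning library.

```python
import math

NUM_CLASSES = 10  # CIFAR10 has 10 classes, so the output layer has 10 neurons


def cross_entropy(logits, target):
    """Cross-entropy loss for one sample, given raw output-layer values (logits)."""
    if len(logits) != NUM_CLASSES:
        raise ValueError(f"expected {NUM_CLASSES} outputs, got {len(logits)}")
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # Equivalent to -log(softmax(logits)[target]).
    return log_sum_exp - logits[target]


# Equal scores for every class: the model has no preference.
uniform = [0.0] * NUM_CLASSES
loss_uniform = cross_entropy(uniform, target=3)

# A higher score on the correct class lowers the loss.
confident = [0.0] * NUM_CLASSES
confident[3] = 5.0
loss_confident = cross_entropy(confident, target=3)

print(loss_uniform, loss_confident)
```

With equal scores for all 10 classes the loss equals ln(10), roughly 2.303, which is a useful sanity check for the initial loss of an untrained classifier. An output layer of any other size is rejected by the length check.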