Hyperparameter Set ¶
Available Hyperparameters ¶
Parameter | Type | Description | Default | Models / Applications
---|---|---|---|---
activation | str | Activation function after each hidden layer | relu | CNN / LSTM / MLP / RNN
batch | int | Number of samples in a single batch | 8 | CNN / LSTM / MLP / RNN
dropout | float | Dropout rate (between 0 and 1) | 0 | CNN / LSTM / MLP / RNN
epochs | int | Number of training epochs | 10 | CNN / LSTM / MLP / RNN
| int | Factor by which the number of channels is multiplied at each layer | 2 | CNN
fc_activation | str | Activation function for the post-convolution fully-connected layer | relu | CNN
fc_dropout | float | Dropout rate (between 0 and 1) for the post-convolution fully-connected layer | 0 | CNN
fc_nodes | int | Number of flattened nodes in the post-convolution fully-connected layer | 10 | CNN
filter | int | Number of filters per hidden layer | 2 | CNN
kernel | int | Kernel size for each hidden layer | 2 | CNN
lag | int | Time lag | 0 | Time-series forecasting
layers | int | Number of hidden layers | 1 | CNN / LSTM / MLP / RNN
loss | str | Loss function | mean_squared_error | CNN / LSTM / MLP / RNN
maxpool | int | 2D max pooling (applied only if it is not 0) | 0 | CNN
nodes | int | Number of nodes per hidden layer | 10 | LSTM / MLP / RNN
optimizer | str | Optimizer function | Adam | CNN / LSTM / MLP / RNN
padding | str | Padding | valid | CNN
recurrent_activation | str | Inner recurrent activation which updates the inner memory cell | relu | LSTM / RNN
recurrent_dropout | float | Inner recurrent dropout rate (between 0 and 1) | 0 | LSTM / RNN
stride | int | Strides of the convolution along the height and width | 1 | CNN
Note

Some hyperparameters such as activation, dropout, nodes, filter, kernel, maxpool, padding and recurrent_activation are currently layer-independent, meaning that the same value is applied to every layer. To ease the creation of the neural network, these hyperparameters are converted into a list that replicates the value once per hidden layer, as sketched below.
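The replication can be pictured with a short sketch; the helper below is illustrative only (its name and exact behavior are assumptions, not hyppo's actual internals):

# Illustrative sketch of the per-layer replication described in the note above.
# The function name and logic are assumptions, not hyppo's actual code.
LAYER_INDEPENDENT = ['activation', 'dropout', 'nodes', 'filter',
                     'kernel', 'maxpool', 'padding', 'recurrent_activation']

def replicate_per_layer(hpo_set):
    # Expand each scalar, layer-independent value into one entry per hidden layer.
    n_layers = hpo_set['layers']
    for name in LAYER_INDEPENDENT:
        value = hpo_set.get(name)
        if value is not None and not isinstance(value, list):
            hpo_set[name] = [value] * n_layers
    return hpo_set

>>> replicate_per_layer({'layers': 3, 'nodes': 20})['nodes']
[20, 20, 20]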
Categorical hyperparameters

The activation, fc_activation, loss, optimizer, padding and recurrent_activation hyperparameters are currently fixed parameters, but the user can specify a different value from the configuration file. In the sections below, we list the available options for the activation and loss functions as well as for the optimizers.
Categorical Hyperparameters ¶
Activation functions ¶
Algorithm | Parameter | PyTorch function | TensorFlow function
---|---|---|---
Exponential Linear Unit | elu | torch.nn.ELU | tf.keras.activations.elu
Exponential | exponential | n/a | tf.keras.activations.exponential
Hard sigmoid | hard_sigmoid | torch.nn.Hardsigmoid | tf.keras.activations.hard_sigmoid
Linear (pass-through) | linear | torch.nn.Identity | tf.keras.activations.linear
Rectified Linear Unit | relu | torch.nn.ReLU | tf.keras.activations.relu
Scaled Exponential Linear Unit | selu | torch.nn.SELU | tf.keras.activations.selu
Sigmoid | sigmoid | torch.nn.Sigmoid | tf.keras.activations.sigmoid
Softmax | softmax | torch.nn.Softmax | tf.keras.activations.softmax
Softplus | softplus | torch.nn.Softplus | tf.keras.activations.softplus
Softsign | softsign | torch.nn.Softsign | tf.keras.activations.softsign
Tanh | tanh | torch.nn.Tanh | tf.keras.activations.tanh
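For reference, the snippet below shows one way to resolve an activation parameter string in each framework; the explicit PyTorch mapping is partial and illustrative, not necessarily how hyppo performs the lookup internally:

import tensorflow as tf
import torch.nn as nn

act = 'relu'  # value of the activation hyperparameter
# Keras resolves activation identifiers directly by name:
tf_activation = tf.keras.activations.get(act)
# PyTorch needs an explicit name-to-module mapping (partial, for illustration):
torch_activations = {'relu': nn.ReLU, 'elu': nn.ELU,
                     'sigmoid': nn.Sigmoid, 'tanh': nn.Tanh}
torch_activation = torch_activations[act]()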
Loss functions ¶
Algorithm | Parameter | PyTorch function | TensorFlow function
---|---|---|---
Binary Cross Entropy* | binary_crossentropy | torch.nn.BCELoss | tf.keras.losses.BinaryCrossentropy
Categorical Cross Entropy | categorical_crossentropy | torch.nn.CrossEntropyLoss | tf.keras.losses.CategoricalCrossentropy
Categorical Hinge | categorical_hinge | n/a | tf.keras.losses.CategoricalHinge
Cosine Similarity | cosine_similarity | n/a | tf.keras.losses.CosineSimilarity
Hinge | hinge | n/a | tf.keras.losses.Hinge
Kullback-Leibler divergence | kullback_leibler_divergence | torch.nn.KLDivLoss | tf.keras.losses.KLDivergence
Logarithm of the hyperbolic cosine of the prediction error | log_cosh | n/a | tf.keras.losses.LogCosh
Mean Absolute Error | mean_absolute_error | torch.nn.L1Loss | tf.keras.losses.MeanAbsoluteError
Mean Absolute Percentage Error | mean_absolute_percentage_error | n/a | tf.keras.losses.MeanAbsolutePercentageError
Mean Squared Error | mean_squared_error | torch.nn.MSELoss | tf.keras.losses.MeanSquaredError
Mean Squared Logarithmic Error | mean_squared_logarithmic_error | n/a | tf.keras.losses.MeanSquaredLogarithmicError
Poisson Loss | poisson | n/a | tf.keras.losses.Poisson
Squared hinge | squared_hinge | n/a | tf.keras.losses.SquaredHinge
One-hot label representation

When doing image classification with TensorFlow and the output labels from the dataset are integers (as is the case with the CIFAR10 dataset), the Sparse Categorical Crossentropy loss function should be used instead of the standard Categorical Cross Entropy (see here for more information). To enforce this loss function, the loss parameter within the configuration file can be set to sparse_categorical_crossentropy (see this section on how to do it).
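The difference between the two losses can be verified directly with the standard TensorFlow API (independent of hyppo):

import numpy as np
import tensorflow as tf

y_true = np.array([0, 1, 2])              # integer class labels, as in CIFAR10
y_pred = np.array([[0.9, 0.05, 0.05],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6]])

# The sparse variant consumes the integer labels directly...
sparse = tf.keras.losses.SparseCategoricalCrossentropy()
print(sparse(y_true, y_pred).numpy())

# ...while the standard variant expects one-hot encoded labels.
dense = tf.keras.losses.CategoricalCrossentropy()
print(dense(tf.one_hot(y_true, depth=3), y_pred).numpy())

Both calls return the same value; only the label representation differs.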
Optimizers ¶
Algorithm | Parameter | PyTorch function | TensorFlow function
---|---|---|---
Adadelta | Adadelta | torch.optim.Adadelta | tf.keras.optimizers.Adadelta
Adagrad | Adagrad | torch.optim.Adagrad | tf.keras.optimizers.Adagrad
Adaptive Movement Estimation | Adam | torch.optim.Adam | tf.keras.optimizers.Adam
Adamax | Adamax | torch.optim.Adamax | tf.keras.optimizers.Adamax
Nesterov-accelerated Adaptive Moment Estimation | Nadam | n/a | tf.keras.optimizers.Nadam
Root Mean Square Propagation | RMSprop | torch.optim.RMSprop | tf.keras.optimizers.RMSprop
Stochastic Gradient Descent | SGD | torch.optim.SGD | tf.keras.optimizers.SGD
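As with the activation functions, the optimizer parameter string can be resolved in either framework; the lookups below are standard library calls, shown as a sketch rather than as hyppo's own code:

import torch
import tensorflow as tf

name = 'Adam'  # value of the optimizer hyperparameter
tf_optimizer = tf.keras.optimizers.get(name)       # Keras resolves by identifier
torch_optimizer_cls = getattr(torch.optim, name)   # e.g. torch.optim.Adam
# The PyTorch class is instantiated later with the model parameters, e.g.:
# torch_optimizer = torch_optimizer_cls(model.parameters(), lr=1e-3)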
Software Modules ¶
Save/Load into pickle file ¶
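A complete hyperparameter set, such as the dictionary returned by set_hyperparams below, can be saved and restored with Python's standard pickle module; this is a generic sketch rather than a hyppo-specific helper:

import pickle
from hyppo.hyperparams import set_hyperparams

hpo_set = set_hyperparams({'layers': 3, 'nodes': 20}, library='tf')

# Save the hyperparameter set to a pickle file (the filename is arbitrary):
with open('hyperparams.pkl', 'wb') as f:
    pickle.dump(hpo_set, f)

# Load it back later:
with open('hyperparams.pkl', 'rb') as f:
    hpo_set = pickle.load(f)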
Complete hyperparameter set ¶
- hyppo.hyperparams.set_hyperparams(random_set, library, **kwargs) [source] ¶
-
This function merges into a single dictionary the hyperparameters that will be evaluated with the ones that will be fixed. Pre-defined default values are used for the fixed hyperparameters. The hardcoded default values can be changed by the user through the configuration file using the default option.
- Parameters :
-
random_set (dict) – Random values for each hyperparameter to be evaluated
- Returns :
-
hpo_set (dict) – Complete hyperparameter set
Examples

>>> from hyppo.hyperparams import set_hyperparams
>>> set_hyperparams({'layers': 3, 'nodes': 20}, library='tf')
{'activation': ['relu', 'relu', 'relu'], 'batch': 8, 'dropout': [0, 0, 0],
 'epochs': 10, 'fc_activation': 'relu', 'fc_dropout': 0, 'fc_nodes': 10,
 'filter': [2, 2, 2], 'kernel': [2, 2, 2], 'lag': 0, 'layers': 3,
 'loss': 'mean_squared_error', 'loss_args': {}, 'maxpool': [0, 0, 0],
 'nodes': [20, 20, 20], 'opt_args': {}, 'optimizer': 'Adam',
 'padding': ['valid', 'valid', 'valid'],
 'recurrent_activation': ['relu', 'relu', 'relu'],
 'recurrent_dropout': 0, 'stride': 1}