Comparing ML libraries

Building elements

The following table summarizes all the functions used in this package. A detailed description of how to convert each function from PyTorch to TensorFlow can be found further below. For a consistent comparison across all modules, we use an input dataset with a batch size of 20, 3 channels, and a height and width of 32 pixels.

Name                        PyTorch                      TensorFlow
--------------------------  ---------------------------  ---------------------------------------------
Fully-Connected Layer       torch.nn.Linear              tf.keras.layers.Dense
2D Convolutional Layer      torch.nn.Conv2d              tf.keras.layers.Conv2D
Categorical Cross-Entropy   torch.nn.CrossEntropyLoss    tf.keras.losses.CategoricalCrossentropy
                                                         tf.keras.losses.SparseCategoricalCrossentropy

Attention

Remember, the input data shape in PyTorch is different from the input data shape in TensorFlow. In PyTorch, the order of the dimensions is [batch_size, channels, height, width], while in TensorFlow the channels dimension is placed at the end, making the order [batch_size, height, width, channels]. A tensor can be converted from one layout to the other with a simple dimension permutation, as sketched below.
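As a minimal illustration (reusing the example dimensions from above), torch.Tensor.permute reorders a PyTorch tensor into the TensorFlow layout; an analogous tf.transpose call performs the conversion on the TensorFlow side:

>>> import torch
>>> nchw = torch.randn(20, 3, 32, 32)   # PyTorch layout: [batch, channels, height, width]
>>> nhwc = nchw.permute(0, 2, 3, 1)     # TensorFlow layout: [batch, height, width, channels]
>>> print(nhwc.shape)
torch.Size([20, 32, 32, 3])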

Fully-Connected Layer

A key building block of neural networks, the fully-connected layer connects every neuron of a given layer to each output node of the next layer. Below we show how to create and run such a layer in both libraries; note that the 4-dimensional example input is first flattened into a matrix of shape [batch_size, features]:

In PyTorch

>>> import torch
>>> data = torch.randn(20, 3, 32, 32).reshape(20, -1)
>>> model = torch.nn.Linear(in_features=3*32*32, out_features=100)
>>> print('Input shape:',data.shape,'\nOutput shape:',model(data).shape)
Input shape: torch.Size([20, 3072])
Output shape: torch.Size([20, 100])

In TensorFlow

>>> import tensorflow as tf
>>> data = tf.reshape(tf.random.normal((20, 32, 32, 3)), (20, -1))
>>> model = tf.keras.layers.Dense(units=100, activation='relu', input_shape=data.shape[1:])
>>> print('Input shape:',data.shape,'\nOutput shape:',model(data).shape)
Input shape: (20, 3072)
Output shape: (20, 100)

Tip

In PyTorch, the torch.nn.Linear class always requires the size of each input sample to be specified. However, when such a layer is included in a sequence (using the torch.nn.Sequential class), this size is often not known in advance. A good workaround is to use torch.nn.LazyLinear, which only requires the output size (that is, the size of the layer itself) to be defined; the input sample size is inferred automatically from the first batch of data, as sketched below.
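For illustration, here is a minimal sketch reusing the example input from above; the layer sizes are arbitrary choices for this example, not values prescribed by the package:

>>> import torch
>>> data = torch.randn(20, 3, 32, 32).reshape(20, -1)
>>> model = torch.nn.Sequential(
...     torch.nn.LazyLinear(out_features=100),  # in_features inferred on first call
...     torch.nn.ReLU(),
...     torch.nn.LazyLinear(out_features=10),
... )
>>> print(model(data).shape)
torch.Size([20, 10])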

2D Convolutional Layer

When building Convolutional Neural Networks (CNNs), the 2D convolutional layer is used to reduce the spatial dimensions of the input data samples and, with them, the total number of parameters trained in the neural network. Below we show how the module can be used in each library. In the following code snippets, we feed the example input dataset to a convolutional layer with a kernel size of 3 and a stride of 2 in both the height and width dimensions:

In PyTorch

>>> import torch
>>> data = torch.randn(20, 3, 32, 32)
>>> model = torch.nn.Conv2d(in_channels=3, out_channels=6, kernel_size=(3,3), stride=(2,2))
>>> print(model(data).shape)
torch.Size([20, 6, 15, 15])

In TensorFlow

>>> import tensorflow as tf
>>> data = tf.random.normal((20, 32, 32, 3))
>>> model = tf.keras.layers.Conv2D(filters=6, kernel_size=(3,3), activation='relu', strides=(2,2), input_shape=data.shape[1:])
>>> print(model(data).shape)
(20, 15, 15, 6)
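The spatial output size in both snippets follows the standard convolution arithmetic, output = floor((input - kernel) / stride) + 1, since neither layer applies padding by default:

>>> (32 - 3) // 2 + 1   # floor((input - kernel) / stride) + 1
15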

Categorical Cross-Entropy

When performing image classification, it is common to use the Categorical Cross-Entropy loss function to measure how well the model's predicted class scores match the true labels. Note that both implementations below operate on raw, unnormalized scores (logits): torch.nn.CrossEntropyLoss applies log-softmax internally, and the TensorFlow loss is constructed with from_logits=True.

In PyTorch

>>> import torch
>>> y_true = torch.tensor([1, 2])
>>> y_pred = torch.tensor([[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
>>> loss = torch.nn.CrossEntropyLoss()
>>> loss(y_pred, y_true).item()
0.9868950843811035

In TensorFlow

>>> import tensorflow as tf
>>> y_true = [1, 2]
>>> y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
>>> loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
>>> loss(y_true, y_pred).numpy()
0.9868951

Danger

Beware: the order of the input arguments is reversed between PyTorch and TensorFlow. The PyTorch function should be fed the predicted array first and then the true labels, while the TensorFlow method should be fed the true labels first and then the predicted array.
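The summary table lists two TensorFlow losses: the sparse variant used above takes integer class labels, while tf.keras.losses.CategoricalCrossentropy expects one-hot encoded labels. As a minimal sketch, one-hot encoding the same labels yields the same loss value:

>>> import tensorflow as tf
>>> y_true = [[0., 1., 0.], [0., 0., 1.]]   # one-hot version of [1, 2]
>>> y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
>>> loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
>>> loss(y_true, y_pred).numpy()
0.9868951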

Training & Inference

PyTorch

class hyppo.dnnmodels.pytorch.PyTorchTrainer(data=None, hyperprms=None, dl_type=None, debug=False, output='out001', log_dir='logs', device=device(type='cpu'), device_ids=[], split='data', mcd=False, **kwargs)

Methods Summary

__init__([data, hyperprms, dl_type, debug, ...])
    PyTorch training method.

train([itrial, trial, accuracy, validate, ...])

evaluate([accuracy, test, update, mcd])

get_accuracy(output, label)

Methods Documentation

__init__(data=None, hyperprms=None, dl_type=None, debug=False, output='out001', log_dir='logs', device=device(type='cpu'), device_ids=[], split='data', mcd=False, **kwargs)

PyTorch training method. This will train built-in models available from HYPPO. The built-in model of your choice can be specified using the dl_type variable.

Parameters:

    data : dict
        Input data sets.
    hyperprms : dict
        Input set of hyperparameter values.
    dl_type : str
        Type of deep learning architecture to use.
    debug : bool
        Flag to estimate the full training processing time.
    device : torch.device
        Processor type to be used (CPU or GPU).
    device_ids : list
        Rank of the current processor (used for data splitting).
    output : str
        Output directory name in which to save results.
    log_dir : str
        Relative path where log files are stored.
    split : str
        What will be split across available resources.

Returns:

    The following outputs are stored and returned in the form of a dictionary:

    loss : float
        Final training loss.
    models : str
        Path to the saved trained model.
    data : dict
        Input data.
    criterion : torch.nn.modules.loss
        Loss function used for training.
    hyperprms : dict
        Dictionary of hyperparameter values.

train(itrial=0, trial=1, accuracy=False, validate=True, rank=0, split='data', debug=False, hyperprms=None, **kwargs)

Parameters:

    itrial : int
        Index of the current trial.
    trial : int
        Number of independent trials to run.
    accuracy : bool
        Whether to compute an accuracy estimate, e.g., for classification.
    validate : bool
        Use the validation dataset to evaluate the model at the end of each epoch.
    rank : int
        Current processor rank.

evaluate(accuracy=False, test=False, update=False, mcd=False, **kwargs)

get_accuracy(output, label)
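Based solely on the signatures documented above, a hypothetical usage sketch might look as follows; the data and hyperprms dictionaries and the 'cnn' architecture name are illustrative placeholders, not confirmed API values:

>>> from hyppo.dnnmodels.pytorch import PyTorchTrainer
>>> trainer = PyTorchTrainer(data=data, hyperprms=hyperprms,  # placeholder inputs
...                          dl_type='cnn', output='out001')  # 'cnn' is a hypothetical value
>>> trainer.train(trial=1, validate=True)
>>> trainer.evaluate(accuracy=True, test=True)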

TensorFlow

hyppo.dnnmodels.tensorflow.train(itrial=0, data=None, hyperprms=None, dl_type=None, output=None, debug=False, trial=1, validate=True, accuracy=False, device=None, device_ids=None, log_dir='logs', split='data', ntasks=1, rank=0, **kwargs)

TensorFlow training method.

hyppo.dnnmodels.tensorflow.evaluate(data, model, loss_function, update=False, **kwargs)
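As with the PyTorch trainer, a hypothetical call sketch based only on the signature above (all input values are placeholders for illustration):

>>> import hyppo.dnnmodels.tensorflow as tf_backend
>>> results = tf_backend.train(data=data, hyperprms=hyperprms,  # placeholder inputs
...                            dl_type='cnn', trial=1)          # 'cnn' is a hypothetical value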