Hyperparameters are settings of a model that affect its performance but, unlike the parameters (weights and biases), are not learned from the data; the only way to change them is manually, by the user. The most common ones are listed below, and the short code sketches after the list show where they appear in practice.
Number of nodes: the number of neurons (units) in each layer.
Batch normalization: normalization/standardization of the inputs to a layer, computed over each mini-batch.
Learning rate: the rate (step size) at which the weights are updated during training.
Dropout rate: the percentage of nodes randomly and temporarily dropped during the forward pass at training time.
Kernel: the filter matrix with which the image array is convolved, i.e. the matrix used in the sliding dot product over image patches.
Activation function: defines how the weighted sum of inputs is transformed into the output of a node or layer (e.g. tanh, sigmoid, softmax, ReLU).
Number of epochs: the number of complete passes the algorithm makes over the training dataset.
Batch size: the number of samples passed through the algorithm at a time. E.g. if the dataset has 1,000 records and we set a batch size of 100, the dataset is divided into 10 batches, which are propagated through the network one after another.
Momentum: Momentum can be seen as a learning rate adaptation technique that adds a fraction of the past update vector to the current update vector. This helps damp oscillations and speeds up progress towards the minimum (see the update-rule sketch after this list).
Optimizers: algorithms that decide how the weights are updated; much of what they do comes down to getting the learning rate right.
Adagrad optimizer: Adagrad adapts the learning rate per parameter, using a larger learning rate for infrequent features and a smaller learning rate for frequent features.
Other optimizers, like Adadelta, RMSProp, and Adam, make further refinements to how the learning rate and momentum are adapted on the way to the optimal weights and biases. Getting the learning rate right is thus key to well-trained models.
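To make the list above concrete, here is a minimal sketch, assuming TensorFlow/Keras, of where the number of nodes, activation function, batch normalization, dropout rate, kernel, learning rate, momentum, batch size, and number of epochs appear when a small image classifier is defined and trained. All of the specific values (layer sizes, 3x3 kernel, 0.25 dropout, 0.01 learning rate, batch size 100, 5 epochs) are illustrative choices, not recommendations.

```python
# A minimal sketch, assuming TensorFlow/Keras; all hyperparameter values are illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(filters=16, kernel_size=(3, 3),       # kernel: 3x3 filter matrix
                  activation="relu"),                    # activation function
    layers.Flatten(),
    layers.Dense(64),                                    # number of nodes in this layer
    layers.BatchNormalization(),                         # batch normalization
    layers.Activation("relu"),
    layers.Dropout(0.25),                                # dropout rate: 25% of nodes dropped
    layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.SGD(learning_rate=0.01,  # learning rate
                                   momentum=0.9),        # momentum
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Dummy data just to make the sketch runnable.
x_train = np.random.rand(1000, 28, 28, 1).astype("float32")
y_train = np.random.randint(0, 10, size=(1000,))

# 1,000 samples / batch size 100 = 10 batches per epoch, repeated for 5 epochs.
model.fit(x_train, y_train, batch_size=100, epochs=5)
```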
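The momentum and Adagrad descriptions translate directly into update rules. The NumPy sketch below uses a hypothetical quadratic loss and assumed hyperparameter values to show both: momentum adds a fraction of the past update vector to the current one, while Adagrad divides the learning rate by the accumulated squared gradients, so rarely-updated parameters keep a relatively large effective learning rate.

```python
# A minimal NumPy sketch of the momentum and Adagrad update rules, using a
# hypothetical quadratic loss f(w) = 0.5 * w^T A w whose gradient is A @ w.
import numpy as np

A = np.diag([10.0, 1.0])          # toy loss curvature (illustrative only)
grad = lambda w: A @ w            # gradient of the toy loss

# --- SGD with momentum ---
w = np.array([1.0, 1.0])
velocity = np.zeros_like(w)
lr, beta = 0.05, 0.9              # learning rate and momentum factor (assumed values)
for _ in range(100):
    velocity = beta * velocity + lr * grad(w)    # keep a fraction of the past update
    w = w - velocity
print("momentum:", w)

# --- Adagrad ---
w = np.array([1.0, 1.0])
accum = np.zeros_like(w)          # running sum of squared gradients
lr, eps = 0.5, 1e-8
for _ in range(100):
    g = grad(w)
    accum += g ** 2
    w = w - lr * g / (np.sqrt(accum) + eps)      # per-parameter effective learning rate
print("adagrad:", w)
```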
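Because the choice of optimizer is itself a hyperparameter, trying the alternatives mentioned above is usually a one-line change in Keras. The learning rates below are default or illustrative values, not tuned recommendations.

```python
# Swapping optimizers in Keras; learning rates shown are defaults/illustrative values.
from tensorflow import keras

optimizers_to_try = {
    "sgd_momentum": keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    "adagrad":      keras.optimizers.Adagrad(learning_rate=0.01),
    "adadelta":     keras.optimizers.Adadelta(learning_rate=1.0),
    "rmsprop":      keras.optimizers.RMSprop(learning_rate=0.001),
    "adam":         keras.optimizers.Adam(learning_rate=0.001),
}

# Pick one and compile the model with it, e.g.:
# model.compile(optimizer=optimizers_to_try["adam"], loss="sparse_categorical_crossentropy")
```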