Neural Networks

What is backpropagation? How does it work? Why do we need it?

Which optimization techniques for training neural nets do you know?

How do we use SGD (stochastic gradient descent) for training a neural net?

What’s the learning rate?

The learning rate is a hyperparameter that controls how much the weights change at each update during training. It can be seen as the step size of the parameter update, i.e. how far the weights are moved in the direction that reduces the loss (the negative gradient).
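As a rough illustration of the step-size idea, the sketch below applies a single gradient descent update to a small weight vector. The function name `gradient_step` and the weight and gradient values are made up for this example; in a real network the gradients would come from backpropagation.

```python
def gradient_step(w, grad, learning_rate):
    """Move each weight against its gradient, scaled by the learning rate."""
    return [wi - learning_rate * gi for wi, gi in zip(w, grad)]


# Hypothetical weights and gradients, chosen only for illustration.
weights = [1.0, -2.0]
grads = [0.5, -1.0]

# Same gradient, different learning rates: only the step length changes.
small_step = gradient_step(weights, grads, learning_rate=0.01)
large_step = gradient_step(weights, grads, learning_rate=1.0)

print(small_step)
print(large_step)
```

With the smaller learning rate the weights barely move, while with the larger one they move by the full gradient.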

What happens when the learning rate is too large? Too small?

A large learning rate can speed up training. However, the updates may overshoot the minimum of the loss function, so the loss oscillates around it or even diverges, and training ends at a worse solution. A small learning rate makes training slower, but the smaller steps can settle more precisely into a minimum. The downside is that the optimizer may make very little progress or get stuck in a poor local minimum or plateau, where the weights barely change even though a better solution exists.
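A toy experiment can make the trade-off concrete. The sketch below runs plain gradient descent on f(w) = w**2, whose minimum is at w = 0. The function `minimize_quadratic` and the three learning rates are assumptions picked for this example, not recommended values.

```python
def minimize_quadratic(learning_rate, steps=50, w0=1.0):
    """Run plain gradient descent on f(w) = w**2, whose gradient is 2*w."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w


too_small = minimize_quadratic(0.001)  # moves toward 0, but very slowly
moderate = minimize_quadratic(0.1)     # gets close to the minimum at 0
too_large = minimize_quadratic(1.1)    # every step overshoots; |w| grows

for name, w in [("too small", too_small), ("moderate", moderate), ("too large", too_large)]:
    print(f"{name}: final w = {w:.6g}")
```

After 50 steps the smallest rate is still far from the minimum, the moderate rate is very close to it, and the largest rate has diverged. Because this function is convex, the example shows slow convergence and overshooting but not the local-minimum case.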