In this article I compare the learning curves of two different MLP topologies when varying the hyperparameters η and α of the Gradient Descent algorithm used to learn the XOR function. This article follows an earlier one (see Gradient Descent Algorithm: Impact of η and α on the learning curve of MLP for XOR)…
Gradient Descent Algorithm: Impact of η and α on the learning curve of MLP for XOR
In their famous article (see Learning Representations by Back-propagating Errors), Rumelhart, Hinton, and Williams popularized the backpropagation mechanism for updating the weights of the connections between neurons in different layers. They also added a momentum term (parameter α, alpha) to the classical learning rate (η, eta) as a way to improve Gradient Descent…
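As a rough illustration of that update rule (this is a minimal sketch, not the article's own code), here is a 2-2-1 sigmoid MLP learning XOR with batch gradient descent plus momentum; the topology and the values η = 0.5 and α = 0.9 are assumptions chosen for the demo.

```python
import numpy as np

# Minimal sketch, NOT the article's code: a 2-2-1 sigmoid MLP trained
# on XOR with batch gradient descent plus momentum.
# eta (η) is the learning rate, alpha (α) the momentum term:
#   Δw(t) = -η ∂E/∂w + α Δw(t-1)

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

eta, alpha = 0.5, 0.9                      # assumed demo values
W1, b1 = rng.normal(0, 1, (2, 2)), np.zeros((1, 2))
W2, b2 = rng.normal(0, 1, (2, 1)), np.zeros((1, 1))
vW1, vb1 = np.zeros_like(W1), np.zeros_like(b1)
vW2, vb2 = np.zeros_like(W2), np.zeros_like(b2)

for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: MSE loss, sigmoid derivative s * (1 - s)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # momentum update per layer: v <- -η·grad + α·v, then w <- w + v
    for W, b, vW, vb, inp, delta in ((W2, b2, vW2, vb2, h, d_out),
                                     (W1, b1, vW1, vb1, X, d_h)):
        vW[...] = -eta * (inp.T @ delta) + alpha * vW
        vb[...] = -eta * delta.sum(axis=0, keepdims=True) + alpha * vb
        W += vW
        b += vb

# For most seeds this approaches [0, 1, 1, 0]; XOR training can
# occasionally stall in a local minimum.
print(np.round(out.ravel(), 2))
```

Setting α = 0 recovers plain gradient descent, which makes the effect of the momentum term easy to compare on the resulting learning curve.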