In this article I compare the learning curves of two different MLP topologies when modifying the hyperparameters η and α of the Gradient Descent algorithm used to learn the XOR function. This article follows another one (see Gradient Descent Algorithm: Impact of η and α on the learning curve of MLP for XOR)…
Gradient Descent Algorithm: Impact of η and α on the learning curve of MLP for XOR
In their famous article (see Learning Representations by Back-propagating Errors), Rumelhart, Hinton, and Williams popularized the backpropagation mechanism for modifying the weights of the links between neurons of different layers. They also added a momentum term (parameter α) to the classical learning rate (η) as a way to improve Gradient Descent…
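As a reminder, the momentum rule replaces the plain step Δw(t) = −η ∂E/∂w with Δw(t) = −η ∂E/∂w + α Δw(t−1), so part of the previous update is carried into the current one. Below is a minimal sketch of that update in Python; the function name, the default values of η and α, and the toy loss in the usage snippet are illustrative assumptions, not values taken from the article.

```python
def momentum_step(w, grad, delta_prev, eta=0.5, alpha=0.9):
    """One weight update with momentum:
    delta_w(t) = -eta * dE/dw + alpha * delta_w(t-1).

    The eta/alpha defaults are illustrative, not recommendations
    from Rumelhart, Hinton and Williams.
    """
    delta = -eta * grad + alpha * delta_prev
    return w + delta, delta

# Toy usage on E(w) = (w - 1)^2, whose gradient is 2 * (w - 1).
w, delta = 0.0, 0.0
for _ in range(5):
    grad = 2.0 * (w - 1.0)
    w, delta = momentum_step(w, grad, delta)
    print(w)
```

With α > 0 the step keeps moving in the previous direction even when the current gradient shrinks, which is exactly what lets momentum smooth oscillations but also overshoot the minimum, as the toy loop above shows.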
Capability of the MLP to learn XOR
In a series of posts, I will study the properties of the Multilayer Perceptron (MLP), starting with its capability to learn some mathematical functions (XOR, y = x², …). This subject was studied long ago by researchers, and George Cybenko demonstrated that any continuous function can be approximated by an MLP with a single hidden layer (see Cybenko, G. 1989. Approximation…
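To make the setup concrete, here is a minimal sketch of an MLP learning XOR with plain gradient descent. The 2-2-1 topology, learning rate, seed, and epoch count are illustrative assumptions, not the exact configuration used in these posts; note also that XOR with sigmoid units has well-known bad local minima, so some initializations may fail to converge.

```python
import numpy as np

# Minimal 2-2-1 MLP (2 inputs, 2 hidden sigmoid units, 1 sigmoid output)
# trained on XOR with plain gradient descent. All hyperparameters below
# are illustrative choices.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(2, 2)); b1 = np.zeros(2)   # input -> hidden
W2 = rng.normal(size=(2, 1)); b2 = np.zeros(1)   # hidden -> output

eta = 0.5  # learning rate
for epoch in range(20000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backpropagate the squared error E = 1/2 * sum((out - y)^2).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent updates.
    W2 -= eta * h.T @ d_out; b2 -= eta * d_out.sum(axis=0)
    W1 -= eta * X.T @ d_h;   b1 -= eta * d_h.sum(axis=0)

print(out.round(3))  # should approach [[0], [1], [1], [0]]
```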