Objective: Application of a Multi-Layer Perceptron to a Classification Problem
Theory: A multi-layer perceptron (MLP) is a class of feed-forward neural network. It consists of three types of layers: the input layer, the output layer, and one or more hidden layers, as shown in the figure below. The input layer receives the input signal to be processed; the output layer performs the required task, such as prediction or classification; and an arbitrary number of hidden layers placed between the input and output layers serve as the true computational engine of the MLP. As in any feed-forward network, data in an MLP flows in the forward direction from the input layer to the output layer. The neurons of an MLP are trained with the backpropagation learning algorithm. MLPs can approximate any continuous function and can solve problems that are not linearly separable. The major use cases of MLPs are pattern classification, recognition, prediction, and approximation.
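As a concrete illustration, here is a minimal sketch of applying an MLP to a classification problem, assuming scikit-learn is available; the Iris dataset and the hyperparameters below are illustrative choices, not prescribed by this text.

# A minimal sketch: MLP classification with scikit-learn (illustrative
# dataset and hyperparameters, not a definitive recipe).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# One hidden layer of 16 neurons; the weights are trained with backpropagation.
clf = MLPClassifier(hidden_layer_sizes=(16,), activation="tanh",
                    max_iter=2000, random_state=42)
clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))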
The computations taking place at every neuron in the output and hidden layers are as follows:
o(x) = G(b^(2) + W^(2) h(x)) … (1)
h(x) = Φ(x) = s(b^(1) + W^(1) x) … (2)
with bias vectors b^(1), b^(2); weight matrices W^(1), W^(2); and activation functions G and s. The set of parameters to learn is θ = {W^(1), b^(1), W^(2), b^(2)}. Typical choices for s include the hyperbolic tangent, tanh(a) = (e^a − e^(−a)) / (e^a + e^(−a)), and the logistic sigmoid, sigmoid(a) = 1 / (1 + e^(−a)).
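The following NumPy sketch instantiates equations (1) and (2) directly, taking s = tanh and G = the logistic sigmoid from the typical choices above; the layer sizes and random weights are purely illustrative.

import numpy as np

def sigmoid(a):
    # logistic sigmoid: 1 / (1 + e^(-a))
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, W1, b1, W2, b2):
    # Equation (2): hidden representation h(x) = s(b^(1) + W^(1) x), with s = tanh
    h = np.tanh(b1 + W1 @ x)
    # Equation (1): output o(x) = G(b^(2) + W^(2) h(x)), with G = sigmoid here
    return sigmoid(b2 + W2 @ h)

rng = np.random.default_rng(0)
x = rng.normal(size=3)                            # 3 input features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)     # 4 hidden neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)     # 1 output neuron
print(mlp_forward(x, W1, b1, W2, b2))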
Perceptron for Binary Classification
With this discrete output, controlled by the activation function, the perceptron can be used as a binary classification model that defines a linear decision boundary. It finds a separating hyperplane by minimizing the distance between misclassified points and the decision boundary.
To minimize this distance, the perceptron uses stochastic gradient descent as its optimization algorithm.
If the data is linearly separable, stochastic gradient descent is guaranteed to converge in a finite number of steps.
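The classic perceptron learning rule is one concrete instance of this stochastic update scheme: each misclassified point nudges the weights toward the correct side of the hyperplane. The sketch below is a minimal version, with hypothetical toy data and labels taken in {−1, +1}.

import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=100):
    # Perceptron learning rule: only misclassified points,
    # i.e. those with y * (w·x + b) <= 0, trigger a weight update.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):                 # labels yi in {-1, +1}
            if yi * (np.dot(w, xi) + b) <= 0:    # misclassified
                w += lr * yi * xi
                b += lr * yi
                errors += 1
        if errors == 0:                          # converged: data separated
            break
    return w, b

# Linearly separable toy data: the class is the sign of the first feature.
X = np.array([[2.0, 1.0], [1.5, -0.5], [-1.0, 0.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
print(w, b)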
The last piece that the perceptron needs is the activation function, the function that determines whether the neuron will fire or not. The original perceptron used the Heaviside step function, which maps any real input to either 0 or 1; the logistic sigmoid is a smooth alternative that maps any real input to a value between 0 and 1 and encodes a non-linear function. In either case, the neuron can receive negative numbers as input and still produce an output in the [0, 1] range.
A Multilayer Perceptron has input and output layers, plus one or more hidden layers with many neurons stacked together. And while in the perceptron the neuron must use an activation function that imposes a hard threshold, such as the step function, neurons in a Multilayer Perceptron can use any arbitrary activation function, such as sigmoid, tanh, or ReLU.
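To see this difference in practice, the sketch below (again assuming scikit-learn) fits an MLP to XOR, the standard example of a problem that is not linearly separable and that a single perceptron therefore cannot solve; the hidden-layer size and seed are illustrative, and convergence can depend on the random initialization.

import numpy as np
from sklearn.neural_network import MLPClassifier

# XOR: the classic problem a single perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# One small hidden layer is enough to carve out a non-linear boundary;
# a different seed may be needed depending on the initialization.
clf = MLPClassifier(hidden_layer_sizes=(8,), activation="tanh",
                    solver="lbfgs", max_iter=1000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))   # expected: [0 1 1 0]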
Conclusion
The perceptron is a neural network with only one neuron, and it can model only linear relationships between the input and output data provided.
With the Multilayer Perceptron, however, the horizons expand: the network can have many layers of neurons and thus learn non-linear relationships as well.