What is a neural network?
It can be defined as a system of hardware and/or software patterned after the operation of neurons in the human brain. Such a system is also called an artificial neural network (ANN).
Commercial applications of these technologies generally focus on solving complex signal processing or pattern recognition problems. Examples of significant commercial applications since 2000 include handwriting recognition for check processing, speech-to-text transcription, oil-exploration data analysis, weather prediction, and facial recognition.
The history of artificial neural networks goes back to the early days of computing. In 1943, Warren McCulloch and Walter Pitts proposed a model of artificial neurons intended to approximate the functioning of the human brain, showing that networks of such simple units could carry out basic logical computations.
It wasn’t until around 2010 that research picked up again. The big data trend, in which companies amass vast troves of data, together with parallel computing, gave data scientists the training data and computing resources needed to run complex artificial neural networks. In 2012, a neural network decisively won the ImageNet image recognition competition. Since then, interest in artificial neural networks has soared and the technology continues to improve.
Artificial neural network
ANN stands for Artificial Neural Network: a computational model based on the structure and functions of biological neural networks. The structure of an ANN is shaped by the flow of information through it, and the network changes — learns — based on its inputs and outputs.
How does ANN work?
Artificial Neural Networks can best be viewed as weighted directed graphs, where the nodes are the artificial neurons and the connections between neuron outputs and neuron inputs are directed edges with weights. The Artificial Neural Network receives an input signal from the external world in the form of a pattern or image, represented as a vector. These inputs are then mathematically designated by the notation x(n) for each of the n inputs.
Each input is then multiplied by its corresponding weight (weights are the values the artificial neural network uses to solve a given problem). In general terms, these weights represent the strength of the interconnections between neurons inside the network. All the weighted inputs are summed inside the computing unit.
If the weighted sum equals zero, a bias is added to make the output non-zero, or otherwise to scale up the system’s response. The bias has its own weight, and its input is always equal to 1. The sum of weighted inputs can range from 0 to positive infinity, so to keep the response within the desired limits, a threshold value is set, and the sum of weighted inputs is then passed through an activation function.
The activation function is, in general, a transfer function used to produce the desired output. Activation functions come in many flavors, but they are mainly either linear or non-linear. Some of the most commonly used activation functions are the binary step, sigmoid, and hyperbolic tangent (tanh) functions.
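The computation described above — weighted inputs summed with a bias, then passed through an activation function — can be sketched in a few lines of Python. The input values, weights, and bias below are made-up illustrative numbers, not values from any particular network.

```python
import math

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of inputs plus bias, passed through an activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# The three activation functions mentioned above
binary_step = lambda z: 1 if z >= 0 else 0
sigmoid = lambda z: 1 / (1 + math.exp(-z))
tanh = math.tanh

x = [0.5, -1.0, 2.0]   # input vector x(n), arbitrary example values
w = [0.4, 0.3, 0.6]    # connection weights
b = 0.1                # bias (a weight on a constant input of 1)

print(neuron_output(x, w, b, binary_step))          # weighted sum is 1.2, so: 1
print(round(neuron_output(x, w, b, sigmoid), 3))
```

Each activation function squashes the same weighted sum differently: the step function gives a hard 0/1 decision, while sigmoid and tanh give smooth, bounded outputs.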
Types of ANN
1) Perceptron
The perceptron model, proposed by Frank Rosenblatt (and later analyzed by Minsky and Papert), is one of the simplest and oldest models of a neuron. This smallest unit of a neural network performs certain computations to detect features in the input data. It accepts weighted inputs and applies an activation function to obtain the final output. The perceptron is also known as a threshold logic unit (TLU).
The perceptron is a supervised learning algorithm that classifies data into two categories; it is a binary classifier. A perceptron separates the input space into two categories by a hyperplane, represented by the equation w · x + b = 0, where w is the weight vector, x is the input vector, and b is the bias.
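A minimal sketch of perceptron training, assuming the classic update rule (adjust weights only on misclassified examples). The toy dataset, learning rate, and epoch count are illustrative choices, not part of any standard.

```python
def perceptron_train(samples, labels, lr=0.1, epochs=20):
    """Learn weights w and bias b so that sign(w . x + b) separates two classes.
    Labels are +1 / -1; the decision boundary is the hyperplane w . x + b = 0."""
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
            if pred != y:  # update only on a misclassification
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Linearly separable toy data: class +1 sits up and to the right of class -1
X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
y = [-1, -1, 1, 1]
w, b = perceptron_train(X, y)
print(w, b)
```

Because the data is linearly separable, the algorithm converges to a hyperplane that classifies every training point correctly.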
2) Feed Forward Neural Networks
The simplest form of neural network, in which input data travels in one direction only, passing through artificial neural nodes and exiting through output nodes. Input and output layers are always present, while hidden layers may or may not be. Based on this, feed-forward networks can be further classified as single-layered or multi-layered.
The number of layers depends on the complexity of the function. The network has uni-directional forward propagation and no backward propagation, and the weights are static. The activation function is fed inputs that have been multiplied by weights; a classifying (step) activation function is typically used. For example, the neuron is activated and outputs 1 if the weighted sum is above the threshold (usually 0), and is not activated, outputting -1, if it is below the threshold. Feed-forward networks are fairly simple to maintain and are equipped to deal with data that contains a lot of noise.
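The forward pass with the +1/-1 step activation described above can be sketched as follows. The weights and inputs are arbitrary example values.

```python
def step_activation(z, threshold=0.0):
    """Classifying step function from the text: fire (+1) at or above the
    threshold, otherwise output -1."""
    return 1 if z >= threshold else -1

def feed_forward(x, weights, bias):
    """A single forward pass: weighted inputs are summed, then thresholded.
    There is no backward propagation; the weights stay static."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return step_activation(z)

print(feed_forward([1.0, 2.0], [0.5, -0.1], 0.0))   # 0.5 - 0.2 = 0.3 -> +1
print(feed_forward([1.0, 2.0], [-0.5, 0.1], 0.0))   # -0.5 + 0.2 = -0.3 -> -1
```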
3) Multilayer Perceptron
An entry point to more complex neural nets, in which input data travels through several layers of artificial neurons. Every node is connected to all neurons in the next layer, making it a fully connected neural network. Input and output layers are present along with multiple hidden layers, i.e. at least three layers in total. It has bi-directional propagation, i.e. forward propagation and backpropagation.
Inputs are multiplied by weights and fed to the activation function; during backpropagation, the weights are modified to reduce the loss. In simple terms, weights are machine-learned values: they self-adjust based on the difference between predicted outputs and training targets. Nonlinear activation functions are used in the hidden layers, followed by softmax as the output layer activation function.
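The forward pass of such an MLP — fully connected layers with a nonlinear hidden activation and a softmax output — might be sketched like this. The 2-3-2 layer sizes and all weight values are hypothetical, chosen purely for illustration.

```python
import math

def softmax(zs):
    """Normalize raw scores into probabilities that sum to 1."""
    exps = [math.exp(z - max(zs)) for z in zs]  # shift by max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, W, b, act=lambda z: z):
    """One fully connected layer: every output unit sees every input."""
    return [act(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

# Hypothetical 2-3-2 MLP: 2 inputs, 3 tanh hidden units, 2-class softmax output
x  = [0.5, -0.2]
W1 = [[0.1, 0.4], [-0.3, 0.2], [0.5, -0.1]]; b1 = [0.0, 0.1, -0.1]
W2 = [[0.2, -0.4, 0.3], [-0.1, 0.5, 0.2]];  b2 = [0.05, -0.05]

hidden = dense(x, W1, b1, act=math.tanh)   # nonlinear hidden activation
probs = softmax(dense(hidden, W2, b2))     # softmax output layer
print(probs)  # two class probabilities summing to 1
```

Training would add a backward pass that nudges W1, W2, b1, and b2 to reduce the loss; only the forward direction is shown here.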
4) Convolutional Neural Network
A convolutional neural network contains a three-dimensional arrangement of neurons instead of the standard two-dimensional array. The first layer is called a convolutional layer; each neuron in it only processes information from a small part of the visual field. Input features are taken batch-wise, like a filter. The network understands the image in parts and can compute these operations many times to process the full image. Processing often involves converting the image from the RGB or HSI scale to grey-scale; further changes in pixel values help detect edges, after which images can be classified into different categories.
Propagation is uni-directional where the CNN contains one or more convolutional layers followed by pooling, and bi-directional where the output of the convolutional layers goes to a fully connected neural network for classifying the images. Filters are used to extract certain parts of the image. As in an MLP, the inputs are multiplied by weights and fed to an activation function; convolutional layers typically use ReLU, while the fully connected part uses a nonlinear activation function followed by softmax. Convolutional neural networks show very effective results in image and video recognition, semantic parsing, and paraphrase detection.
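The core convolution operation — sliding a small filter over the image to pick out local features such as edges — can be sketched in plain Python. The tiny grey-scale image and vertical-edge kernel below are made-up examples.

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution: slide the kernel over the image, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

relu = lambda z: max(0, z)  # the activation typically used after convolution

# Grey-scale image with a vertical edge, and a simple vertical-edge filter
img = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9]]
edge_kernel = [[-1, 1],
               [-1, 1]]
fmap = [[relu(v) for v in row] for row in conv2d(img, edge_kernel)]
print(fmap)  # [[0, 18, 0], [0, 18, 0]] - large responses exactly where the edge sits
```

Each neuron in the feature map only "sees" a 2x2 patch of the image, which is the small-receptive-field idea described above.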
5) Radial Basis Function Neural Networks
A Radial Basis Function network consists of an input vector, followed by a layer of RBF neurons, and an output layer with one node per category. Classification is performed by measuring the input’s similarity to data points from the training set: each RBF neuron stores a prototype, which is one of the examples from the training set.
When a new input vector (the n-dimensional vector to be classified) arrives, each neuron computes the Euclidean distance between the input and its prototype. For example, if we have two classes, class A and class B, a new input is assigned to class A when it is closer to the class A prototypes than to the class B prototypes.
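A minimal sketch of this idea, assuming a Gaussian RBF (a common choice) whose activation decays with Euclidean distance from the stored prototype. The prototype coordinates and the `beta` width parameter are invented for illustration.

```python
import math

def rbf(x, prototype, beta=1.0):
    """Gaussian RBF neuron: activation falls off with squared Euclidean
    distance between the input and the stored prototype."""
    dist2 = sum((xi - pi) ** 2 for xi, pi in zip(x, prototype))
    return math.exp(-beta * dist2)

def classify(x, prototypes_by_class):
    """Score each class by its strongest-responding prototype neuron,
    then pick the class with the highest score."""
    scores = {c: max(rbf(x, p) for p in protos)
              for c, protos in prototypes_by_class.items()}
    return max(scores, key=scores.get)

# Hypothetical prototypes stored by the RBF neurons (training examples)
prototypes = {"A": [(0.0, 0.0), (0.1, 0.2)],
              "B": [(3.0, 3.0), (2.8, 3.1)]}
print(classify((0.3, 0.1), prototypes))  # "A": closest to class A prototypes
```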
6) Recurrent Neural Networks
A Recurrent Neural Network is designed to save the output of a layer and feed it back to the input, helping to predict the layer’s outcome. The first layer is typically a feed-forward layer, followed by a recurrent layer in which a memory function remembers some of the information it had in the previous time-step. Forward propagation is carried out at each step, and the network stores information required for future use. If the prediction is wrong, the learning rate is used to make small weight changes during backpropagation, so that the network gradually moves toward making the right prediction.
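The recurrence itself — feeding the previous hidden state back in alongside the current input — can be sketched with a toy one-unit RNN. The weight values and input sequence are arbitrary examples, and `tanh` is one common choice of activation.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: the new hidden state mixes the current input
    with the hidden state remembered from the previous time-step."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

# Run the toy one-unit RNN over a short sequence
sequence = [1.0, 0.0, -1.0]
h = 0.0                       # initial memory is empty
for x_t in sequence:
    h = rnn_step(x_t, h, w_x=0.5, w_h=0.8, b=0.0)
print(round(h, 4))
```

The final hidden state depends on the whole sequence, not just the last input, which is what lets an RNN carry information forward through time.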