The history of artificial neural networks goes back to the early days of computing. In 1943, Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician, designed a system of circuits that ran simple algorithms, with the intention of approximating the functioning of the human brain. The artificial neural network (ANN), or just neural network, is thus a machine learning method that evolved from the idea of simulating the human brain. ANNs are employed to model complex patterns in datasets using multiple hidden layers and non-linear activation functions.
It wasn't until 2010 that research picked up again. The big data trend and parallel computing gave data scientists the training data and computing resources needed to run complex ANNs. The present excitement about ANNs can be traced back to 2012 and an online contest called the ImageNet Challenge, in which a deep neural network beat competing approaches by a wide margin at an image recognition task. Since then, interest in ANNs has soared and the technology keeps improving.
Neural Network Architectures
The ANN architecture consists of three types of layers of nodes (individual units), namely the input, hidden, and output layers. Nodes in the input layer are connected to the nodes in the hidden layer, and the latter are connected to those of the output layer. Each connection is assigned a weight. The data taken into the network by the input layer are passed to the hidden layer to be processed. The resulting values are then sent to the output layer, which processes them in turn to compute the final output.
Figure 1 - Architecture of an artificial neuron and a multilayered neural network (source)
In the above figure, for one single observation, x1, x2, x3, ..., xn represent the various inputs (independent variables) to the network. Each of those inputs is multiplied by a connection weight. The weights are represented as w1, w2, w3, ..., wn. A weight shows the strength of a specific connection. In the classic case, these products are summed and fed to an activation function (also known as a transfer function), and the result is sent on as the output.
Mathematically, the weighted sum is x1·w1 + x2·w2 + x3·w3 + ... + xn·wn = ∑ xi·wi. A bias b is added to this sum and the activation function 𝜙 is applied, giving the output 𝜙(∑ xi·wi + b). This function decides whether a node should be activated or not, based on the weighted sum plus the bias. Its purpose is to introduce non-linearity into the output of a node.
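The computation of a single node can be sketched in a few lines of Python. This is only an illustration: the sigmoid is assumed as the activation function (the text does not name one), and the names `sigmoid`, `neuron_output`, and the sample values are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation (an assumed choice): maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum + bias)


# Illustrative values only: 1.0*0.5 + 2.0*(-0.25) + 3.0*0.0 = 0.0
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.0]
print(neuron_output(x, w, bias=0.0))  # sigmoid(0.0) = 0.5
```

Swapping `activation` for another non-linear function changes the shape of the output but not the structure of the calculation.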
The interconnection of the nodes between the layers can be divided into two classes, namely the feedforward neural network and the recurrent neural network.
In a Feedforward Neural Network (FNN), data are processed in one direction only. Data travel from the input nodes to the output nodes, passing through the hidden nodes (if any exist). FNNs are considered the basic type of neural network. They fall into two categories depending on the number of layers, either "single layer" or "multi-layer". A multi-layer FNN, also called a Multilayer Perceptron (MLP), can use linear or non-linear activation functions (Goodfellow et al., 2016). More importantly, there are no cycles or feedback loops in the network that would allow direct feedback (Front. Artif. Intell., 28 February 2020, link).
Figure 2 - Two examples for Feedforward Neural Networks. (A) A shallow FNN. (B) A Deep Feedforward Neural Network (D-FNN) with 3 hidden layers. (source)
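As a rough sketch of the one-directional flow in an FNN, the following Python code passes an input through a list of fully connected layers in order. The layer sizes, the weights, the sigmoid activation, and the function names `layer_forward` and `feedforward` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here for every layer."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the incoming weights of node j."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, node_weights)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def feedforward(inputs, layers):
    """Pass the data through each (weights, biases) pair in order, never backwards."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs, 3 hidden nodes, 1 output node
hidden = ([[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [0.0, 0.1, -0.1])
output = ([[0.6, -0.3, 0.9]], [0.05])
print(feedforward([1.0, 2.0], [hidden, output]))
```

Adding more `(weights, biases)` pairs to the list gives a deeper network without changing `feedforward` itself.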
In the Recurrent Neural Network (RNN) architecture, each output unit is connected to itself and is also fully connected to the other output units and all hidden units. The feedback loops allow RNNs to model the effects of the earlier parts of a sequence on its later parts, which is an important feature when it comes to modelling sequences. These looped networks are termed recurrent because they perform the same operations and computations for every element in a sequence of input data.
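The idea of reusing the same computation at every step, while carrying information forward, can be illustrated with a deliberately simplified recurrence. The sketch below assumes a single scalar state updated as h_t = tanh(w_x·x_t + w_h·h_(t-1) + b), which is a common textbook form and not the exact connectivity described above. The names `rnn_step` and `run_rnn` and the weight values are hypothetical.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: the new state depends on the current input and the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Apply the same step, with the same weights, to every element of the sequence."""
    h = h0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Only the first input is non-zero, yet it still influences the later states
# through the feedback weight w_h.
print(run_rnn([1.0, 0.0, 0.0]))
```

Setting `w_h` to zero removes the feedback loop, and each state then depends on the current input alone.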
Neural Network Learning
An ANN learns from a dataset by means of a training algorithm that adjusts the connection weights depending on the error between the target and the actual output. Generally, ANNs use the back-propagation algorithm for this purpose. In essence, the algorithm’s backward phase calculates how much each weight contributes to the error and then updates those weights to improve the network’s performance. This calculation proceeds sequentially in the reverse direction, from the output layer to the input layer, hence the name back-propagation. Repeat this over and over for sets of inputs and desired outputs, and you’ll eventually reach an acceptable set of weights for the entire neural network.
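A minimal sketch of this training loop, for a network with one hidden layer, might look as follows. Several things are assumed here that the text does not specify: sigmoid activations, a squared-error measure, per-example weight updates with a fixed learning rate `lr`, and the logical OR function as toy training data. The function name `train` and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, n_hidden=2, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network; returns the total error after each epoch."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # w_h[j][i]: weight from input i to hidden node j; w_o[j]: hidden node j to output
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_o = 0.0
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for x, target in data:
            # Forward phase: input -> hidden -> output
            h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
                 for j in range(n_hidden)]
            y = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            total_error += 0.5 * (y - target) ** 2
            # Backward phase: error signal at the output layer first ...
            delta_o = (y - target) * y * (1.0 - y)
            # ... then propagated back to each hidden node
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Update every weight against its contribution to the error
            for j in range(n_hidden):
                w_o[j] -= lr * delta_o * h[j]
                for i in range(n_in):
                    w_h[j][i] -= lr * delta_h[j] * x[i]
                b_h[j] -= lr * delta_h[j]
            b_o -= lr * delta_o
        history.append(total_error)
    return history


# Toy dataset (assumed for illustration): the logical OR function
or_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
errors = train(or_data)
print("error after first epoch:", errors[0])
print("error after last epoch: ", errors[-1])
```

The hidden-layer error signals are computed before the output weights are changed, so each update uses the weights that actually produced the error.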
FNNs and RNNs are the most commonly encountered types of artificial neural networks and are applied in many diverse fields. In this article, an overview of artificial neural networks has been given. A description of the back-propagation algorithm, the most widely used learning/training algorithm, has been presented as well. The interested reader is referred to the literature below for a thorough discussion of artificial neural networks and the back-propagation algorithm.
Training and Testing Neural Networks - link3
Designing Artificial Neural Networks - link4
Deep learning: investigating deep neural networks hyper-parameters - link5