Welcome, traveler! In the growing world of artificial intelligence, one term always stands out as incredibly influential: “neural networks.” These computerized structures, which draw inspiration from the mechanisms of the brain, serve as the foundation for state-of-the-art technologies that drive applications like autonomous vehicles and voice recognition systems.
After a short break, I'm back with another interesting article, where we take a deep look into how neural networks work. I will try to explain it in a simple way, and hopefully you will find that, in the end, it's not that hard.

I'm not a professional, so I could be wrong. If you spot any mistakes, head to the comments 🙂.
With that said, let's jump into the darkness…
You will learn:
Fundamentals of Neural Networks
- What Are Neural Networks?
- Neurons
Neural Network Layers
- Input Layer
- Hidden Layers
- Output Layer
Common Implementations
- Feedforward Neural Networks (FNN)
- Convolutional Neural Networks (CNN)
- Recurrent Neural Networks (RNN)
- Long Short-Term Memory Networks (LSTM)
What Are Neural Networks?
You could think of them as the equivalent of the human brain's information-processing system. They are designed to mimic the brain's ability to learn, adapt, and process data. These networks consist of artificial neurons, or nodes, interconnected by weighted connections; the whole structure is called an Artificial Neural Network (ANN).
Neural networks are inspired by the workings of the brain. In our brains, neurons communicate by transmitting signals through synapses. Artificial neural networks replicate this process using mathematical models.
These networks are exceptional at recognizing patterns, making predictions, and handling large volumes of data, which is crucial for tasks such as image recognition, language comprehension, and recommendation systems.
Neurons
Building Blocks of Neural Networks
In a neural network, each neuron acts as a processing unit.
Artificial neurons take input data, perform calculations on it, and produce an output.
These outputs are then used as inputs for subsequent layers or are the final predictions of the network.
Weights and Biases
Two essential components of every neuron are weights and biases. These elements determine how strongly an input signal influences the neuron's output.
- Weights💪: Weights represent the strength of the connections between neurons. Each connection between neurons has an associated weight. Larger weights amplify the input's impact on the neuron's output, while smaller weights diminish it.
- Biases👤: Biases act like an offset or threshold for the neuron, shifting the activation function left or right.
Weights and biases enable neurons to learn and adjust their behavior during training. The process of determining the appropriate weights and biases is known as model training or optimization.
This lets networks learn complicated patterns and correlations in data, making them useful for tasks such as image recognition, natural language interpretation, and many others.
Example of a neural prediction
Imagine you are training a neural network to predict whether a student will pass a test based on the number of hours they study. Your input data consists of the number of hours a student studied, and the output represents whether they passed (1) or failed (0).
To make this prediction, the neuron performs the following operation:
Output = Activation Function(Weight × Number of Hours Studied + Bias)
- Activation function⚡️: It determines whether a neuron should “fire” (output a signal) or not, based on the input it receives. The most common functions are Sigmoid, ReLU (Rectified Linear Unit), and Tanh (hyperbolic tangent).
Here the bias can represent how hard the test is. Let's say the weight is 0.1, the bias is -5, and you studied for 40 hours.
Output = Activation Function(0.1 × 40 + (-5)) = Activation Function(-1)
Since the value inside the activation is -1, a sigmoid would output roughly 0.27, below the 0.5 threshold, meaning you wouldn't pass. Of course, predicting a test like this is nonsense, but you get the point.
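To make this concrete, here is a minimal sketch of that single neuron in Python, assuming a sigmoid activation and the illustrative weight and bias from above:

```python
import math

def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1 / (1 + math.exp(-x))

def predict_pass(hours_studied, weight=0.1, bias=-5):
    # Weighted sum plus bias, then the activation function
    pre_activation = weight * hours_studied + bias
    probability = sigmoid(pre_activation)
    return 1 if probability >= 0.5 else 0  # 1 = pass, 0 = fail

print(predict_pass(40))  # sigmoid(-1) ≈ 0.27 -> 0 (fail)
print(predict_pass(60))  # sigmoid(1)  ≈ 0.73 -> 1 (pass)
```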
Neural Network Layers
Neural networks are made up of layers, each having its own function and characteristics.
Understanding these layers is critical for understanding how information moves through a network and how decisions are made.
We’ll look at the three main types of layers in a neural network architecture: the Input Layer, Hidden Layers, and Output Layer.

Input Layer
The Entry Point of Data
The first layer of a neural network is the Input Layer. Its major duty is to receive data and forward it to the following layers for processing.
Each neuron in the input layer corresponds to a feature or input variable, so this layer serves as the link between the external data and the neural network.
- No Computation🚧: Neurons in the input layer don’t perform any computation. They simply transmit the input data to the hidden layers.
- Size Determination🔍: The number of neurons in the input layer is determined by the dimensionality of the input data. For instance, in an image classification task, each pixel might correspond to a neuron in the input layer.
You may encounter situations where input values are rescaled, for example constrained to the range [0, 1] (normalization) or shifted to zero mean and unit variance (standardization). This can help the neural network converge faster during training and make it more robust to different input scales. However, these techniques are not always necessary and depend on the type of data.
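As a quick sketch of what that means in practice, here is how you might normalize or standardize the study-hours input with NumPy (the values are made up for illustration):

```python
import numpy as np

hours = np.array([2.0, 10.0, 25.0, 40.0, 55.0])

# Min-max normalization: rescale values into the range [0, 1]
normalized = (hours - hours.min()) / (hours.max() - hours.min())

# Standardization: rescale to zero mean and unit variance
standardized = (hours - hours.mean()) / hours.std()

print(normalized)    # values between 0 and 1
print(standardized)  # values centered around 0
```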
Hidden Layers
The Workhorses of the Network
In a neural network, the true computation takes place in the Hidden Layers. These layers are in charge of extracting features from the input data and learning complex patterns and representations.
While a network can have numerous hidden layers, their number and size depend on the network's architecture and the complexity of the task.
- Weighted Sum➕: The incoming values are multiplied by their weights. These weighted sums are computed by taking the dot product between the input values and the corresponding neuron weights, producing one sum per neuron.
- Bias Addition➕: After calculating the weighted sums, each neuron adds a bias term.
- Activation Function⚡️: The result of the weighted sum plus the bias for each neuron is then passed through an activation function.
- Hierarchical Learning🧠: Hidden layers progressively learn more abstract and high-level features as information flows through the network’s layers.
The hidden layers of a neural network act as feature extractors.
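Putting those steps together, a single hidden layer's forward pass might look like this in NumPy (the layer sizes and random initialization are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(3)       # 3 input values coming from the input layer
W = rng.random((4, 3))  # 4 hidden neurons, each with 3 weights
b = rng.random(4)       # one bias per hidden neuron

# Weighted sum (dot product) plus bias, then the ReLU activation
z = W @ x + b
hidden_output = np.maximum(0, z)  # ReLU: keep positives, zero out negatives

print(hidden_output)  # 4 values, one per hidden neuron
```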
Output Layer
Making Predictions
The output layer is the final layer of the neural network. It produces the network's predictions or outputs based on the processed information from the hidden layers. The structure and number of neurons in the output layer depend on the nature of the task.
- Loss Calculation📉: Once the network produces its output, the loss or error between the predicted values and the actual target values (ground truth) is calculated. This loss is a measure of how well the network is performing.
- Backpropagation and Training🔄: The computed loss is used to update the network’s weights and biases through a process called backpropagation and optimization algorithms like gradient descent. The network learns to adjust its parameters to minimize the loss, making its predictions more accurate over time.
- Interpretation📖: The values produced by the neurons in the output layer are often interpreted as probabilities or class scores, depending on the problem. The highest-scoring class or classes are the network’s final predictions.
The final predictions or values are produced in the output layer based on the network's learned weights and biases. The network's performance is enhanced by training, in which it learns to make better predictions by adjusting its parameters.
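For intuition, here is a tiny sketch of that loss-then-update cycle using our study-hours neuron, squared error as the loss, and a single gradient-descent step (the learning rate and data are made up):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

weight, bias = 0.1, -5.0
hours, target = 40, 1  # ground truth: this student actually passed

# Forward pass and loss (squared error)
pred = sigmoid(weight * hours + bias)
print((pred - target) ** 2)  # ≈ 0.53: the network is quite wrong

# One gradient-descent step: backpropagate the error through the sigmoid
grad = 2 * (pred - target) * pred * (1 - pred)
weight -= 0.01 * grad * hours
bias -= 0.01 * grad

# The same input now produces a smaller loss
pred = sigmoid(weight * hours + bias)
print((pred - target) ** 2)  # noticeably smaller than before
```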
Common Implementations
Neural networks come in a variety of shapes and sizes, each one customized to specific types of data and tasks. I'll introduce you to four common neural network implementations and share insights on where they thrive.

Feedforward Neural Networks (FNN)
The Foundation of Neural Networks
Feedforward Neural Networks, often referred to as multilayer perceptrons (MLPs), serve as the foundational architecture for neural networks. They are versatile and can be used for a wide range of tasks, including classification and regression.
- Architecture🏰: FNNs consist of an input layer, one or more hidden layers, and an output layer. Information flows in one direction, from input to output, without loops or cycles.
- Applications📋: FNNs find applications in image classification, sentiment analysis, and other structured data tasks. Their simplicity and effectiveness make them a valuable choice for many problems.
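As a rough sketch, a small FNN could be defined with Keras like this (the layer sizes are arbitrary, and this assumes TensorFlow is installed):

```python
import tensorflow as tf

# A small feedforward network: input -> two hidden layers -> output
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),               # 4 input features
    tf.keras.layers.Dense(16, activation="relu"),    # hidden layer 1
    tf.keras.layers.Dense(8, activation="relu"),     # hidden layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```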
Convolutional Neural Networks (CNN)
Unraveling Image and Spatial Data
Convolutional Neural Networks are specifically designed for tasks involving grid-like data, such as images and spatial data. They excel at capturing spatial hierarchies and patterns.
- Architecture🏰: CNNs incorporate convolutional layers for feature extraction and pooling layers for dimensionality reduction. These layers allow the network to understand spatial relationships in the data.
- Applications📋: CNNs are the go-to choice for image classification, object detection, facial recognition, and tasks involving grid-like data.
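In the same hedged spirit, a minimal CNN for small grayscale images might look like this (the shapes are illustrative):

```python
import tensorflow as tf

# A minimal CNN for 28x28 grayscale images, e.g. digit classification
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),  # feature extraction
    tf.keras.layers.MaxPooling2D((2, 2)),                   # dimensionality reduction
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),        # 10 class scores
])
```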
Recurrent Neural Networks (RNN)
Mastering Sequential Data
Recurrent Neural Networks are designed for tasks that involve sequential data, where the order of elements matters. They have memory cells that enable them to retain information from previous time steps.
- Architecture🏰: RNNs have recurrent connections that loop back on themselves, allowing them to maintain a form of memory. This makes them suitable for tasks like natural language processing, time series prediction, and speech recognition.
- Applications📋: RNNs are used in machine translation, text generation, speech synthesis, and any task involving sequences.
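A minimal sketch of an RNN for sequences, again with illustrative shapes (10 time steps, 8 features per step):

```python
import tensorflow as tf

# A minimal RNN: each input is a sequence of 10 steps with 8 features each
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8)),
    tf.keras.layers.SimpleRNN(32),  # keeps a hidden state across time steps
    tf.keras.layers.Dense(1),       # e.g. predict the next value in a series
])
```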
Long Short-Term Memory Networks (LSTM)
Overcoming the Shortcomings of RNNs
While RNNs are powerful for sequential data, they suffer from the vanishing gradient problem. Long Short-Term Memory Networks (LSTMs) were developed to address this issue by introducing specialized memory cells.
- Architecture🏰: LSTMs include memory cells with gating mechanisms that control the flow of information, allowing them to capture long-range dependencies in sequential data.
- Applications📋: LSTMs are widely used in tasks requiring long-term dependencies, such as speech recognition, language modeling, and sentiment analysis.
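Structurally, swapping the plain recurrent layer for an LSTM is a one-line change in the RNN sketch above:

```python
import tensorflow as tf

# Same sequence problem as the RNN sketch, but with LSTM memory cells
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8)),
    tf.keras.layers.LSTM(32),  # gates control what to remember and forget
    tf.keras.layers.Dense(1),
])
```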
These are just a few of the common neural network architectures, each tailored to different data types and tasks. As we explore these implementations further, you’ll gain a deeper understanding of how to choose the right architecture for your specific AI and machine learning projects.
Conclusion
Neural networks can be used almost everywhere, and they are groundbreaking because they can take repetitive tasks out of daily life, but at their core they cannot replace humans. So, no worries 😊.
Anyway, if you enjoyed this article, clap👋 and follow me📑! Thanks for reading! Looking forward to seeing you in the future.