
Backpropagation Algorithm in Neural Network

In an artificial neural network, the values of weights and biases are randomly initialized. Because of this random initialization, the neural network will most likely make errors in producing the correct output, and we need to reduce these error values as much as possible. So we need a mechanism that compares the desired output of the neural network with the network's actual output and adjusts the weights and biases so that the output gets closer to the desired output after each iteration. For this, we train the network so that the error is propagated backward and the weights and biases are updated. This is the concept behind the backpropagation algorithm.


The actions an artificial neural network takes to achieve maximum accuracy and reduce error values are listed below:


  1. Parameter Initialization
  2. Feedforward Propagation
  3. Backpropagation


We will look into all these steps, but mainly we will focus on the back propagation algorithm.

Parameter Initialization

Here, the parameters, i.e., the weights and biases associated with an artificial neuron, are randomly initialized. After receiving the input, the network feeds it forward, combining it with the weights and biases to produce an output. The output associated with those random values is most likely not correct. So, next, we will look at feedforward propagation.

Feedforward Propagation

After initialization, when the input is given to the input layer, it propagates the input into hidden units at each layer. The nodes here do their job without being aware of whether the results produced are accurate or not (i.e., they don’t re-adjust according to the results produced). Then, finally, the output is produced at the output layer. This is called feedforward propagation.

Back propagation in Neural Networks

The principle behind the back propagation algorithm is to reduce the error values in randomly allocated weights and biases such that it produces the correct output. The system is trained in the supervised learning method, where the error between the system’s output and a known expected output is presented to the system and used to modify its internal state. We need to update the weights so that we get the global loss minimum. This is how back propagation in neural networks works.

When the gradient is negative, an increase in weight decreases the error.

When the gradient is positive, a decrease in weight decreases the error.


Working of Back Propagation Algorithm

How does the back propagation algorithm work?

The goal of the back propagation algorithm is to optimize the weights so that the neural network can learn how to correctly map arbitrary inputs to outputs. Here, we will understand the complete scenario of back propagation in neural networks with the help of a single training set.

In order to have some numbers to work with, here are the training inputs and the corresponding target outputs (the initial weights and biases start out as small random values):

Input i1 = 0.05                          Target output o1 = 0.01

Input i2 = 0.10                          Target output o2 = 0.99

Step 1: The Forward Pass:

[Figure: a 2-2-2 network with inputs i1 and i2, hidden neurons h1 and h2, output neurons o1 and o2, connected by weights w1 to w8 plus a bias at each layer]

The total net input for h1: The net input for h1 (a neuron in the hidden layer) is calculated as the sum of the products of each weight and its corresponding input, plus a bias value.

net h1 = w1 × i1 + w2 × i2 + b1

The output for h1: The output of h1 is calculated by applying the sigmoid function to the net input of h1.

The sigmoid function squashes the values passed to it into the range of 0 to 1.

It is used for models where we have to predict the probability. Since the probability of any event lies between 0 and 1, the sigmoid function is the right choice.
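As a quick illustration, here is a minimal Python sketch of a sigmoid function (the function name and the sample values are only for this example):

import math

def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(-5), sigmoid(0), sigmoid(5))   # roughly 0.0067, 0.5, 0.9933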

out h1 = 1 / (1 + e^(−net h1))

Carrying out the same process for h2:

out h2 = 0.596884378

The output for o1 is:

net o1 = w5 × out h1 + w6 × out h2 + b2

out o1 = 1 / (1 + e^(−net o1)) = 0.75136507

Carrying out the same process for o2:

out o2 = 0.772928465

Calculating the Total Error:

We can now calculate the error for each output neuron using the squared error function and sum them up to get the total error: E total = Σ 1/2 (target − output)^2

The target output for o1 is 0.01, but the neural network output is 0.75136507; therefore, its error is:

E o1 = 1/2 (target o1 − out o1)^2 = 1/2 (0.01 − 0.75136507)^2 = 0.274811083

By repeating this process for o2 (remembering that the target is 0.99), we get:

E o2 = 0.023560026

Then, the total error for the neural network is the sum of these errors:

E total = E o1 + E o2 = 0.274811083 + 0.023560026 = 0.298371109
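If you want to reproduce these numbers yourself, here is a short Python sketch of the complete forward pass and the total-error calculation. The initial weights and biases below are assumed values (the article's original figure showing them is not reproduced here), but they regenerate every intermediate result quoted above:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Training example: inputs and target outputs from the text
i1, i2 = 0.05, 0.10
target_o1, target_o2 = 0.01, 0.99

# Initial weights and biases (assumed values)
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output
b1, b2 = 0.35, 0.60

# Forward pass through the hidden layer
net_h1 = w1 * i1 + w2 * i2 + b1
net_h2 = w3 * i1 + w4 * i2 + b1
out_h1, out_h2 = sigmoid(net_h1), sigmoid(net_h2)   # 0.593269992, 0.596884378

# Forward pass through the output layer
net_o1 = w5 * out_h1 + w6 * out_h2 + b2
net_o2 = w7 * out_h1 + w8 * out_h2 + b2
out_o1, out_o2 = sigmoid(net_o1), sigmoid(net_o2)   # 0.75136507, 0.772928465

# Squared error for each output neuron, then the total error
E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # 0.274811083
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # 0.023560026
E_total = E_o1 + E_o2                    # 0.298371109
print(out_o1, out_o2, E_total)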

Step 2: Backward Propagation:

Our goal with the backward propagation algorithm is to update each weight in the network so that the actual output is closer to the target output, thereby minimizing the error for each neuron and the network as a whole.


Consider w5; we will calculate the rate of change of the total error w.r.t. the change in the weight w5. By the chain rule:

∂E total/∂w5 = ∂E total/∂out o1 × ∂out o1/∂net o1 × ∂net o1/∂w5

Since we are propagating backward, the first thing we need to calculate is the change in the total error w.r.t. the outputs o1 and o2:

∂E total/∂out o1 = −(target o1 − out o1) = −(0.01 − 0.75136507) = 0.74136507

Next, we propagate further backward and calculate the change in the output o1 w.r.t. its total net input:

∂out o1/∂net o1 = out o1 × (1 − out o1) = 0.75136507 × (1 − 0.75136507) = 0.186815602

How much does the total net input of o1 change w.r.t. w5? Since net o1 = w5 × out h1 + w6 × out h2 + b2, the only term that depends on w5 is w5 × out h1, so:

∂net o1/∂w5 = out h1


Putting all these values together gives the gradient of the total error w.r.t. w5:

∂E total/∂w5 = ∂E total/∂out o1 × ∂out o1/∂net o1 × ∂net o1/∂w5

To get the updated value of w5, we subtract this gradient, scaled by the learning rate η, from the current weight:

w5(new) = w5 − η × ∂E total/∂w5
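Continuing in Python, the same chain rule can be evaluated numerically. The initial weight w5 = 0.40 and the learning rate of 0.5 are assumptions (they are not stated in the text above); the other values are the forward-pass results already quoted:

# Values from the forward pass above; w5 = 0.40 and eta = 0.5 are assumed
target_o1 = 0.01
out_o1 = 0.75136507
out_h1 = 0.593269992
w5 = 0.40
eta = 0.5   # learning rate

d_E_d_out_o1 = -(target_o1 - out_o1)        # 0.74136507
d_out_o1_d_net_o1 = out_o1 * (1 - out_o1)   # 0.186815602
d_net_o1_d_w5 = out_h1                      # 0.593269992

grad_w5 = d_E_d_out_o1 * d_out_o1_d_net_o1 * d_net_o1_d_w5   # about 0.0822
w5_new = w5 - eta * grad_w5                                   # about 0.3589
print(w5_new)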

We can repeat this process to get the new weights w6, w7, and w8.


Note that we perform the actual updates in the neural network only after we have the new weights leading into the hidden-layer neurons as well; while continuing the backward pass below, we keep using the original weights.

We’ll continue the backward pass by calculating new values for w1, w2, w3, and w4:

Starting with w1:

∂E total/∂w1 = ∂E total/∂out h1 × ∂out h1/∂net h1 × ∂net h1/∂w1

We are going to use a process similar to the one we used for the output layer, but slightly different, to account for the fact that the output of each hidden-layer neuron contributes to the output (and therefore to the error) of multiple output neurons. Thus, we need to take both E o1 and E o2 into consideration.

We can visualize it as follows: out h1 feeds into both o1 and o2, so the error propagated back to h1 comes from both output neurons.

Starting with h1:

∂E total/∂out h1 = ∂E o1/∂out h1 + ∂E o2/∂out h1

We can calculate ∂E o1/∂out h1 as ∂E o1/∂net o1 × ∂net o1/∂out h1 (and ∂E o2/∂out h1 in the same way). The derivative of the hidden output w.r.t. its net input is again the sigmoid derivative:

∂out h1/∂net h1 = out h1 × (1 − out h1)
We calculate the partial derivative of the total net input of h1 w.r.t. w1 the same way as we did for the output neuron:

∂net h1/∂w1 = i1

Let’s put it all together.

∂E total/∂w1 = ∂E total/∂out h1 × ∂out h1/∂net h1 × ∂net h1/∂w1, and the weight is updated as w1(new) = w1 − η × ∂E total/∂w1. The same steps give the new values of w2, w3, and w4.
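The hidden-layer chain rule can be sketched the same way in Python. As before, the initial weights (w1 = 0.15, w5 = 0.40, w7 = 0.50) and the learning rate of 0.5 are assumed values, while the outputs are the forward-pass numbers quoted earlier:

# Forward-pass values quoted earlier; w1, w5, w7 and eta are assumed
i1 = 0.05
out_h1 = 0.593269992
out_o1, out_o2 = 0.75136507, 0.772928465
target_o1, target_o2 = 0.01, 0.99
w1, w5, w7 = 0.15, 0.40, 0.50
eta = 0.5

# dE total/dout h1 is the sum of the contributions from both output neurons
d_Eo1_d_outh1 = (out_o1 - target_o1) * out_o1 * (1 - out_o1) * w5   # about  0.0554
d_Eo2_d_outh1 = (out_o2 - target_o2) * out_o2 * (1 - out_o2) * w7   # about -0.0190
d_E_d_outh1 = d_Eo1_d_outh1 + d_Eo2_d_outh1                         # about  0.0364

# Sigmoid derivative for h1, and the dependence of net h1 on w1
d_outh1_d_neth1 = out_h1 * (1 - out_h1)   # about 0.2413
d_neth1_d_w1 = i1                         # 0.05

grad_w1 = d_E_d_outh1 * d_outh1_d_neth1 * d_neth1_d_w1   # about 0.000439
w1_new = w1 - eta * grad_w1                              # about 0.14978
print(w1_new)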

  • When we originally fed forward 0.05 and 0.1 inputs, the error on the network was 0.298371109.
  • After the first round of backpropagation, the total error is now down to 0.291027924.


It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to 0.0000351085. At this point, when we feed forward 0.05 and 0.1, the two output neurons generate 0.015912196 (vs. the 0.01 target) and 0.984065734 (vs. the 0.99 target).
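To see the error plummet in code, the sketch below simply repeats the forward and backward pass 10,000 times on this single training example, updating all eight weights every round. The initial weights, the fixed biases, and the learning rate of 0.5 are the same assumptions as in the earlier sketches:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

i1, i2 = 0.05, 0.10          # inputs
t1, t2 = 0.01, 0.99          # target outputs
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60          # biases (held fixed here)
eta = 0.5                    # learning rate (assumed)

for _ in range(10000):
    # Forward pass
    out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
    out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
    out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
    out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)

    # Output-layer deltas: dE/dnet = (out - target) * out * (1 - out)
    d_o1 = (out_o1 - t1) * out_o1 * (1 - out_o1)
    d_o2 = (out_o2 - t2) * out_o2 * (1 - out_o2)

    # Hidden-layer deltas, computed with the ORIGINAL weights w5..w8
    d_h1 = (d_o1 * w5 + d_o2 * w7) * out_h1 * (1 - out_h1)
    d_h2 = (d_o1 * w6 + d_o2 * w8) * out_h2 * (1 - out_h2)

    # Update all eight weights at once
    w5, w6 = w5 - eta * d_o1 * out_h1, w6 - eta * d_o1 * out_h2
    w7, w8 = w7 - eta * d_o2 * out_h1, w8 - eta * d_o2 * out_h2
    w1, w2 = w1 - eta * d_h1 * i1, w2 - eta * d_h1 * i2
    w3, w4 = w3 - eta * d_h2 * i1, w4 - eta * d_h2 * i2

# Final outputs and error after 10,000 rounds: outputs near 0.016 and 0.984,
# total error around 3.5e-5, in line with the figures quoted above
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)
print(out_o1, out_o2, 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2)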

Now, in this back propagation algorithm blog, let’s go ahead and comprehensively understand “Gradient Descent” optimization.

Understanding Gradient Descent

  • Gradient descent is by far the most popular optimization strategy used in Machine Learning and Deep Learning at the moment. It is used while training our model, can be combined with every algorithm, and is easy to understand and implement.
  • Gradient measures how much the output of a function changes if we change the inputs a little.
  • We can also think of a gradient as the slope of a function. The higher the gradient, the steeper the slope, and the faster the model learns.
  • The gradient descent update can be written as: b = a − γ × ∇f(a)

where,

b = next value

a = current value

‘−’ refers to the minimization part of the gradient descent.

γ = the learning rate (how big a step we take)

∇f(a) = the gradient of the function at the current value
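As a tiny, self-contained illustration of this update rule, repeatedly applying b = a − γ × f′(a) to a simple function walks the current value down to the minimum. The function f(x) = x², the starting point, and the step size below are arbitrary choices for this sketch:

def f_prime(x):
    return 2 * x                     # gradient of f(x) = x**2

a = 4.0                              # current value (an arbitrary starting point)
gamma = 0.1                          # learning rate / step size
for _ in range(50):
    b = a - gamma * f_prime(a)       # next value: step against the gradient
    a = b
print(a)                             # very close to 0, the minimum of f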

  • This formula basically tells us the next position where we need to go, which is the direction of the steepest descent.
  • Gradient descent can be thought of as climbing down to the bottom of a valley instead of up a hill. This is because it is a minimization algorithm that minimizes a given function.
  • Let’s consider the graph below, where we need to find the values of w and b that correspond to the minimum cost function (marked with a red arrow).

[Graph: the cost function plotted against w and b, with the minimum point marked by a red arrow]

  • To start with finding the right values, we initialize the values of w and b with some random numbers, and gradient descent starts at that point (somewhere around the top).

Batches

  • The total number of training examples present in a single batch is referred to as the batch size.
  • Since we can’t pass the entire dataset into the neural net at once, we divide the dataset into several batches, sets, or parts.

Moving ahead in this blog on “Back Propagation Algorithm”, we will look at the types of gradient descent.

Types of Gradient Descent


Batch Gradient Descent

In batch gradient descent, we use the complete dataset to compute the gradient of the cost function. Batch gradient descent is very slow because we need to calculate the gradient over the complete dataset to perform just one update, which becomes very expensive when the dataset is large.

  1. The cost function is calculated after the parameters are initialized.
  2. All the records are read into memory from the disk.
  3. After calculating the gradient over the full dataset for one iteration, we take one step and repeat the process.

Mini-batch Gradient Descent

It is a widely used algorithm that produces faster and more accurate results. Here, the dataset is split into small groups of ‘n’ training examples. It is faster because it does not use the complete dataset in every iteration; instead, each iteration uses a batch of ‘n’ training examples to compute the gradient of the cost function. It reduces the variance of the parameter updates, which can lead to more stable convergence, and it can take advantage of highly optimized matrix operations that make computing the gradient very efficient.

Stochastic Gradient Descent

We use stochastic gradient descent for faster computation. The first step is to shuffle the complete dataset. Then, we use only one training example in every iteration to calculate the gradient of the cost function and update every parameter. It is faster even for large datasets because only one training example is processed in each iteration.
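The three variants differ only in how many training examples are used for each gradient computation. The toy sketch below fits a single weight to data generated from y = 3x; setting batch_size to the full dataset gives batch gradient descent, a small value such as 4 gives mini-batch gradient descent, and 1 gives stochastic gradient descent. All names and values here are illustrative:

import random

def grad(w, batch):
    # Gradient of the mean squared error 0.5 * (w*x - y)**2 over the batch
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def run_epoch(data, w, batch_size, lr=0.1):
    # One pass over the data, with one weight update per batch
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        w -= lr * grad(w, batch)
    return w

# Toy dataset generated from y = 3x
data = [(0.1 * k, 0.3 * k) for k in range(1, 21)]

w = 0.0
for _ in range(100):
    w = run_epoch(data, w, batch_size=len(data))   # batch gradient descent
print(round(w, 3))                                 # approaches 3.0
# batch_size=4 gives mini-batch gradient descent; batch_size=1 gives SGD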

Through this blog, we have covered the basic concepts and the working of the backpropagation algorithm. We now know that the backpropagation algorithm is the heart of a neural network's training process.


We hope this tutorial helps you gain knowledge of AI. If you are looking to learn Artificial Intelligence in a systematic manner with expert guidance, support, and placement assistance, you can enroll in our Artificial Intelligence Online Course.

About the Author

Principal Data Scientist

Meet Akash, a Principal Data Scientist with expertise in advanced analytics, machine learning, and AI-driven solutions. With a master’s degree from IIT Kanpur, Akash combines technical knowledge with industry insights to deliver impactful, scalable models for complex business challenges.
