Backpropagation


This content is licensed under the CC 4.0 BY-SA license.


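Chain rule applied to a single neuron:
Green arrows (used to calculate z): forward pass
Red arrows (used to calculate the gradients of the weight matrices): backward pass
[Figure: a single neuron, with green arrows marking the forward pass and red arrows marking the backward pass]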
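As a worked illustration of the chain rule for one neuron, assume a scalar neuron with input $x$, weight $w$, pre-activation $z = wx$, activation $a = \sigma(z)$, and some loss $L(a)$. This setup and its symbols are assumptions chosen for illustration, not taken from the figure. The forward pass computes $z$ and then $a$; the backward pass multiplies the local derivatives from the loss back to the weight:

$$\frac{\partial L}{\partial w} = \frac{\partial L}{\partial a}\cdot\frac{\partial a}{\partial z}\cdot\frac{\partial z}{\partial w} = \frac{\partial L}{\partial a}\,\sigma'(z)\,x$$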
Network architecture:

$z = W_1 x$
$h = \mathrm{sigmoid}(z)$
$\hat{y} = W_2 h$
$E = \frac{1}{2}\lVert \hat{y} - y \rVert^2$ (the loss)
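The forward pass defined by these four equations can be sketched in NumPy as follows. This is a minimal sketch assuming column-vector inputs; the layer sizes (3 inputs, 4 hidden units, 2 outputs), the random seed, and the function names are illustrative assumptions, not values from the post.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, W2, x, y):
    """Forward pass: returns z, h, y_hat and the scalar loss E."""
    z = W1 @ x                           # hidden-layer pre-activation
    h = sigmoid(z)                       # hidden-layer activation
    y_hat = W2 @ h                       # linear output layer
    E = 0.5 * np.sum((y_hat - y) ** 2)   # E = 1/2 * ||y_hat - y||^2
    return z, h, y_hat, E

# Hypothetical sizes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=(3, 1))
y = rng.normal(size=(2, 1))

z, h, y_hat, E = forward(W1, W2, x, y)
print(z.shape, h.shape, y_hat.shape, E)
```

Keeping `z`, `h`, and `y_hat` around after the forward pass matters because the backward pass reuses them instead of recomputing them.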

Step 1:
Derivative of the loss function with respect to the hidden-to-output weight matrix (the same size as the original weight matrix W2):
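A reconstruction of this derivative from the architecture definitions, treating $x$, $h$, $y$, and $\hat{y}$ as column vectors (the layout convention is an assumption):

$$\frac{\partial E}{\partial W_2} = \frac{\partial E}{\partial \hat{y}}\,\frac{\partial \hat{y}}{\partial W_2} = (\hat{y} - y)\,h^{\top}$$

Because $\hat{y} - y$ has one entry per output unit and $h$ has one entry per hidden unit, this outer product has the same shape as $W_2$.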

Step 2:
Derivatives of the loss with respect to h and z:
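Under the same column-vector assumption, these two derivatives can be reconstructed from $\hat{y} = W_2 h$ and $h = \mathrm{sigmoid}(z)$, where $\odot$ denotes the element-wise product:

$$\frac{\partial E}{\partial h} = W_2^{\top}(\hat{y} - y), \qquad \frac{\partial E}{\partial z} = \frac{\partial E}{\partial h} \odot h \odot (1 - h)$$

The second expression uses the sigmoid derivative, which can be written in terms of its own output as $h \odot (1 - h)$.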

Step 3:
Derivative of the loss function with respect to the input-to-hidden weight matrix (the same size as the original weight matrix W1):
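Reconstructed from $z = W_1 x$ with column vectors assumed, and written out in full so that it depends only on quantities from the forward pass:

$$\frac{\partial E}{\partial W_1} = \frac{\partial E}{\partial z}\,x^{\top} = \Big[\big(W_2^{\top}(\hat{y} - y)\big) \odot h \odot (1 - h)\Big]\,x^{\top}$$

The bracketed factor has one entry per hidden unit and $x$ has one entry per input, so the result has the same shape as $W_1$.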
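Steps 1 to 3 can be checked numerically. The sketch below derives the gradients from the architecture equations and compares them against central finite differences. The layer sizes, the seed, and the helper names `backward` and `numerical_grad` are hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W1, W2, x, y):
    """E = 1/2 * ||W2 sigmoid(W1 x) - y||^2"""
    h = sigmoid(W1 @ x)
    return 0.5 * np.sum((W2 @ h - y) ** 2)

def backward(W1, W2, x, y):
    """Return (dE/dW1, dE/dW2) for column vectors x and y."""
    # Forward pass, caching the intermediate values.
    z = W1 @ x
    h = sigmoid(z)
    y_hat = W2 @ h
    delta_out = y_hat - y        # dE/dy_hat
    dW2 = delta_out @ h.T        # step 1: same shape as W2
    dh = W2.T @ delta_out        # step 2: dE/dh
    dz = dh * h * (1.0 - h)      # step 2: dE/dz, via the sigmoid derivative
    dW1 = dz @ x.T               # step 3: same shape as W1
    return dW1, dW2

def numerical_grad(f, W, eps=1e-6):
    """Central finite differences of the scalar f() with respect to W (perturbed in place)."""
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        f_plus = f()
        W[idx] = old - eps
        f_minus = f()
        W[idx] = old             # restore the original entry
        G[idx] = (f_plus - f_minus) / (2 * eps)
    return G

# Hypothetical sizes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=(3, 1))
y = rng.normal(size=(2, 1))

dW1, dW2 = backward(W1, W2, x, y)
num_dW1 = numerical_grad(lambda: loss(W1, W2, x, y), W1)
num_dW2 = numerical_grad(lambda: loss(W1, W2, x, y), W2)
print("max |dW1 - numeric|:", np.max(np.abs(dW1 - num_dW1)))
print("max |dW2 - numeric|:", np.max(np.abs(dW2 - num_dW2)))
```

If the analytic gradients are right, both printed differences should be tiny relative to the gradient magnitudes. A large gap usually points to a transposed factor or a missing sigmoid-derivative term.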
Properties we use in the derivation:
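The identities below are standard ones that a derivation of this kind typically relies on. They are listed as an assumption about what the derivation uses, under the column-vector convention, with $\sigma$ denoting the sigmoid:

$$\sigma'(z) = \sigma(z)\,\big(1 - \sigma(z)\big)$$

$$\frac{\partial}{\partial \hat{y}}\,\frac{1}{2}\lVert \hat{y} - y \rVert^2 = \hat{y} - y$$

$$u = W v \;\Rightarrow\; \frac{\partial E}{\partial W} = \frac{\partial E}{\partial u}\,v^{\top}, \qquad \frac{\partial E}{\partial v} = W^{\top}\,\frac{\partial E}{\partial u}$$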

References:

[1] https://www.bilibili.com/video/BV1h4411A7v4/?spm_id_from=333.788.videocard.3
[2] http://web.stanford.edu/class/cs224n/readings/cs224n-2019-notes03