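When you study machine learning and neural networks, differentiation is a basic skill. You don't need to master the derivative of every formula; remembering the ones that come up often in machine learning is enough. In fact you don't even need to know how they are derived, since the results alone are enough to start building things. Personally, though, I prefer to work through the derivation. Below are my derivations of the derivatives of two commonly used activation functions, the sigmoid function and the tanh function, shared for anyone who finds them useful.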
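1. The chain rule for differentiating single-variable functions
Before computing the derivative of a more involved composite function, we first need the chain rule for derivatives.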
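- Given the functions:

$$
\begin{aligned}
u &= bc \\
v &= a+u \\
J &= 3v
\end{aligned}
$$

find $\frac{dJ}{du}$.
- This is a very typical example of differentiating a composite function. The derivation is not hard (substitute the definitions and differentiate step by step), so I won't dwell on it here; readers with some math background should be able to work it out easily.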
- Conclusion:

$$
\frac{dJ}{du}=\frac{dJ}{dv}\times\frac{dv}{du}
$$

$$
\frac{dJ}{db}=\frac{dJ}{dv}\times\frac{dv}{du}\times\frac{du}{db}
$$
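The chain-rule conclusion can be checked numerically. The sketch below is only an illustration: the function names and the sample values of `a`, `b`, `c` are my own assumptions, not part of the original example. It multiplies the three factors by hand ($\frac{dJ}{dv}=3$, $\frac{dv}{du}=1$, $\frac{du}{db}=c$) and compares the product with a central finite difference.

```python
def J_of(a, b, c):
    """Evaluate J through the intermediate variables u and v."""
    u = b * c
    v = a + u
    return 3 * v


def chain_rule_dJ_db(a, b, c):
    """dJ/db as the product of the three chain-rule factors."""
    dJ_dv = 3.0   # J = 3v
    dv_du = 1.0   # v = a + u
    du_db = c     # u = b * c, with c held fixed
    return dJ_dv * dv_du * du_db


def numeric_dJ_db(a, b, c, h=1e-6):
    """Central finite-difference estimate of dJ/db."""
    return (J_of(a, b + h, c) - J_of(a, b - h, c)) / (2 * h)


# Hypothetical sample point, chosen only for illustration.
a, b, c = 5.0, 3.0, 2.0
print("chain rule:", chain_rule_dJ_db(a, b, c))
print("numeric:   ", numeric_dJ_db(a, b, c))
```

The two numbers should agree up to floating-point noise, since the chain rule gives $3c$ at any point.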
2. Deriving the derivative of the sigmoid function
- The sigmoid function:

$$
f(x)=\frac{1}{1+e^{-x}}
$$
We first rewrite the function slightly so that it is easier to differentiate:

$$
\begin{aligned}
f(x)&=\frac{1+e^{-x}-e^{-x}}{1+e^{-x}} \\
f(x)&=1-\frac{e^{-x}}{1+e^{-x}} \\
f(x)&=1-\frac{1}{1+e^{x}} \\
f(x)&=1-(1+e^{x})^{-1}
\end{aligned}
$$
Let:

$$
z=1+e^{x}
$$

Then:

$$
f(x)=1-z^{-1}
$$
By the chain rule:

$$
\begin{aligned}
\frac{df}{dx}&=\frac{df}{dz}\cdot\frac{dz}{dx}\\
f'(x)&=(-1)\times(-1)\times z^{-2}\times\frac{dz}{dx}\\
f'(x)&=(1+e^x)^{-2}\times e^x\\
f'(x)&=(1+e^x)^{-1}\times(1+e^x)^{-1}\times e^x\\
f'(x)&=\frac{e^x}{1+e^x}\cdot(1+e^x)^{-1}\\
f'(x)&=\frac{1}{1+e^{-x}}\cdot\frac{1}{1+e^x}\\
f'(x)&=\frac{1}{1+e^{-x}}\cdot\left(\frac{e^x+1-e^x}{1+e^{x}}\right)\\
f'(x)&=\frac{1}{1+e^{-x}}\cdot\left(1-\frac{1}{1+e^{-x}}\right)\\
\because f(x)&=\frac{1}{1+e^{-x}}\\
\therefore f'(x)&=f(x)\cdot(1-f(x))
\end{aligned}
$$
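As a sanity check on the result $f'(x)=f(x)\cdot(1-f(x))$, here is a minimal sketch that compares it with a central finite difference at a few points. The function names, the step size `h`, and the sample points are assumptions made for this illustration.

```python
import math


def sigmoid(x):
    """f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative from the derived identity f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numeric_derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Sample points are arbitrary; any moderate values of x would do.
for x in (-2.0, 0.0, 0.5, 3.0):
    analytic = sigmoid_derivative(x)
    numeric = numeric_derivative(sigmoid, x)
    print(f"x={x:+.1f}  analytic={analytic:.8f}  numeric={numeric:.8f}")
```

The two columns should match to several decimal places; a mismatch would point to an algebra slip somewhere in the derivation.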
3. Deriving the derivative of the tanh function
- The tanh function:

$$
g(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}
$$
As before, we rewrite the function so that it is easier to differentiate:

$$
\begin{aligned}
g(x)&=\frac{e^x-e^{-x}}{e^x+e^{-x}} \\
g(x)&=\frac{e^x+e^{-x}-2e^{-x}}{e^x+e^{-x}} \\
g(x)&=1-\frac{2e^{-x}}{e^x+e^{-x}} \\
g(x)&=1-\frac{2}{e^{2x}+1} \\
g(x)&=1-2\cdot(e^{2x}+1)^{-1}
\end{aligned}
$$
Again, let:

$$
\begin{aligned}
z&=e^{2x}+1 \\
g(x)&=1-2z^{-1} \\
\because \frac{dg}{dx}&=\frac{dg}{dz}\cdot\frac{dz}{dx} \\
\therefore g'(x)&=(-2)\times(-1)\times(e^{2x}+1)^{-2}\times 2e^{2x}\\
g'(x)&=\frac{4e^{2x}}{(e^{2x}+1)^2} \\
g'(x)&=\frac{(e^{2x}+1)^2-e^{4x}-1+2e^{2x}}{(e^{2x}+1)^2} \\
g'(x)&=\frac{(e^{2x}+1)^2-(e^{2x}-1)^2}{(e^{2x}+1)^2} \\
g'(x)&=1-\frac{(e^{2x}-1)^2}{(e^{2x}+1)^2} \\
g'(x)&=1-\left(\frac{e^{2x}-1}{e^{2x}+1}\right)^2 \\
&\text{(divide the numerator and denominator inside the parentheses by } e^x\text{)} \\
g'(x)&=1-\left(\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\right)^2 \\
\because g(x)&=\frac{e^x-e^{-x}}{e^x+e^{-x}} \\
\therefore g'(x)&=1-g(x)^2
\end{aligned}
$$
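The tanh derivation passes through the intermediate form $\frac{4e^{2x}}{(e^{2x}+1)^2}$ before reaching $1-g(x)^2$. The sketch below, with function names and sample points that are my own assumptions, checks that both forms agree with each other and with a finite-difference estimate.

```python
import math


def tanh_from_definition(x):
    """g(x) = (e^x - e^(-x)) / (e^x + e^(-x)), as defined in the text."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def tanh_derivative(x):
    """Final result of the derivation: g'(x) = 1 - g(x)^2."""
    g = tanh_from_definition(x)
    return 1.0 - g * g


def tanh_derivative_intermediate(x):
    """Intermediate form from the derivation: 4 e^(2x) / (e^(2x) + 1)^2."""
    e2x = math.exp(2.0 * x)
    return 4.0 * e2x / (e2x + 1.0) ** 2


def numeric_derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Sample points are arbitrary moderate values.
for x in (-1.5, 0.0, 0.7, 2.0):
    final = tanh_derivative(x)
    mid = tanh_derivative_intermediate(x)
    numeric = numeric_derivative(tanh_from_definition, x)
    print(f"x={x:+.1f}  1-g^2={final:.8f}  4e^2x/(e^2x+1)^2={mid:.8f}  numeric={numeric:.8f}")
```

For large $|x|$ the textbook definition can overflow in `math.exp`; in practice one would likely call `math.tanh` instead, which is used here only as a cross-check in spirit, not in the code.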
4. Summary
- Derivative of the sigmoid function: $f'(x)=f(x)\cdot(1-f(x))$
- Derivative of the tanh function: $g'(x)=1-g(x)^2$
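One practical reading of these two results: each derivative is expressed through the function's own output, so once the activation value is known, the derivative needs no further exponentials. The sketch below illustrates this; the helper names are hypothetical and do not come from any library.

```python
import math


def sigmoid_grad_from_output(y):
    """Given y = f(x) for the sigmoid, return f'(x) = y * (1 - y)."""
    return y * (1.0 - y)


def tanh_grad_from_output(y):
    """Given y = g(x) for tanh, return g'(x) = 1 - y^2."""
    return 1.0 - y * y


x = 0.0
y_sigmoid = 1.0 / (1.0 + math.exp(-x))  # forward value, computed once
y_tanh = math.tanh(x)                   # forward value, computed once

print(sigmoid_grad_from_output(y_sigmoid))  # 0.25
print(tanh_grad_from_output(y_tanh))        # 1.0
```

At $x=0$ both derivatives reach their largest value: $0.25$ for the sigmoid and $1$ for tanh.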
If you found this useful, please give it a like.




