Fitting a Two-Input Function with a Neural Network in Python

This content is licensed under CC 4.0 BY-SA.

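This article describes how to build a neural network in Python, starting with a single-input single-output (SISO) model trained on sin(x)+0.01e^x and then extending it to multiple inputs to handle a two-variable function. The network structure and parameters are adjusted step by step to improve the fit until the target function is fitted successfully.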

Problem

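Choose a two-variable nonlinear function, for example $z = f (x, y)$, build a neural network and train it so that it fits this function over its domain.

Problem analysis

The chosen two-variable nonlinear function is:
$z= \sin x+0.01e^y$, $x\in [0,6],y\in [0,6]$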
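
As a small sketch of how training samples for this function can be generated on a grid: the helper name `target` and the 13-point grid per axis are assumptions chosen for illustration, not part of the problem statement.

```python
import numpy as np

def target(x, y):
    """The chosen two-variable function z = sin(x) + 0.01 * e^y."""
    return np.sin(x) + 0.01 * np.exp(y)

# 13 evenly spaced points per axis on [0, 6] (spacing 0.5); an assumed grid.
x = np.linspace(0, 6, 13)
y = np.linspace(0, 6, 13)
X, Y = np.meshgrid(x, y)   # X[j, i] = x[i], Y[j, i] = y[j]
z = target(X, Y)           # z[j, i] is the sample at (x[i], y[j])
print(z.shape)             # (13, 13)
```

With this layout, row `j` of `z` belongs to `y[j]` and column `i` to `x[i]`, which is the order that surface-plotting functions expect from `np.meshgrid`.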

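This is a neural-network exercise. MATLAB has libraries and toolboxes designed specifically for neural networks, and many predecessors have used them, but mostly because the Python libraries were still immature at the time. With the development of Python in recent years, more and more libraries have been added and the Python ecosystem is now very complete. In addition, Python is open source, so there is no need to worry about MATLAB licensing problems that might arise for political reasons. This assignment therefore attempts to use Python.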

The assignment proceeds as follows:

stating the problem, analysing the problem, groundwork, revisiting the problem, solving the problem.

Groundwork

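Start with a basic single-input single-output (SISO) neural network. Once that basic problem is solved, move on to a multi-input network and then to the nonlinear function fitting problem stated above.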

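To obtain a strongly nonlinear target, the function $y=\sin x+0.01e^x$ is chosen for fitting. Its graph is shown in the figure.
[Figure: graph of $y=\sin x+0.01e^x$]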

Network setup

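As required, the input layer consists of one neuron with a single input and 5 outputs. Its output equals its input, i.e. the weight is 1, the threshold is 0 and the activation is the identity. The hidden layer has 5 neurons, giving the structure shown in the single-input topology figure. From the figure, the hidden-layer neurons have 5 input weights in total, namely $w_{11}^{(1)},w_{12}^{(1)},w_{13}^{(1)},w_{14}^{(1)},w_{15}^{(1)}$. Their thresholds are defined as $B1$, an array of dimension 5. The output layer is a single neuron with 5 inputs and one output; its 5 input weights are $w_{1}^{(2)},w_{2}^{(2)},w_{3}^{(2)},w_{4}^{(2)},w_{5}^{(2)}$, and its threshold is defined as $B2$, a $1\times 1$ array.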

For the input layer, the output is simply the input.

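For the hidden layer, the input is $hide\_in=x\cdot w-b$ and the output is $hide\_out=f(hide\_in)$, where $f$ is the activation function. The choice of $f$ is somewhat arbitrary; the S-shaped $Sigmoid$ function is chosen because its output range is bounded and it is easy to differentiate, so the computation is cheap, the result does not diverge easily and the programming is simple. Therefore:

$$hide\_out=sigmoid(hide\_in)=\frac{1}{1+e^{-hide\_in}}$$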
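
The "easy to differentiate" property can be checked numerically. The following is a minimal sketch, assuming a helper named `sigmoid_grad` (a name introduced here for illustration) that returns $sigmoid(v)\cdot(1-sigmoid(v))$, compared against a central finite difference.

```python
import numpy as np

def sigmoid(v):
    """S-shaped activation 1 / (1 + e^(-v)); works on scalars and arrays."""
    return 1.0 / (1.0 + np.exp(-v))

def sigmoid_grad(v):
    """Closed-form derivative sigmoid(v) * (1 - sigmoid(v))."""
    s = sigmoid(v)
    return s * (1.0 - s)

# Compare the closed form with a central finite difference at a few points.
v = np.linspace(-4, 4, 9)
h = 1e-6
numeric = (sigmoid(v + h) - sigmoid(v - h)) / (2 * h)
max_diff = np.max(np.abs(numeric - sigmoid_grad(v)))
print(max_diff)  # tiny, limited only by floating-point round-off
```

Because the derivative reuses the value of the sigmoid itself, no additional exponential has to be evaluated when the error is propagated backwards.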

For the output layer there is only one neuron, with 5 inputs and a single output. Writing its weights as a vector $W2_{(5\times 1)}$ gives $y\_out = hide\_out_{(1\times 5)}\cdot W2_{(5\times 1)}- B2_{(1\times 1)}$
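
The forward pass described so far can be sketched in a few lines. This is an illustrative sketch, not the author's program: the function name `forward` and the hand-picked parameter values are assumptions, and the arrays are stored transposed relative to the formula (`hide_out` as a 5×1 column, `W2` as a 1×5 row), which gives the same product.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward(x, W1, B1, W2, B2):
    """Forward pass of the 1-5-1 network for one scalar input x."""
    hide_in = x * W1 - B1         # hidden-layer input, shape (5, 1)
    hide_out = sigmoid(hide_in)   # hidden-layer output, shape (5, 1)
    y_out = W2 @ hide_out - B2    # network output, shape (1, 1)
    return hide_in, hide_out, y_out

# Hand-checkable parameters (assumed values, not trained ones).
W1 = np.zeros((5, 1))
B1 = np.zeros((5, 1))
W2 = np.ones((1, 5))
B2 = np.full((1, 1), 0.5)

hide_in, hide_out, y_out = forward(3.0, W1, B1, W2, B2)
print(hide_out.ravel())  # every entry is sigmoid(0) = 0.5
print(y_out)             # 5 * 0.5 - 0.5 = 2.0
```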

Error

The error is computed as $e = y\_out - y[i]$, where $y [i]$ is the actual value of each training sample.

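Backward correction

A key feature of a neural network is that it can be corrected repeatedly: the fitted model is adjusted automatically according to the size of the error and ends up approaching the samples ever more closely.
The quantities being trained are the weights of each input and the thresholds of the neurons. The output layer is corrected first, where $threshold$ is the step size of each correction (the learning rate):
$$\begin{aligned} dB2 &= -threshold \cdot e \\ dW2 &= e \cdot threshold \cdot hide\_out \end{aligned}$$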

Next the hidden layer is corrected (the input layer has no adjustable weights or thresholds, so it needs no correction), as in the equations below, where $sigmoid(hide\_in)\cdot (1 - sigmoid(hide\_in))$ is the derivative of $sigmoid(hide\_in)$.
$$\begin{aligned} dB1 &= W2 \cdot sigmoid(hide\_in)\cdot (1 - sigmoid(hide\_in)) \cdot (-e) \cdot threshold \\ dW1 &= W2 \cdot sigmoid(hide\_in)\cdot (1 - sigmoid(hide\_in)) \cdot x \cdot e \cdot threshold \end{aligned}$$

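Finally, subtracting the corresponding corrections dW and dB from W and B gives the new weights and thresholds; a loop repeats this for every iteration:
$$\begin{aligned} W1 &= W1 - dW1\\ B1 &= B1 - dB1\\ W2 &= W2 - dW2\\ B2 &= B2 - dB2 \end{aligned}$$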
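
The correction rules can be collected into one update step. The sketch below is a vectorized restatement under stated assumptions: the function name `train_step`, the parameter name `lr` (the text calls it $threshold$) and the starting values are all chosen here for illustration. It applies the step repeatedly to a single sample to show that the error shrinks.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_step(x, y, W1, B1, W2, B2, lr=0.005):
    """One correction of all parameters for a single sample (x, y)."""
    hide_in = x * W1 - B1
    hide_out = sigmoid(hide_in)
    e = (W2 @ hide_out - B2).item() - y               # scalar error
    grad_hidden = W2.T * hide_out * (1.0 - hide_out)  # W2 * sigmoid', shape (5, 1)
    dB2 = -lr * e
    dW2 = lr * e * hide_out.T
    dB1 = -lr * e * grad_hidden
    dW1 = lr * e * x * grad_hidden
    return W1 - dW1, B1 - dB1, W2 - dW2, B2 - dB2, e

# Assumed starting values; any small values would do.
W1 = np.full((5, 1), 0.5)
B1 = np.full((5, 1), 0.5)
W2 = np.full((1, 5), 0.5)
B2 = np.full((1, 1), 0.5)

x = 4.0
y = np.sin(x) + 0.01 * np.exp(x)
errors = []
for _ in range(200):
    W1, B1, W2, B2, e = train_step(x, y, W1, B1, W2, B2)
    errors.append(e)
print(abs(errors[0]), abs(errors[-1]))  # the second value is much smaller
```

All four corrections are computed from the parameters as they were before the step, and only then subtracted, which matches the order used in the equations.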

Program and results

The following program is written:

# edit by JBR, 2020-10-31
import numpy as np
import math
import matplotlib.pyplot as plt
x = np.linspace(0, 6, 30)
x_size = x.size
y = np.zeros((x_size, 1))
for i in range(x_size):
    y[i] = math.sin(x[i])+0.01*math.e**x[i]  # function to be fitted
hidesize = 5  # number of hidden-layer neurons
W1 = np.random.random((hidesize, 1))  # weights between the input layer and the hidden layer
B1 = np.random.random((hidesize, 1))  # thresholds of the hidden-layer neurons
W2 = np.random.random((1, hidesize))  # weights between the hidden layer and the output layer
B2 = np.random.random((1, 1))  # threshold of the output neuron
threshold = 0.005  # learning rate (step size of each correction)
max_steps = 5000  # maximum number of iterations; training stops after this many
def sigmoid(x_):
    y_ = 1 / (1 + np.exp(-x_))  # np.exp accepts the one-element arrays passed in below
    return y_
E = np.zeros((max_steps, 1))  # total error in each iteration
Y = np.zeros((x_size, 1))  # model output
for k in range(max_steps):   # the for loop increments k itself; no k = k + 1 is needed
    temp = 0
    for i in range(x_size):
        hide_in = np.dot(x[i], W1) - B1  # hidden-layer input; W1 has hidesize rows and 1 column
        hide_out = np.zeros((hidesize, 1))  # hidden-layer output, initialised here
        for j in range(hidesize):
            hide_out[j] = sigmoid(hide_in[j])  # compute hide_out
        y_out = np.dot(W2, hide_out) - B2  # model output
        Y[i] = y_out
        e = y_out - y[i]  # model output minus the actual value gives the error
        # feedback: compute the corrections
        dB2 = -1 * threshold * e
        dW2 = e * threshold * np.transpose(hide_out)
        dB1 = np.zeros((hidesize, 1))
        for j in range(hidesize):
            # sigmoid(hide_in[j]) * (1 - sigmoid(hide_in[j])) is the derivative of the sigmoid
            dB1[j] = np.dot(np.dot(W2[0][j], sigmoid(hide_in[j])), (1 - sigmoid(hide_in[j])) * (-1) * e * threshold)
        dW1 = np.zeros((hidesize, 1))
        for j in range(hidesize):
            dW1[j] = np.dot(np.dot(W2[0][j], sigmoid(hide_in[j])), (1 - sigmoid(hide_in[j])) * x[i] * e * threshold)
        # apply the corrections
        W1 = W1 - dW1
        B1 = B1 - dB1
        W2 = W2 - dW2
        B2 = B2 - dB2
        temp = temp + abs(e)
    E[k] = temp
    if k % 50 == 0:
        print(k)  # progress indicator
plt.figure()
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can display Chinese labels
plt.xlabel("x")
plt.ylabel("y,Y")
plt.title('y=sinx+0.01e^x')
plt.plot(x, y)  # target function
plt.plot(x, Y, color='red', linestyle='--')  # network output
plt.show()

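The program output is shown in the figure.
[Figure: fit of $y=\sin x+0.01e^x$ after 5000 iterations]

Some fitting is visible, but the result is poor. Increasing the number of iterations from 5000 to 20000 (that is, changing $max\_steps = 5000$ on line 16 of the program to $max\_steps = 20000$) gives the result shown in the 20000-iteration figure. The network now fits the original function well, so the experiment succeeds.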
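
The program stores the total absolute error of every iteration in `E` but never inspects it. As a hedged sketch of how the effect of `max_steps` could be judged numerically rather than by eye, the function below (the name `train`, the fixed `seed` and the vectorized form are assumptions, not the author's code) runs the same per-sample corrections and returns that error history.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train(max_steps, hidesize=5, lr=0.005, seed=0):
    """Train the 1-hidesize-1 network; return the total |error| per iteration."""
    rng = np.random.default_rng(seed)   # fixed seed so runs are repeatable
    x = np.linspace(0, 6, 30)
    y = np.sin(x) + 0.01 * np.exp(x)
    W1 = rng.random((hidesize, 1))
    B1 = rng.random((hidesize, 1))
    W2 = rng.random((1, hidesize))
    B2 = rng.random((1, 1))
    E = np.zeros(max_steps)
    for k in range(max_steps):
        for xi, yi in zip(x, y):
            hide_out = sigmoid(xi * W1 - B1)
            e = (W2 @ hide_out - B2).item() - yi
            g = W2.T * hide_out * (1.0 - hide_out)  # uses W2 before its update
            W1 = W1 - lr * e * xi * g
            B1 = B1 + lr * e * g                    # B1 - dB1 with dB1 = -lr * e * g
            W2 = W2 - lr * e * hide_out.T
            B2 = B2 + lr * e                        # B2 - dB2 with dB2 = -lr * e
            E[k] += abs(e)
    return E

E = train(max_steps=500)
print(E[0], E[-1])  # total error in the first and in the last iteration
```

Calling `train` with a larger `max_steps` and comparing the last entries of the returned arrays gives a direct measure of how much the extra iterations help.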

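Revisiting the problem

The single-input single-output system has now been fitted successfully. Turning the program into one that fits a multi-input system requires careful changes to the array dimensions.

The multi-input single-output function used is the two-variable nonlinear function defined in the problem statement.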

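Network setup

As required, the input layer consists of 2 neurons, each with a single input and 7 outputs. Their outputs equal their inputs, i.e. the weight is 1, the threshold is 0 and the activation is the identity. The hidden layer has 7 neurons and the output layer is a single neuron with 7 inputs and one output, giving the structure shown in the two-input topology figure. From the figure, the hidden-layer neurons have 14 input weights, 7 for each input, namely: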

$w_{11}^{(1)},w_{12}^{(1)},w_{13}^{(1)},w_{14}^{(1)},w_{15}^{(1)},w_{16}^{(1)},w_{17}^{(1)};w_{21}^{(1)},w_{22}^{(1)},w_{23}^{(1)},w_{24}^{(1)},w_{25}^{(1)},w_{26}^{(1)},w_{27}^{(1)}$

The thresholds are defined as $B1$, an array of dimension 7. The output layer is a single neuron with 7 inputs and one output; its 7 input weights are $w_{1}^{(2)},w_{2}^{(2)},w_{3}^{(2)},w_{4}^{(2)},w_{5}^{(2)},w_{6}^{(2)},w_{7}^{(2)}$, and its threshold is defined as $B2$, a $1\times 1$ array.
[Figure: schematic of the two-input, single-output neural network]

Forward pass

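For the input layer, the output is simply the input.

For the hidden layer, the input is $hide\_in=(x,y)\times w-b$ and the output is $hide\_out=f(hide\_in)$, where $f$ is the activation function. The choice is again somewhat arbitrary, and the S-shaped $Sigmoid$ function is used as before.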

For the output layer there is only one neuron, with 7 inputs and a single output. Writing its weights as a vector $W2_{(7\times 1)}$ gives $z\_out = hide\_out_{(1\times 7)}\cdot W2_{(7\times 1)}- B2_{(1\times 1)}$
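
The expression $(x,y)\times w$ can be read either as two separate weight columns, one per input, or as a single 7×2 weight matrix applied to the input vector. The sketch below, with randomly chosen illustrative parameters and names such as `W1x`, `W1y` and `inputs` assumed here, shows that both forms give the same hidden-layer input.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)   # seed chosen only for reproducibility
hidesize = 7
W1x = rng.random((hidesize, 1))  # weights from input x to the hidden layer
W1y = rng.random((hidesize, 1))  # weights from input y to the hidden layer
B1 = rng.random((hidesize, 1))
W2 = rng.random((1, hidesize))
B2 = rng.random((1, 1))

x, y = 1.5, 4.0

# Per-input form: one weight column for x and one for y.
hide_in_split = x * W1x + y * W1y - B1

# Stacked form: a (7, 2) weight matrix times the (2, 1) input vector.
W1 = np.hstack([W1x, W1y])
inputs = np.array([[x], [y]])
hide_in_stacked = W1 @ inputs - B1

z_out = W2 @ sigmoid(hide_in_stacked) - B2
print(np.allclose(hide_in_split, hide_in_stacked))  # True
print(z_out.shape)                                  # (1, 1)
```

Keeping the two columns separate makes it explicit that each input contributes its own set of weights to be corrected; the stacked form is more compact when more inputs are added.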
Error

The error is computed as $e = z\_out - z[j][i]$, where $z [j] [i]$ is the actual value of $z=\sin x+0.01e^y$ at $y[j]$ and $x[i]$ for each training sample.

Backward correction

The correction starts from the output layer:
$$\begin{aligned} dB2 &= -threshold \cdot e \\ dW2 &= e \cdot threshold \cdot hide\_out \end{aligned}$$

The hidden layer is corrected next:
$$\begin{aligned} dB1 &= W2\cdot sigmoid(hide\_in)\cdot (1 - sigmoid(hide\_in))\cdot (-e) \cdot threshold\\ dW1y &= W2\cdot sigmoid(hide\_in)\cdot (1 - sigmoid(hide\_in))\cdot y\cdot e \cdot threshold \\ W1y &= W1y - dW1y \\ dW1x &= W2 \cdot sigmoid(hide\_in) \cdot (1 - sigmoid(hide\_in))\cdot x \cdot e \cdot threshold \\ W1x &= W1x - dW1x \end{aligned}$$

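The hidden-layer correction for the two-input network differs from the single-input one only in dimension: extra statements are needed so that the weights belonging to the x input and those belonging to the y input are corrected separately.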
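
One way to gain confidence in the separate x and y corrections is a finite-difference check. The sketch below assumes that the corrections are gradient steps on the squared error $\frac{1}{2}e^2$ (an interpretation, not stated in the text); the helper names `loss` and `corrections`, the random parameters and the sample point are all illustrative.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def loss(params, x, y, z):
    """Half the squared error for one sample; assumed training objective."""
    W1x, W1y, B1, W2, B2 = params
    hide_out = sigmoid(x * W1x + y * W1y - B1)
    e = (W2 @ hide_out - B2).item() - z
    return 0.5 * e * e

def corrections(params, x, y, z, lr):
    """Corrections for all parameters, following the equations in the text."""
    W1x, W1y, B1, W2, B2 = params
    hide_out = sigmoid(x * W1x + y * W1y - B1)
    e = (W2 @ hide_out - B2).item() - z
    g = W2.T * hide_out * (1.0 - hide_out)   # W2 * sigmoid', shape (7, 1)
    return {"dW1x": lr * e * x * g, "dW1y": lr * e * y * g,
            "dB1": -lr * e * g, "dW2": lr * e * hide_out.T, "dB2": -lr * e}

rng = np.random.default_rng(1)   # illustrative random parameters
shapes = [(7, 1), (7, 1), (7, 1), (1, 7), (1, 1)]
params = [rng.random(s) for s in shapes]
x, y, lr = 1.5, 2.5, 0.007
z = np.sin(x) + 0.01 * np.exp(y)

# Central finite difference with respect to the first y-weight, W1y[0, 0].
h = 1e-6
plus = [p.copy() for p in params]
minus = [p.copy() for p in params]
plus[1][0, 0] += h
minus[1][0, 0] -= h
numeric = (loss(plus, x, y, z) - loss(minus, x, y, z)) / (2 * h)
analytic = corrections(params, x, y, z, lr)["dW1y"][0, 0] / lr
print(numeric, analytic)  # the two values agree closely
```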

Program and results

# edit by JBR, 2020-10-31
import numpy as np
import math
import matplotlib.pyplot as plt
import pylab as pl
import mpl_toolkits.mplot3d  # registers the '3d' projection on older Matplotlib versions
x = np.linspace(0, 6, 13)
y = np.linspace(0, 6, 13)
[X, Y] = np.meshgrid(x, y)
x_size = x.size
y_size = y.size
z = np.zeros((y_size, x_size))
for i in range(x_size):
    for j in range(y_size):
        z[j][i] = math.sin(x[i]) + 0.01 * math.e ** y[j]
        # row j corresponds to y[j] and column i to x[i], matching np.meshgrid
hidesize = 7  # number of hidden-layer neurons
W1x = np.random.random((hidesize, 1))  # weights from input x to the hidden layer
W1y = np.random.random((hidesize, 1))  # weights from input y to the hidden layer
B1 = np.random.random((hidesize, 1))  # thresholds of the hidden-layer neurons
W2 = np.random.random((1, hidesize))  # weights between the hidden layer and the output layer
B2 = np.random.random((1, 1))  # threshold of the output neuron
threshold = 0.007  # learning rate (step size of each correction)
max_steps = 20  # maximum number of iterations; training stops after this many
def sigmoid(x_):  # x_ and y_ are local to this function and unrelated to the inputs x and y
    y_ = 1 / (1 + np.exp(-x_))  # np.exp accepts the one-element arrays passed in below
    return y_
E = np.zeros((max_steps, 1))  # total error in each iteration
Z = np.zeros((y_size, x_size))  # model output, indexed like z
for k in range(max_steps):
    temp = 0
    for i in range(x_size):
        for j in range(y_size):
            hide_in = np.dot(x[i], W1x) + np.dot(y[j], W1y) - B1  # hidden-layer input
            hide_out = np.zeros((hidesize, 1))  # hidden-layer output
            for m in range(hidesize):
                hide_out[m] = sigmoid(hide_in[m])  # compute hide_out
            z_out = np.dot(W2, hide_out) - B2  # model output, computed once hide_out is complete
            Z[j][i] = z_out[0][0]
            e = z_out - z[j][i]  # model output minus the actual value gives the error
            # feedback: compute the corrections
            dB2 = -1 * threshold * e
            dW2 = e * threshold * np.transpose(hide_out)
            dB1 = np.zeros((hidesize, 1))
            for m in range(hidesize):
                dB1[m] = np.dot(np.dot(W2[0][m], sigmoid(hide_in[m])), (1 - sigmoid(hide_in[m])) * (-1) * e * threshold)
            dW1x = np.zeros((hidesize, 1))
            dW1y = np.zeros((hidesize, 1))
            for m in range(hidesize):
                dW1y[m] = np.dot(np.dot(W2[0][m], sigmoid(hide_in[m])), (1 - sigmoid(hide_in[m])) * y[j] * e * threshold)
            W1y = W1y - dW1y  # correct the weights of the y input
            for m in range(hidesize):
                dW1x[m] = np.dot(np.dot(W2[0][m], sigmoid(hide_in[m])), (1 - sigmoid(hide_in[m])) * x[i] * e * threshold)
            W1x = W1x - dW1x  # correct the weights of the x input
            B1 = B1 - dB1
            W2 = W2 - dW2
            B2 = B2 - dB2
            temp = temp + abs(e)
    E[k] = temp
    if k % 2 == 0:
        print(k)  # progress indicator
# create a figure with 3D axes for the target surface
fig = plt.figure()
# set the figure information
ax = plt.axes(projection='3d')
ax.set_title("z=sinx+0.01e^y")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
ax.plot_surface(X, Y, z, cmap='rainbow')
# second figure: the surface produced by the network
plt.figure()
ax = plt.axes(projection='3d')
ax.set_title("fitting z=sin x+0.01e^y")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
print(x)
print(z)
print(Z)
ax.plot_surface(X, Y, Z, cmap='rainbow')
plt.show()

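Running the program gives the result shown in the figure.

[Figure: target surface and fitted surface of $z=\sin x+0.01e^y$ after 20 iterations]

Comparing the two surfaces shows some fitting, but the result is poor. A possible cause is that the number of samples and the number of neurons are too small: with two inputs and one output, the result does not depend on a single input alone, so a more complex hidden layer is needed.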

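Increasing the number of iterations from 20 to 200 gives the result shown in the 200-iteration figure. The fit is already very good, but the program takes too long to run and places heavy demands on the computer's speed and memory. The number of iterations is therefore reduced somewhat while the number of hidden neurons is increased: the hidden layer is set to 15 neurons (that is, line 17 of the program, $hidesize = 7$, is changed to $hidesize = 15$) and the number of iterations to 100. The result, shown in the corresponding figure, fits the original function well, so the experiment succeeds.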
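
Judging the fit from surface plots alone is subjective, and evaluating the network point by point is part of what makes the program slow. As a hedged sketch, the function below (the name `predict_grid` is an assumption) evaluates a network with the same parameter shapes over the whole grid in one vectorized call and reports the mean absolute error. The parameters here are random placeholders, not trained values, so the printed error only demonstrates the measurement.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def predict_grid(x, y, W1x, W1y, B1, W2, B2):
    """Network output on the full grid; result[j, i] belongs to (x[i], y[j])."""
    X, Y = np.meshgrid(x, y)                                # both (ny, nx)
    # Broadcast each hidden neuron's weights over the grid: (hidesize, ny, nx).
    hide_in = W1x[:, :, None] * X + W1y[:, :, None] * Y - B1[:, :, None]
    hide_out = sigmoid(hide_in)
    # Weighted sum over the hidden neurons, then subtract the output threshold.
    return np.tensordot(W2[0], hide_out, axes=1) - B2[0, 0]

rng = np.random.default_rng(0)   # placeholder parameters, not trained values
hidesize = 15
shapes = [(hidesize, 1), (hidesize, 1), (hidesize, 1), (1, hidesize), (1, 1)]
params = [rng.random(s) for s in shapes]

x = np.linspace(0, 6, 13)
y = np.linspace(0, 6, 13)
Z = predict_grid(x, y, *params)

X, Y = np.meshgrid(x, y)
z = np.sin(X) + 0.01 * np.exp(Y)
mean_abs_error = np.mean(np.abs(Z - z))
print(Z.shape, mean_abs_error)
```

Computing `mean_abs_error` for the trained parameters after each configuration (7 neurons with 200 iterations, 15 neurons with 100 iterations) would turn the visual comparison into a single number per run.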