What is the Recurrent Neural Network?
Defination
A recurrent neural network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a temporal sequence.
The sturcture of RNN
A recurrent neural network comprises an input layer, a hidden layer, and an output layer.
The input layer is responsible for fetching the data, which performs the data preprocessing, followed by passing the filtered data into the hidden layer.
A hidden layer consists of neural networks, algorithms, and activation functions for retrieving useful information out of the data. Finally, the information is sent to the output layer to provide the expected outcome.

RNN vs feed-forward neural network
Here we compare RNN with feed-forward network.
As a feed-forward neural network considers only a current input, it has no perception of what has happened in the past except the training procedures.
But in RNN, The information that passed through the architecture goes through a loop. Each input is dependent on the previous one for making decisions. RNN assigns the same and equal weight and bias for each of the layers in the network.
The independent variables are converted to dependent variables.

Long-Term Dependencies
That is to say, W can not be used in neural network training for a long time.
Sometimes we just need to look at recent information to perform the present task. For example, consider a language model trying to predict the last word in “ the clouds are in the sky”. Here it’s easy to predict the next word as sky based on the previous words. But consider the sentence “ I grew up in France I speak fluent French. “ Here it is not easy to predict that the language is French directly. It depends on previous input also. In such sentences it’s entirely possible for the gap between the relevant information and the point where it is needed to become very large. In theory, RNN’s are absolutely capable of handling such “long-term dependencies.”
A human could carefully pick parameters for them to solve toy problems of this form. Sadly, in practice, recurrent neural network don’t seem to be able to learn them.This problem is called Vanishing gradient problem.The neural network updates the weight using the gradient descent algorithm. The gradients grow smaller when the network progress down to lower layers.The gradients will stay constant meaning there is no space for improvement. The model learns from a change in the gradient. This change affects the network’s output. However, if the difference in the gradients is very small network will not learn anything and so no difference in the output. Therefore, a network facing a vanishing gradient problem cannot converge towards a good solution.

The characteristics of RNN
- The weights are shared, so W is all the same, U and V are all the same.
- The previous output affects the subsequent output and is suitable for processing sequence data.
- Losses also accumulate with the recommendation of the sequence.
Application
RNN is used in many ways. Such as:
Automatic Speech Recognition: to generate the corresponding speech text information.
Machine translation: The interchange between different languages.
Generate image description: Give a picture that describes what’s in the picture.

本文深入解析了循环神经网络(RNN)的定义、结构特点,将其与前馈神经网络对比,并探讨了长短期依赖问题及训练中的Vanishing Gradient难题。重点介绍了RNN在语音识别、机器翻译和图像描述等领域的应用。

1万+

被折叠的 条评论
为什么被折叠?



