2020.1.13 note
AdderNet: Do We Really Need Multiplications in Deep Learning?
Compared with the cheap addition operation, multiplication has much higher computational complexity. The widely used convolutions in deep neural networks are in fact cross-correlations that measure the similarity between the input features and the convolution filters, which involves a massive number of multiplications between floating-point values. In this paper, they present adder networks (AdderNets), which trade these multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, they take the L1-norm distance between the filters and the input features as the output response. The influence of this new similarity measure on the optimization of neural networks is thoroughly analyzed. To achieve better performance, they develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. They then propose an adaptive learning rate strategy that enhances the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers.
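To make the L1-distance response concrete, here is a minimal NumPy sketch of the forward pass of one adder layer. It is an illustration, not the authors' implementation: the function name `adder_response`, the tensor layout (height, width, channels), stride 1, and no padding are all assumptions made for this note. The output is taken as the negative L1 distance, so a patch that matches a filter exactly gives the largest response (zero).

```python
import numpy as np

def adder_response(x, filters):
    """Negative L1 distance between every input patch and every filter.

    x:       input feature map, shape (H, W, C_in)        (assumed layout)
    filters: filter bank, shape (k, k, C_in, C_out)       (assumed layout)
    returns: response map, shape (H - k + 1, W - k + 1, C_out)
    """
    h, w, _ = x.shape
    k, _, _, c_out = filters.shape
    out = np.empty((h - k + 1, w - k + 1, c_out), dtype=np.float64)
    for m in range(h - k + 1):
        for n in range(w - k + 1):
            patch = x[m:m + k, n:n + k, :]              # (k, k, C_in)
            # Broadcast the patch against all filters: only subtraction,
            # absolute value, and summation are used, no multiplication.
            diff = np.abs(patch[..., None] - filters)   # (k, k, C_in, C_out)
            out[m, n, :] = -diff.sum(axis=(0, 1, 2))
    return out
```

An ordinary convolution would compute `(patch[..., None] * filters).sum(...)` at the same place; swapping the product for an absolute difference is the only change in the forward pass.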

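This blog post discusses how AdderNet replaces multiplication with addition to reduce the computational complexity of deep learning, along with theory and algorithms for optimizing deep learning. The research finds that neural networks search for solutions that generalize through a self-adjusting annealing strategy, and that AdderNets still reach high accuracy on ImageNet without multiplication operations. It also discusses the recognizability of CNN-generated images, the properties of the loss surface, and the characteristics of the decision boundaries of deep neural networks.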




