【源头活水】Reading Notes on Network in Network (NiN)
"Ask how the pond can stay so clear: because living water keeps flowing in from its source." Learning from cutting-edge work and drawing inspiration from other research fields, so as to understand the essence of one's own research problems more clearly, is an inexhaustible source of self-improvement. For this reason we curate selected paper reading notes in the "源头活水" (Living Water) column, to help you read the research literature broadly and deeply. Stay tuned.
Link: https://www.zhihu.com/people/zhi-zuo-ren-23
01
Strengthening nonlinear feature extraction
By abstraction we mean that the feature is invariant to the variants of the same concept
The conventional convolutional layer uses linear filters followed by a nonlinear activation function to scan the input.
CNN implicitly makes the assumption that the latent concepts are linearly separable
However, the data for the same concept often live on a nonlinear manifold, therefore the representations that capture these concepts are generally highly nonlinear function of the input.
(Figure: the connection structure of the MLP used inside an mlpconv layer)
The cross channel parametric pooling layer is also equivalent to a convolution layer with 1x1 convolution kernel
What the 1x1 convolutions bring:
- Added nonlinearity: every convolution is followed by an activation, so the 1x1 convolutions apply further nonlinear activations on top of the representation learned by the preceding layer and raise the network's expressive power. The paper stresses in particular the improvement in extracting high-level abstract features.
- Greater depth with few parameters: they deepen the network while keeping the parameter count small, which improves the model's representational capacity to some extent.
- Cross-channel interaction: a 1x1 kernel changes the number of channels, and the operation is essentially a linear recombination of information across channels, as the code sketch below illustrates.
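A quick way to see the "cross-channel linear combination" point is to check that a 1x1 convolution computes the same thing as a fully connected layer applied independently at every pixel over the channel dimension. The following is my own illustration (the shapes and layer sizes are arbitrary), not code from the paper or the post:

import torch
from torch import nn

# Arbitrary example: 8 input channels, 4 output channels, any spatial size.
x = torch.randn(1, 8, 5, 5)
conv1x1 = nn.Conv2d(8, 4, kernel_size=1)
y_conv = conv1x1(x)

# The same mapping written as a linear layer applied independently at each pixel.
linear = nn.Linear(8, 4)
linear.weight.data = conv1x1.weight.data.view(4, 8)
linear.bias.data = conv1x1.bias.data
y_linear = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

print(torch.allclose(y_conv, y_linear, atol=1e-6))  # True: identical up to float error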
02
Why NiN discards the fully connected layers
- The output of a fully connected layer has to be reshaped back into feature-map form before it can be passed to the next block; although this is easy to do, it is unnatural.
- Large fully connected layers carry far too many parameters; stacking them consumes a great deal of memory and makes the model very prone to overfitting (the main reason); the sketch below compares parameter counts.
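To make the parameter argument concrete, here is a rough, self-contained comparison. The layer sizes are chosen only for illustration and are not taken from the post: an AlexNet-style head that flattens 256x6x6 features into large fully connected layers, versus a NiN-style head that maps to the class count with a 1x1 convolution and then applies global average pooling.

import torch
from torch import nn

# Hypothetical fully connected head over 256x6x6 features (illustrative sizes only).
fc_head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096),
    nn.ReLU(inplace=True),
    nn.Linear(4096, 10),
)

# NiN-style head: 1x1 convolution down to the number of classes, then global average pooling.
nin_head = nn.Sequential(
    nn.Conv2d(256, 10, kernel_size=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

def count_params(m):
    return sum(p.numel() for p in m.parameters())

print(count_params(fc_head))   # about 37.8 million parameters
print(count_params(nin_head))  # 2,570 parameters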
03
Advantages of global average pooling
- It has no parameters, so it avoids overfitting and greatly reduces the memory pressure of the model.
- When the number of feature maps in the last layer is set to the number of classes, the result is very interpretable: each map can be read as a confidence map for one category.
- Because it averages over the whole map, it is more robust to spatial variations such as translation (a small numerical sketch follows the quotation below).
We can see global average pooling as a structural regularizer that explicitly enforces feature maps to be confidence maps of concepts (categories). This is made possible by the mlpconv layers, as they makes better approximation to the confidence maps than GLMs.
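A minimal sketch of that idea, assuming the last mlpconv layer already emits one feature map per category (the shapes here are made up): global average pooling simply takes the spatial mean of each map and has no learnable parameters.

import torch
from torch import nn

num_classes, h, w = 10, 6, 6
feature_maps = torch.randn(1, num_classes, h, w)  # stand-in for per-class confidence maps

gap = nn.AdaptiveAvgPool2d(1)                     # no learnable parameters
logits = gap(feature_maps).flatten(1)             # one score per category

# Each score is just the spatial mean of the corresponding map.
print(torch.allclose(logits, feature_maps.mean(dim=(2, 3))))  # True
print(sum(p.numel() for p in gap.parameters()))               # 0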
04
PyTorch implementation
import torch
from torch import nn


def nin_block(in_channels, out_channels, kernel_size=1, stride=1, padding=0):
    # One mlpconv block: a normal convolution followed by two 1x1 convolutions,
    # each with a ReLU activation.
    block = nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, 1, 1, 0),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, 1, 1, 0),
        nn.ReLU(inplace=True)
    )
    return block


class NiN(nn.Module):
    def __init__(self, num_classes):
        super(NiN, self).__init__()
        self.num_classes = num_classes
        self.network = nn.Sequential(
            # input: (1*227*227)
            nin_block(1, 96, kernel_size=11, stride=4, padding=0),    # (96*55*55)
            nn.Dropout(),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (96*27*27)
            nin_block(96, 256, 5, 1, 2),                              # (256*27*27)
            nn.Dropout(),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (256*13*13)
            nin_block(256, 384, kernel_size=3, stride=1, padding=1),  # (384*13*13)
            nn.Dropout(),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (384*6*6)
            nin_block(384, self.num_classes, 3, 1, 1),                # (num_classes*6*6)
            nn.AdaptiveAvgPool2d(1)                                   # (num_classes*1*1)
        )

    def forward(self, x):
        out = self.network(x)
        out = out.view(out.shape[0], -1)  # flatten to (batch, num_classes)
        return out


def main():
    net = NiN(num_classes=10)
    print(net)
    img = torch.rand((2, 1, 227, 227))
    print(img)
    out = net(img)
    print(out)


if __name__ == '__main__':
    main()
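If you want to double-check the shape comments above, one small addition (mine, not part of the original post) is to run the stages of self.network one by one and print the intermediate feature-map sizes; this assumes the NiN class defined in the code above.

import torch

net = NiN(num_classes=10)
x = torch.rand(1, 1, 227, 227)
for layer in net.network:
    x = layer(x)
    print(layer.__class__.__name__, tuple(x.shape))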
05
References
[1] Lin, M., Chen, Q., & Yan, S. (2013). Network in Network. arXiv preprint arXiv:1312.4400.
[2] https://www.cnblogs.com/missidiot/p/9378079.html
[3] https://www.cnblogs.com/jiangxinyang/p/9314256.html
[4] https://blog.csdn.net/qq_16234613/article/details/79689681
[5] Neural Architecture Search: A Survey. JMLR, 2019.