【他山之石】Implementing a Convolutional Neural Network from Scratch
"Stones from other hills may serve to polish jade" — only by standing on the shoulders of giants can we see higher and go further, and on the road of research a good tailwind helps us move faster. To that end, we have collected practical links to code, datasets, software, programming tips, and more, and opened the "他山之石" column to help you forge ahead. Stay tuned.
Source: https://zhuanlan.zhihu.com/p/355527103
This tutorial is based on PyTorch, using LeNet as the network and MNIST as the training and test dataset. It starts from installing Python, PyTorch, and CUDA, and ends with a working handwritten-digit classifier. It is aimed at readers with a basic familiarity with Python, deep learning, and convolutional neural networks.
01
Python Environment Setup
CUDA Download and Installation
After installation, run `nvcc -V` in a terminal to verify; the output should begin with something like:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
PyTorch Installation
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
Copy this command into a terminal and run it to start the download. If the download is slow, consider using a proxy or switching to a domestic (Chinese) mirror; see link [7] for how to change the source.
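Once the install completes, a quick sanity check (a sketch; the exact version strings printed will depend on the wheel you installed) confirms that PyTorch sees the GPU:

```python
import importlib.util

# Guard the check so the script also runs where torch is not installed yet.
if importlib.util.find_spec("torch") is not None:
    import torch
    print("torch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())  # True means the GPU build works
else:
    print("torch is not installed yet")
```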
Editor Installation
02
03
Analyzing the Model
Building the Model
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, dilation, groups, bias, padding_mode)
in_channels: number of input channels
out_channels: number of output channels (i.e. the number of convolution kernels)
kernel_size: size of the convolution kernel
stride: stride of the convolution (default 1)
padding: zero-padding added to both sides of the input (default 0)
nn.MaxPool2d(kernel_size, stride, padding, dilation, return_indices, ceil_mode)
kernel_size: size of the pooling window
stride: stride of the window (defaults to kernel_size)
nn.ReLU(inplace)
inplace: if True, modify the input tensor in place (default False)
nn.Linear(in_features, out_features, bias)
in_features: size of each input sample
out_features: size of each output sample
bias: whether to learn an additive bias (default True)
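To see why F6 below can use in_features=120, it helps to trace the feature-map size through the layers with the usual output-size formula, out = (in + 2*padding - kernel) // stride + 1. A quick sketch for a 28×28 MNIST input:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a conv or pool layer: (n + 2p - k) // s + 1."""
    return (size + 2 * padding - kernel) // stride + 1

n = 28                    # MNIST images are 28x28
n = conv_out(n, 5, 1, 2)  # C1: 5x5 conv, padding 2 -> 28
n = conv_out(n, 2, 2)     # S2: 2x2 max pool, stride 2 -> 14
n = conv_out(n, 5, 1, 0)  # C3: 5x5 conv -> 10
n = conv_out(n, 2, 2)     # S4: 2x2 max pool -> 5
n = conv_out(n, 5, 1, 0)  # C5: 5x5 conv -> 1
print(n)                  # -> 1: C5 outputs 120 channels of size 1x1, i.e. 120 features
```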
        self.C1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1, padding=2)
        self.R1 = nn.ReLU()
        self.S2 = nn.MaxPool2d(kernel_size=2)
        self.C3 = nn.Conv2d(6, 16, 5, 1, 0)
        self.R2 = nn.ReLU()
        self.S4 = nn.MaxPool2d(2)
        self.C5 = nn.Conv2d(16, 120, 5, 1, 0)
        self.R3 = nn.ReLU()
        self.F6 = nn.Linear(in_features=120, out_features=84)
        self.R4 = nn.ReLU()
        self.OUT = nn.Linear(84, 10)
    def forward(self, x):
        x = self.C1(x)
        x = self.R1(x)
        x = self.S2(x)
        x = self.C3(x)
        x = self.R2(x)
        x = self.S4(x)
        x = self.C5(x)
        x = self.R3(x)
        x = x.view(x.size(0), -1)
        x = self.F6(x)
        x = self.R4(x)
        x = self.OUT(x)
        return x
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.C1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1, padding=2)
        self.R1 = nn.ReLU()
        self.S2 = nn.MaxPool2d(kernel_size=2)
        self.C3 = nn.Conv2d(6, 16, 5, 1, 0)
        self.R2 = nn.ReLU()
        self.S4 = nn.MaxPool2d(2)
        self.C5 = nn.Conv2d(16, 120, 5, 1, 0)
        self.R3 = nn.ReLU()
        self.F6 = nn.Linear(in_features=120, out_features=84)
        self.R4 = nn.ReLU()
        self.OUT = nn.Linear(84, 10)

    def forward(self, x):
        x = self.C1(x)
        x = self.R1(x)
        x = self.S2(x)
        x = self.C3(x)
        x = self.R2(x)
        x = self.S4(x)
        x = self.C5(x)
        x = self.R3(x)
        x = x.view(x.size(0), -1)  # flatten (batch, 120, 1, 1) to (batch, 120)
        x = self.F6(x)
        x = self.R4(x)
        x = self.OUT(x)
        return x

if __name__ == "__main__":
    model = LeNet()
    a = torch.randn(1, 1, 28, 28)  # a dummy batch of one 28x28 grayscale image
    b = model(a)
    print(b)
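As a further sanity check on the layer sizes, the parameter count of this network can be worked out by hand: each conv layer has c_out×c_in×k×k weights plus c_out biases, and each linear layer has f_out×f_in weights plus f_out biases. A small sketch:

```python
def conv_params(c_in, c_out, k):
    """Weights (c_out * c_in * k * k) plus one bias per output channel."""
    return c_out * c_in * k * k + c_out

def linear_params(f_in, f_out):
    """Weights (f_out * f_in) plus one bias per output feature."""
    return f_out * f_in + f_out

total = (conv_params(1, 6, 5)        # C1: 156
         + conv_params(6, 16, 5)     # C3: 2,416
         + conv_params(16, 120, 5)   # C5: 48,120
         + linear_params(120, 84)    # F6: 10,164
         + linear_params(84, 10))    # OUT: 850
print(total)  # -> 61706, the classic LeNet-5 parameter count
```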
04
Downloading the Dataset
import torchvision
torchvision.datasets.MNIST('./data', download=True)
05
Initializing and Importing the Model
import torch
import torchvision
import torch.nn as nn
import torch.utils.data as Data
from model import LeNet
model = LeNet()
Defining Hyperparameters, the Dataset, and the DataLoader
Epoch = 5
batch_size = 64
lr = 0.001
train_data = torchvision.datasets.MNIST(root='./data/', train=True, transform=torchvision.transforms.ToTensor(), download=False)
Having defined the training set, we need a DataLoader to feed the data in train_data to the model. In PyTorch, DataLoader is defined as:
Data.DataLoader(dataset, batch_size, shuffle, sampler, batch_sampler, num_workers, collate_fn, pin_memory, drop_last, timeout, worker_init_fn, prefetch_factor, persistent_workers)
Here are its most commonly used parameters:
dataset: the dataset to load from
batch_size: how many samples per batch
shuffle: whether to reshuffle the data at every epoch
num_workers: number of subprocesses used for data loading (0 means loading in the main process)
drop_last: whether to drop the last incomplete batch
train_loader = Data.DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=0, drop_last=True)
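Conceptually, the DataLoader shuffles the sample indices each epoch, slices them into chunks of batch_size, and with drop_last=True discards the final incomplete chunk. A minimal pure-Python sketch of that batching logic (not the real implementation, which also handles workers and tensor collation):

```python
import random

def simple_loader(dataset, batch_size, shuffle=True, drop_last=True):
    """Yield lists of samples, mimicking DataLoader's batching behaviour."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.shuffle(indices)  # reshuffle once per pass, as shuffle=True does per epoch
    for start in range(0, len(indices), batch_size):
        batch = [dataset[i] for i in indices[start:start + batch_size]]
        if drop_last and len(batch) < batch_size:
            break  # discard the final incomplete batch
        yield batch

data = list(range(10))
batches = list(simple_loader(data, batch_size=3, shuffle=False))
print(len(batches))  # -> 3: the last, incomplete batch of 1 sample is dropped
```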
Defining the Loss Function and Optimizer
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
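nn.CrossEntropyLoss takes the raw 10-dimensional scores (logits) from OUT together with the integer class label, and computes log-softmax followed by negative log-likelihood. A hand-computed per-sample sketch, using made-up 3-class logits:

```python
import math

def cross_entropy(logits, target):
    """-log(softmax(logits)[target]), as nn.CrossEntropyLoss computes per sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

logits = [2.0, 0.5, 0.1]       # hypothetical raw scores for 3 classes
loss = cross_entropy(logits, 0)  # label is class 0
print(round(loss, 4))
```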
Enabling Gradients
torch.set_grad_enabled(True)
model.train()
Using CUDA Acceleration
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
Training
for epoch in range(Epoch):
    for step, data in enumerate(train_loader):
        x, y = data
        optimizer.zero_grad()
        y_pred = model(x.to(device, torch.float))
        loss = loss_function(y_pred, y.to(device, torch.long))
        loss.backward()
        optimizer.step()
Saving the Model
torch.save(model, './LeNet.pkl')
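Note that torch.save(model, ...) pickles the whole module, so the file can only be loaded where the exact class definition is available. A widely used alternative is to save only the parameters via state_dict and load them into a freshly built model; a minimal sketch, using a stand-in module instead of LeNet to stay self-contained:

```python
import os
import tempfile

import torch
import torch.nn as nn

net = nn.Linear(4, 2)  # stand-in for LeNet, purely for illustration
path = os.path.join(tempfile.gettempdir(), 'demo_state.pth')
torch.save(net.state_dict(), path)       # save parameters only

net2 = nn.Linear(4, 2)                   # rebuild the architecture first
net2.load_state_dict(torch.load(path))   # then restore the parameters
assert torch.equal(net.weight, net2.weight)
```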
Visualizing the Training Process
for epoch in range(Epoch):
    running_loss = 0.0
    acc = 0.0
    for step, data in enumerate(train_loader):
        # ... training step as above, producing loss and pred ...
        running_loss += float(loss.data.cpu())
        acc += (pred.data.cpu() == y.data).sum()
        if step % 100 == 99:
            loss_avg = running_loss / (step + 1)
            acc_avg = float(acc / ((step + 1) * batch_size))
            print('Epoch', epoch + 1, ',step', step + 1, '| Loss_avg: %.4f' % loss_avg, '|Acc_avg:%.4f' % acc_avg)
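The running averages printed here are just cumulative sums divided by the number of steps (for the loss) or samples (for the accuracy) seen so far in the epoch. A pure-Python sketch of the same bookkeeping, with made-up per-step numbers:

```python
losses = [0.9, 0.7, 0.5, 0.4]   # hypothetical per-step loss values
correct = [40, 50, 55, 60]      # hypothetical correct predictions per step
batch_size = 64

running_loss, acc = 0.0, 0
for step, (l, c) in enumerate(zip(losses, correct)):
    running_loss += l
    acc += c
    loss_avg = running_loss / (step + 1)       # mean loss per step so far
    acc_avg = acc / ((step + 1) * batch_size)  # accuracy over all samples seen
print('Loss_avg: %.4f' % loss_avg, '| Acc_avg: %.4f' % acc_avg)
# -> Loss_avg: 0.6250 | Acc_avg: 0.8008
```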
If you have followed along to this point, you can now train the model properly and watch the loss and accuracy change during training. Once training finishes and you obtain the LeNet.pkl file, congratulations: your first model has been trained successfully!
import torch
import torchvision
import torch.nn as nn
import torch.utils.data as Data
from model import LeNet

model = LeNet()
Epoch = 5
batch_size = 64
lr = 0.001
train_data = torchvision.datasets.MNIST(root='./data/', train=True, transform=torchvision.transforms.ToTensor(), download=False)
train_loader = Data.DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=0, drop_last=True)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
torch.set_grad_enabled(True)
model.train()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(Epoch):
    running_loss = 0.0
    acc = 0.0
    for step, data in enumerate(train_loader):
        x, y = data
        optimizer.zero_grad()
        y_pred = model(x.to(device, torch.float))
        loss = loss_function(y_pred, y.to(device, torch.long))
        loss.backward()
        running_loss += float(loss.data.cpu())
        pred = y_pred.argmax(dim=1)
        acc += (pred.data.cpu() == y.data).sum()
        optimizer.step()
        if step % 100 == 99:
            loss_avg = running_loss / (step + 1)
            acc_avg = float(acc / ((step + 1) * batch_size))
            print('Epoch', epoch + 1, ',step', step + 1, '| Loss_avg: %.4f' % loss_avg, '|Acc_avg:%.4f' % acc_avg)

torch.save(model, './LeNet.pkl')
06
Initializing and Importing the Model and Dataset
import torch
import torchvision
import torch.utils.data as Data
test_data = torchvision.datasets.MNIST(root='./data/', train=False, transform=torchvision.transforms.ToTensor(), download=False)
test_loader = Data.DataLoader(test_data, batch_size=1, shuffle=False)
Next, define the device to use:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = torch.load('./LeNet.pkl', map_location=torch.device(device))
Disabling Gradients
torch.set_grad_enabled(False)
net.eval()
Testing and Printing the Results
length = test_data.data.size(0)
acc = 0.0
for i, data in enumerate(test_loader):
    x, y = data
    y_pred = net(x.to(device, torch.float))
    pred = y_pred.argmax(dim=1)
    acc += (pred.data.cpu() == y.data).sum()
    print('Predict:', int(pred.data.cpu()), '|Ground Truth:', int(y.data))
Finally, we compute the model's accuracy on the test set, i.e. the number of correct predictions divided by the size of the test set, and print it. For readability, we can express it as a percentage:
acc = (acc / length) * 100
print('Accuracy: %.2f' % acc, '%')
import torch
import torchvision
import torch.utils.data as Data
from model import LeNet  # the class definition must be importable for torch.load

test_data = torchvision.datasets.MNIST(root='./data/', train=False, transform=torchvision.transforms.ToTensor(), download=False)
test_loader = Data.DataLoader(test_data, batch_size=1, shuffle=False)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = torch.load('./LeNet.pkl', map_location=torch.device(device))
net.to(device)
torch.set_grad_enabled(False)
net.eval()

length = test_data.data.size(0)
acc = 0.0
for i, data in enumerate(test_loader):
    x, y = data
    y_pred = net(x.to(device, torch.float))
    pred = y_pred.argmax(dim=1)
    acc += (pred.data.cpu() == y.data).sum()
    print('Predict:', int(pred.data.cpu()), '|Ground Truth:', int(y.data))

acc = (acc / length) * 100
print('Accuracy: %.2f' % acc, '%')
07
Links
[1] https://www.python.org/
[2] https://www.anaconda.com/
[3] https://developer.nvidia.com/cuda-gpus#collapseOne
[4] https://developer.nvidia.com/cuda-toolkit
[5] https://pytorch.org/
[6] https://pytorch.org/get-started/locally/
[7] https://jingyan.baidu.com/article/d5c4b52b21b63e9b570dc574.html
[8] https://code.visualstudio.com/
[9] https://www.jetbrains.com/pycharm/
[10] https://cuijiahua.com/blog/2018/01/dl_3.html
[11] https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d
[12] https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d
[13] https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU
[14] https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear
[15] https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1YWfieeG1c8w4JkpXBfl-rQ (extraction code: hp2q)
[16] https://pytorch.org/docs/stable/data.html#loading-batched-and-non-batched-data