[Stones from Other Hills] An Overview of PyTorch Mixed Precision Training
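"Stones from other hills can polish jade": only by standing on the shoulders of giants can we see higher and go farther. On the road of research, a favorable wind helps us move faster. To that end, we have collected practical code links, datasets, software, programming tips, and more, and launched the "Stones from Other Hills" column to help you forge ahead. Stay tuned.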
import torchvision
import torch
import torch.cuda.amp
import gc
import time
# Timing utilities
start_time = None


def start_timer():
    global start_time
    gc.collect()
    torch.cuda.empty_cache()
    # reset_peak_memory_stats() supersedes the deprecated reset_max_memory_allocated()
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()  # synchronize so the measured time reflects actual GPU execution
    start_time = time.time()


def end_timer_and_print(local_msg):
    torch.cuda.synchronize()
    end_time = time.time()
    print("\n" + local_msg)
    print("Total execution time = {:.3f} sec".format(end_time - start_time))
    print("Max memory used by tensors = {} bytes".format(torch.cuda.max_memory_allocated()))


num_batches = 50
batch_size = 70
epochs = 3
# Create random training data
data = [torch.randn(batch_size, 3, 224, 224, device="cuda") for _ in range(num_batches)]
targets = [torch.randint(0, 1000, size=(batch_size,), device="cuda") for _ in range(num_batches)]
# Create the model
net = torchvision.models.resnext50_32x4d().cuda()
# Define the loss function
loss_fn = torch.nn.CrossEntropyLoss().cuda()
# Define the optimizer
opt = torch.optim.SGD(net.parameters(), lr=0.001)
# Whether to train with mixed precision
use_amp = True
# Constructs scaler once, at the beginning of the convergence run, using default args.
# If your network fails to converge with default GradScaler args, please file an issue.
# The same GradScaler instance should be used for the entire convergence run.
# If you perform multiple convergence runs in the same script, each run should use
# a dedicated fresh GradScaler instance. GradScaler instances are lightweight.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
start_timer()
for epoch in range(epochs):
    for inputs, target in zip(data, targets):
        # Run the forward pass and the loss computation under autocast.
        with torch.cuda.amp.autocast(enabled=use_amp):
            output = net(inputs)
            loss = loss_fn(output, target)
        # Scale the loss. Calls backward() on scaled loss to create scaled gradients.
        scaler.scale(loss).backward()
        # scaler.step() first unscales the gradients of the optimizer's assigned params.
        # If these gradients do not contain infs or NaNs, optimizer.step() is then called,
        # otherwise, optimizer.step() is skipped.
        scaler.step(opt)
        # Updates the scale for next iteration.
        scaler.update()
        opt.zero_grad(set_to_none=True)  # set_to_none=True here can modestly improve performance
end_timer_and_print("Mixed precision:")
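
The comments in the loop say that the loss is scaled before backward() and that the optimizer step is skipped when the gradients contain infs or NaNs. To see why this matters, the sketch below emulates fp16 rounding with the standard library only, so it runs without a GPU or PyTorch. It is an illustration, not the PyTorch implementation: `to_fp16` and `ToyGradScaler` are hypothetical names, the toy class merges step() and update() into one call, and its default values are assumed to mirror GradScaler's documented defaults.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (inf on overflow)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:  # magnitude beyond the fp16 range (max 65504)
        return math.copysign(math.inf, x)


class ToyGradScaler:
    """Hypothetical stand-in for the scaling policy; not the PyTorch class."""

    def __init__(self, init_scale=65536.0, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        # Defaults assumed to match torch.cuda.amp.GradScaler's documented defaults.
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self.good_steps = 0

    def step(self, scaled_grads, apply_update):
        """Unscale and apply the update only if every scaled gradient is finite."""
        if all(math.isfinite(g) for g in scaled_grads):
            apply_update([g / self.scale for g in scaled_grads])
            self.good_steps += 1
            if self.good_steps == self.growth_interval:
                self.scale *= self.growth_factor  # try a larger scale again
                self.good_steps = 0
            return True
        # Overflow: skip the update and shrink the scale for the next iteration.
        self.scale *= self.backoff_factor
        self.good_steps = 0
        return False


# 1) A tiny gradient underflows to zero in fp16, but survives when scaled first.
tiny_grad = 1e-8
underflowed = to_fp16(tiny_grad)
scaler = ToyGradScaler()
rescued = to_fp16(tiny_grad * scaler.scale) / scaler.scale

# 2) A gradient of 1.0 overflows at the initial scale, so that step is skipped
#    and the scale is halved; the retry at the smaller scale succeeds.
applied = []
first_ok = scaler.step([to_fp16(1.0 * scaler.scale)], applied.append)
second_ok = scaler.step([to_fp16(1.0 * scaler.scale)], applied.append)

print("fp16(1e-8) without scaling:", underflowed)
print("recovered with scaling:", rescued)
print("first step applied:", first_ok, "| second step applied:", second_ok)
print("scale after overflow:", scaler.scale)
```

The skipped step is the reason the training loop calls scaler.step(opt) rather than opt.step() directly: an overflowing iteration costs one update, after which the smaller scale keeps the gradients within fp16 range.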