[Stones from Other Hills] Master a Transformer English-to-Chinese Translation Model with PyTorch!
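"Stones from other hills can polish jade": standing on the shoulders of giants lets you see higher and go farther, and in research a helping wind gets you there faster. With that in mind, we have collected practical code links, datasets, software, programming tips, and more, and launched the "Stones from Other Hills" column to help you keep moving forward. Stay tuned.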
Address: https://www.zhihu.com/people/xia-he-ming-41
01
02
[
['Some analysts argue that the negative effects of such an outcome would only last for “months.”',
'某些分析家认为军事行动的负面效果只会持续短短“几个月”。'],
['The Fed apparently could not stomach the sell-off in global financial markets in January and February, which was driven largely by concerns about further tightening.',
'美联储显然无法消化1月和2月的全球金融市场抛售,而这一抛售潮主要是因为对美联储进一步紧缩的担忧导致的。']
]
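Sort the Chinese-English sentence pairs in the raw corpus by the length of the English sentence, so that the sentences in each batch have similar lengths. Use the trained tokenization models to segment the English and Chinese sentences respectively, then map the tokens to ids with the vocabulary. Add a start symbol and an end symbol to the head and tail of each id sequence, and convert it to a Tensor.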
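The steps above can be sketched in plain Python. This is a minimal illustration, not the original project's code. It assumes tokenization has already produced id lists and that sentence length is measured in ids. The names `make_batches` and `pad`, the special ids `PAD_ID`, `BOS_ID`, `EOS_ID`, and the toy ids are all hypothetical. The sketch stops at padded lists of ids; wrapping each one in `torch.tensor(..., dtype=torch.long)` would give the Tensor.

```python
PAD_ID, BOS_ID, EOS_ID = 0, 2, 3  # hypothetical special-token ids


def pad(seqs, pad_id=PAD_ID):
    """Right-pad every sequence to the length of the longest one."""
    width = max(len(s) for s in seqs)
    return [s + [pad_id] * (width - len(s)) for s in seqs]


def make_batches(pairs, batch_size):
    """pairs: list of (en_ids, zh_ids) as produced by the tokenizers."""
    # Sort by English length so each batch holds similar-length sentences.
    ordered = sorted(pairs, key=lambda p: len(p[0]))
    batches = []
    for i in range(0, len(ordered), batch_size):
        chunk = ordered[i:i + batch_size]
        # Wrap every id sequence in a start symbol and an end symbol.
        src = [[BOS_ID] + en + [EOS_ID] for en, _ in chunk]
        tgt = [[BOS_ID] + zh + [EOS_ID] for _, zh in chunk]
        batches.append((pad(src), pad(tgt)))
    return batches


# Toy id sequences standing in for tokenized sentence pairs.
pairs = [
    ([11, 12, 13, 14], [21, 22]),
    ([15], [23, 24, 25]),
    ([16, 17], [26]),
]
batches = make_batches(pairs, batch_size=2)
```

Because the pairs are sorted before chunking, the two shortest English sentences share the first batch and need only one padding id on the source side.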
03
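To speed up decoding, we reimplemented greedy decode so that it operates on whole batches. transformer-pytorch targets an older PyTorch release, so we also fixed the places where it is incompatible with PyTorch 1.5.1.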
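The idea behind batch-based greedy decoding can be sketched as follows. This is an illustrative sketch only, not the authors' implementation. It is written in plain Python so it runs without dependencies. The function `batch_greedy_decode`, the `step_fn` interface, the special ids, and the toy model `toy_step` are all hypothetical. In a PyTorch version, the per-row argmax would be a single tensor operation over the vocabulary dimension.

```python
def batch_greedy_decode(step_fn, batch_size, max_len,
                        bos_id=2, eos_id=3, pad_id=0):
    """step_fn(prefixes) returns one list of vocabulary scores per prefix."""
    ys = [[bos_id] for _ in range(batch_size)]
    finished = [False] * batch_size
    for _ in range(max_len):
        scores = step_fn(ys)  # a single model call covers the whole batch
        for i, row in enumerate(scores):
            if finished[i]:
                ys[i].append(pad_id)  # keep rows the same length after EOS
                continue
            best = max(range(len(row)), key=row.__getitem__)  # greedy argmax
            ys[i].append(best)
            finished[i] = best == eos_id
        if all(finished):
            break  # stop early once every sentence has emitted EOS
    return ys


def toy_step(prefixes):
    """Hypothetical stand-in for the model: sentence i prefers token 5
    until its prefix reaches length i + 2, then prefers EOS (id 3)."""
    rows = []
    for i, prefix in enumerate(prefixes):
        row = [0.0] * 6
        row[5 if len(prefix) < i + 2 else 3] = 1.0
        rows.append(row)
    return rows


decoded = batch_greedy_decode(toy_step, batch_size=2, max_len=10)
```

The speedup comes from calling the model once per time step for all sentences, instead of once per time step per sentence. The `finished` flags let sentences that have already ended ride along as padding until the last one emits its end symbol.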
04
05
06
07
08
09
Note: the following three cases are all translation results from the best-trained PyTorch model (i.e., Model 2).
10