Hands-On | A Practical Multimodal Neural Language Model in Python/NumPy (GitHub Link Included)
Source: GitHub
Translated by: Zhang Nina
This is an implementation of "Multimodal Neural Language Models" (Kiros et al., ICML 2014) in basic NumPy, covering both the additive and the multiplicative log-bilinear image caption generators. Unlike most other image caption generators, these models do not use recurrent neural networks.
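To make the additive model concrete, here is a minimal NumPy sketch of its forward pass: transformed context word embeddings and a projected image feature are summed into a predicted representation, which is then scored against every word in the vocabulary. The sizes, parameter names, and random initialization are illustrative assumptions for this sketch, not the repository's actual implementation.

import numpy as np

# Illustrative sizes (assumptions, not taken from the repo):
# vocabulary V, word dimension D, image feature dimension K, context length C
V, D, K, C = 5000, 100, 4096, 5
rng = np.random.RandomState(0)
R = rng.randn(V, D) * 0.01          # word representations (also used to score the output)
Cmats = rng.randn(C, D, D) * 0.01   # one context transformation per position
M = rng.randn(K, D) * 0.01          # projects the image feature into the word space
b = np.zeros(V)                     # per-word bias

def next_word_probs(context_ids, image_feat):
    # Predicted representation: transformed context embeddings plus the additive image term
    r_hat = sum(Cmats[i].dot(R[w]) for i, w in enumerate(context_ids))
    r_hat += image_feat.dot(M)
    scores = R.dot(r_hat) + b       # bilinear score for every vocabulary word
    scores -= scores.max()          # numerical stability for the softmax
    p = np.exp(scores)
    return p / p.sum()

probs = next_word_probs([12, 7, 301, 4, 55], rng.randn(K))  # five context words, one image feature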
If you are looking for a simple, bare-bones image caption generator that can be trained on a CPU, this code may be useful to you; it can also serve teaching purposes. Parts of this code were used in an assignment for an undergraduate neural networks course at the University of Toronto.
On the MSCOCO dataset with VGG19 features, a single model achieves a BLEU-4 score of about 25, and an ensemble reaches nearly 27. For comparison, a "Show and Tell" LSTM with the same features scores just over 27, while the state of the art is around 34, so these models are well behind the current state of the art. This code is released as part of my PhD thesis.
Visualization
Here are results on 1000 images using an ensemble of additive log-bilinear models trained using this code.
Dependencies
This code is written in Python. To use it, you will need:
Python 2.7
A recent version of NumPy (http://www.numpy.org/) and SciPy (http://www.scipy.org/)
Quickstart for Toronto users
To train an additive log-bilinear model with the default settings, open IPython and run the following:
import coco_proc, trainer
z, zd, zt = coco_proc.process(context=5)  # preprocess the MSCOCO data with a context size of 5
trainer.trainer(z, zd)                    # train an additive model with the default settings (see trainer.py)
This will store trained models in the models directory and periodically compute BLEU using the Perl code and reference captions in the gen directory. All hyperparameter settings can be tuned in trainer.py. Links to the MSCOCO data are in config.py.
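If you want to sanity-check BLEU numbers outside the Perl pipeline, a plain-Python approximation of corpus-level BLEU-4 (equal n-gram weights plus the brevity penalty) looks roughly like the sketch below. It is a generic re-implementation for illustration, not the script the repository calls, and its numbers may differ slightly from the Perl output.

import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def corpus_bleu4(hypotheses, references):
    # hypotheses: list of token lists; references: list of lists of token lists
    num, den = [0] * 4, [0] * 4
    hyp_len, ref_len = 0, 0
    for hyp, refs in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]  # closest reference length
        for n in range(1, 5):
            hyp_counts = Counter(ngrams(hyp, n))
            max_ref = Counter()
            for r in refs:
                for g, c in Counter(ngrams(r, n)).items():
                    max_ref[g] = max(max_ref[g], c)
            num[n - 1] += sum(min(c, max_ref[g]) for g, c in hyp_counts.items())  # clipped n-gram matches
            den[n - 1] += sum(hyp_counts.values())
    if min(num) == 0 or min(den) == 0:
        return 0.0
    log_prec = sum(math.log(num[i] / den[i]) for i in range(4)) / 4.0
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / float(hyp_len))  # brevity penalty
    return bp * math.exp(log_prec)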
Getting started
You will first need to download the pre-processed MSCOCO data. All necessary files can be downloaded by running:
wget http://www.cs.toronto.edu/~rkiros/data/mnlm.zip
After unpacking, open config.py and set the paths accordingly. Then you can proceed with the quickstart instructions. All training settings can be found in trainer.py. Testing trained models is done with tester.py. The lm directory contains classes for the additive and multiplicative log-bilinear models. Helper functions, such as beam search, are found in the utils directory.
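As a rough illustration of what the beam search helper does, the sketch below keeps the beam_width most probable partial captions at each step and extends them using a next-word distribution. The next_word_probs(context_ids, image_feat) interface, the start/end token ids, and the length normalization are assumptions made for this sketch; the actual code in utils may differ.

import numpy as np

def beam_search(next_word_probs, image_feat, beam_width=5, max_len=20, bos_id=1, eos_id=0):
    # Each beam is a (token sequence, accumulated log-probability) pair
    beams = [([bos_id], 0.0)]
    completed = []
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            probs = next_word_probs(seq, image_feat)            # distribution over the vocabulary
            for w in np.argsort(probs)[-beam_width:]:           # best continuations of this beam
                cand = (seq + [int(w)], logp + np.log(probs[w] + 1e-12))
                (completed if w == eos_id else candidates).append(cand)
        if not candidates:                                      # every beam has ended
            break
        beams = sorted(candidates, key=lambda c: c[1])[-beam_width:]  # prune to the top beams
    completed.extend(beams)
    return max(completed, key=lambda c: c[1] / len(c[0]))       # length-normalized best caption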
Reference
If you found this code useful, please cite the following paper:
Ryan Kiros, Ruslan Salakhutdinov, Richard S. Zemel. "Multimodal Neural Language Models." ICML (2014).
@inproceedings{kiros2014multimodal,
title={Multimodal Neural Language Models.},
author={Kiros, Ryan and Salakhutdinov, Ruslan and Zemel, Richard S},
booktitle={ICML},
volume={14},
pages={595--603},
year={2014}
}
License
Apache License 2.0
(http://www.apache.org/licenses/LICENSE-2.0)