
环球•译事 | New York University Researcher: Why Has Neural Machine Translation Become Mainstream?

2017-11-15 Yee君 译·世界



Neural machine translation (NMT) is now mainstream. This was New York University Assistant Professor Kyunghyun Cho’s first message during his presentation on NMT at the recent SlatorCon New York on October 12, 2017.

When Cho’s team started looking into NMT in 2013 and 2014, he said previous MT researchers and industry insiders were convinced it would not work. Efforts in the 1980s and mid-1990s failed, after all.


Fast forward to 2017, and Cho pointed out that big names like Google, Microsoft, and Facebook use NMT, while sites like Booking.com and even the European Patent Office have also caught the NMT bug.


“So it’s mainstream,” Cho concluded. He added, though, that research is still ongoing, even though existing NMT systems already outperform statistical models that have been refined for over ten years.

“Somehow Nobody has Tried It”


The key difference lies in how Cho and his fellow researchers approached the problem. “So far a lot of the research on machine translation has been focused on sub-word level translation,” Cho said. “That is, looking at a sentence as a sequence of sub-words.”

Cho and his co-researchers decided to go down to character-level modelling.

“In 2016 we decided to try it out; somehow nobody has tried it,” he said. “When a new technology comes in, what everyone tries to do is use the new technology to mimic what you were able to do with the old technology. So everyone was stuck with morpheme-level or word-level modelling and then somehow forgot to try this new technology on new ways of representing a sentence, that is, view it as a sequence of characters.” And the results were telling.
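To make the difference in representation concrete, here is a minimal Python sketch. It is not Cho's model; it only contrasts splitting a sentence into words with splitting it into characters. The tiny training set and the helper names (`word_level`, `char_level`, `unknown_units`) are made up for illustration.

```python
def word_level(sentence):
    """View a sentence as a sequence of whitespace-separated words."""
    return sentence.split()


def char_level(sentence):
    """View a sentence as a sequence of characters (spaces included)."""
    return list(sentence)


def unknown_units(units, vocab):
    """Return the units that never appeared in the training vocabulary."""
    return [u for u in units if u not in vocab]


# Hypothetical toy training data, for illustration only.
training_sentences = ["the translation is good", "the model is fast"]
word_vocab = {w for s in training_sentences for w in word_level(s)}
char_vocab = {c for s in training_sentences for c in char_level(s)}

# A test sentence containing a misspelling ("translaton").
test_sentence = "the translaton is good"
unknown_words = unknown_units(word_level(test_sentence), word_vocab)
unknown_chars = unknown_units(char_level(test_sentence), char_vocab)

print(unknown_words)  # the misspelled word is out of vocabulary
print(unknown_chars)  # every character was seen during training
```

At word level the misspelled token is unknown to the vocabulary, whereas at character level every unit has been seen before. A real character-level system still has to learn what to do with those characters; the sketch shows only the input representation.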

Record Breaking


“This model beats any single paired model you can think of,” Cho said, reporting that the NMT system performed on par with, and often better than, existing MT models when assessed through BLEU (bilingual evaluation understudy) scores or even human evaluation.
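For readers unfamiliar with BLEU, the following is a simplified Python sketch of the idea: the geometric mean of modified n-gram precisions (up to 4-grams), multiplied by a brevity penalty. It assumes a single reference, whitespace tokenization, and no smoothing, so it is not the exact scoring setup behind the results Cho reported. The function names are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: one reference, no smoothing."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))
print(bleu("the cat sat on the red mat", reference))
```

An exact match scores 1.0, and the candidate with one extra word scores lower. Production evaluations typically compute BLEU over a whole test corpus rather than sentence by sentence.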


Cho also highlighted some other advantages to NMT aside from better quality, such as its robust handling of spelling mistakes and morphology. Another pleasant surprise was how the NMT system can translate into compound words that rarely appear in a training corpus the size of 100 million words.


One breakthrough in particular was quite promising: the NMT system can translate into a desired target language even without knowing the source language.


Cho’s team trained their NMT system to translate from German, Czech, Finnish, and Russian to English. They then tasked the system to translate any given sentence into English without providing a language identifier.


“The decoder doesn’t care which source language it was written in, it’s just going to translate into the target language,” Cho said. “Now, since our model is actually of the same size as before, we are saving around four times the parameters. Still, we get the same level or better performance.”
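The saving Cho describes comes down to simple arithmetic, sketched below in Python. The parameter count per model is a hypothetical placeholder, not a figure from the talk; the only facts taken from the talk are the four source languages and that the shared model is the same size as a single-pair model.

```python
# Hypothetical size of one single-pair model; not a number from the talk.
PARAMS_PER_MODEL = 50_000_000

# The four source languages mentioned in the talk, all translated into English.
source_languages = ["German", "Czech", "Finnish", "Russian"]

# One separate model per source language versus one shared model of the same size.
separate_total = PARAMS_PER_MODEL * len(source_languages)
shared_total = PARAMS_PER_MODEL

ratio = separate_total / shared_total
print(f"Separate models: {separate_total:,} parameters")
print(f"Shared model:    {shared_total:,} parameters")
print(f"The shared model uses 1/{ratio:.0f} of the parameters")
```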


They took the experiment a step further and fed the system a sentence written in three different languages. The system did the translation without any external indication of which part of the sentence was written in which language, showing that the model automatically learns how to handle code-switching within a sentence.


Finally, Cho touched on low resource languages. What his team and other NMT research teams across the globe have found is that as their system learns shared similarities across languages, it can actually apply learnings from high resource languages to low resource ones and improve their translation.


The Future is “Extremely Fast-Moving”


Cho saved the cutting edge for last: non-parametric NMT. He says this system translates the way a human translator would: by leveraging translation memory (TM) as an on-the-fly training set.


This way, the NMT system acts like a translator and does not need an entire training corpus in its database, but instead accesses relevant TMs to translate. Cho commented that this system actually displays higher consistency in style and vocabulary choice.
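The retrieval step of such a setup can be sketched in Python as follows. This is only an illustration of looking up relevant TM entries for a new sentence. It uses plain fuzzy string matching from the standard library, which is an assumption made here and not the retrieval method of Cho's system, and it leaves out the neural model that would consume the retrieved pairs. The TM entries and the `retrieve` helper are made up.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: (source sentence, stored translation) pairs.
TRANSLATION_MEMORY = [
    ("Das Wetter ist heute schön.", "The weather is nice today."),
    ("Wo ist der Bahnhof?", "Where is the train station?"),
    ("Das Wetter war gestern schlecht.", "The weather was bad yesterday."),
]


def retrieve(source, memory, top_k=2):
    """Return the top_k TM entries whose source side is most similar to `source`."""
    scored = [
        (SequenceMatcher(None, source, tm_source).ratio(), tm_source, tm_target)
        for tm_source, tm_target in memory
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# The retrieved pairs would then be handed to the translation model as context.
matches = retrieve("Das Wetter ist heute schlecht.", TRANSLATION_MEMORY)
for score, tm_source, tm_target in matches:
    print(f"{score:.2f}  {tm_source}  ->  {tm_target}")
```

Because the translator model sees previously approved translations of similar sentences, reusing their wording is one plausible reason for the consistency in style and vocabulary that Cho mentions.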


Finally, Cho closed his presentation on state-of-the-art NMT by explaining the future direction of NMT research.


First, low resource language translation is a priority. Second, he said there is already some body of work on zero resource translation. The third and last direction is better handling of Chinese, Japanese, and Korean translation.


Later on in the panel session, Cho fielded a question about the biggest challenge in NMT.


He said hundreds of people have been working on MT for over 30 years, and research on NMT has been going on for about three years. “It’s only the apparent disruption you see,” Cho said, explaining that it will be hard to tell what kind of disruption will result from incremental advances in research.


“Even if I can tell you the challenges that I’m working on at the moment, that probably won’t tell you or anybody how the next disruption is going to happen,” he said.


Pondering how fast these breakthroughs make it to market, May Habib, CEO of Qordoba, asked after the presentation how long it takes to go from a research breakthrough to deployment in the field.

Cho pointed out that they published their first paper on NMT in 2015, and the first big commercial announcement regarding application was from Google Translate in September 2016. He added that though Google did not disclose details of their deployment, Facebook still managed to launch their own NMT system a year later.


“It’s an extremely fast-moving field in that every time there is some change, we see the improvement,” Cho said. “So you gotta stay alert.”



neural machine translation (NMT)  神经网络机器翻译

the European Patent Office  欧洲专利局

statistical model  统计模型

sub-word level  子词级

character-level  字符级

morphology  词法

single paired model  单一配对模型

compound words  合成词

source language  源语言

target language  目标语;目的语

decoder  解码器;译码器;译码员

parameter  参数;系数;参量

code-switching  代码转换

low resource languages  低资源语言

high resource languages  高资源语言

zero resource translation  零资源翻译

on-the-fly  〈非正式〉匆匆忙忙地;在空中;飞行中;[计] (计算机)运行中,动态的

training set  教练组;[计] 训练集;训练区

translation memory (TM)  翻译记忆

corpus  [计] 语料库;文集;本金




