
When Humans Speak, Even God Is Afraid: Glimpsing the Human Mind Through Linguistics

Zhang Xiaoyu (张晓雨) and Wang Xizun (王锡尊) · 全球知识雷锋 · 2021-06-17


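Why did we evolve a mouth and throat that leave us vulnerable to choking? A plausible hypothesis is that it is a compromise struck in the course of evolution to allow us to speak.


Charles Darwin wrote: “Man has an instinctive tendency to speak, as we see in the babble of our young children, while no child has an instinctive tendency to bake, brew, or write.”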


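This is the 62nd lecture in the 全球知识雷锋 series.

The lecture was given in September 2011 at The Floating University by Steven Pinker, professor of psychology at Harvard University, under the title Linguistics as a Window to Understanding the Brain. It was summarized by Zhang Xiaoyu of Southeast University and Wang Xizun of Harbin Institute of Technology, and annotated and recommended by Yang Yu'an, a PhD student in linguistics at the University of Maryland.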






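Note-taker: Zhang Xiaoyu (张晓雨)

Third-year undergraduate in landscape architecture, Southeast University




Note-taker: Wang Xizun (王锡尊)

Third-year undergraduate in landscape architecture, Harbin Institute of Technology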





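Speaker: Steven Pinker

Canadian-American psychologist, cognitive scientist, linguist, and popular science writer; professor at Harvard University. His best-known books include The Language Instinct and How the Mind Works. In 2004 TIME named him one of the 100 most influential people in the world, and he ranked third in Prospect magazine's “World Thinkers 2013” list.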


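Recommended by: Yang Yu'an (杨雨岸)

Undergraduate degree from the Department of Foreign Languages at Tsinghua University; currently a PhD student in linguistics at the University of Maryland.

She studies child language acquisition, with a focus on theoretical semantics and the acquisition of semantics.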





Recommendation

This lecture is recommended by Yang Yu'an, a PhD student in linguistics at the University of Maryland.


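Every time I go through immigration, the border officer sees that my field is linguistics and asks, “How many languages do you speak?” The question always embarrasses me. First, I actually speak only two languages. Second, studying linguistics is not the same as learning languages: linguistics is more like writing a grammar handbook, one that captures the most linguistic phenomena with the simplest rules while leaving the fewest exceptions.


Why insist on simplicity? This brings us to what Professor Pinker describes in this lecture as the most important question in linguistics: how do children acquire their native language in so short a time from so little input? Countless scholars have tried to show that the problem does not exist, yet 43 years after Chomsky posed it, its validity still cannot be fully denied.


Here is a simple example. Consider the following four sentences.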

(1) Mary gave John a book.

(2) Mary gave a book to John.

(3) *Mary donated John a book. [1]

(4) Mary donated a book to John.

[1] In linguistics, ungrammatical sentences are marked with an asterisk.


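Sentences (1), (2), and (4) are grammatical English, whereas (3) is not. Give and donate mean much the same thing, so why can give appear both in the frame __ someone something and in the frame __ something to someone, while donate can appear only in the second?


You may remember this from school: whenever we asked an English teacher “why,” the answer was always “it is a fixed collocation.” That is the language learner's mindset: when a “why” comes up, memorizing the usage matters most, since most people learn a language in order to communicate.


Linguistics, by contrast, digs down to the root. If children encounter a verb like give and remember its two uses, will they apply both uses when they meet donate? If they will not, how did they learn that donate differs from give? Worse still, if they never come across the word donate in their whole lives, does their grammar simply lack this regularity? One of Professor Pinker's earliest findings was that verbs like give are of Germanic origin while verbs like donate come from Latin, and that native English speakers can tell the two classes apart even though they cannot say why. Like every research result, this one has been challenged repeatedly by later scholars, but in the process our understanding of English verbs has grown ever deeper (Pinker 1989; Tomasello 1992; for a more recent discussion of the debate, see Yang 2016).


This lecture by Professor Pinker should also help you see why “linguistics” is not (entirely) the same as “learning languages.”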


References for the recommendation:

Pinker, S. (1989) Learnability and cognition: The acquisition of argument structure. MIT Press.

Tomasello, M. (1992) First verbs: A case study of early grammatical development. Cambridge University Press.

Yang, C. (2016) The Price of Linguistic Productivity: How Children Learn to Break the Rules of Language. MIT Press.



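Reader's Guide

You use language every day, but have you ever wondered why you speak differently on different occasions and to different people? How do humans understand language? How is language processed in the brain? What stages did your comprehension of language pass through between birth and adulthood? These are all questions that linguists study, and microlinguistics (phonology, lexicology, syntax, semantics, and so on) is the foundation for studying these larger questions.


The study of language also has many practical applications. Do you know how Google Translate works? It owes a debt to the syntactic structures studied by Chomsky, an example of linguistics combined with computer science.


Why are there books you simply cannot get through, and why do some literary works leave you feeling “terror” or “joy”? An author's choice of words and sentence structures produces different feelings in the reader, an example of linguistics combined with literature.


These are only a small part of what linguistics is used for. It also combines with anthropology, sociology, semiotics, and translation, and with medicine in the treatment of aphasia. Wherever there is language, linguistics is almost always nearby. The point of linguists' research is to uncover the internal regularities that you use all the time without being aware of them, and thereby to guide human progress.


Language is endlessly fascinating. It is the most distinctive human gift and a window into human nature; above all, its vast expressive power is one of the wonders of the natural world.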





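Main Text

Steven Pinker


My name is Steven Pinker, and I am a professor of psychology at Harvard University. Today I am going to talk to you about language. I am actually not a linguist but a cognitive scientist. My interest in language is not in language for its own sake but as “a window into the human mind.”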


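In the human sciences, language is a fundamental topic. It is the trait that most conspicuously distinguishes humans from other species, and it is the basis of cooperation. By sharing our knowledge and coordinating our actions with words, humans have accomplished astonishing things. Language also raises a series of scientific questions: how did it evolve in this one species, and how does the brain process it? And it has many practical applications, which is no surprise given how central it is to human life.


Language comes to us so easily and naturally that we forget how miraculous it is. But think about what you will be doing for the next hour: listening patiently while a man makes noises as he exhales.


Why would you do that? Not because my voice is especially pleasant, but because I have encoded information into that jumble of sounds.


The ability to extract information from this stream of noise is what lets us share ideas. Right now the idea we are sharing is “language,” but with slightly different sounds I could make you think of anything from your favorite reality show to the origin of the universe.


This, to me, is the miracle of language: its vast expressive power. Even after 35 years of studying language, that power still amazes me. It is also the main phenomenon that the science of language aims to explain.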



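Linguistics: The Science of Language

Language is without doubt central to human life. Humans have achieved so much precisely because we exchange knowledge and intentions through the medium of language. Language is not peculiar to any one culture; it has been found in every society that anthropologists have studied. There are about 6,000 languages on Earth, and every one of them is complex: no human society has been discovered that lacks a complex language.


For this and other reasons, Charles Darwin wrote: “Man has an instinctive tendency to speak, as we see in the babble of our young children, while no child has an instinctive tendency to bake, brew, or write.”


Charles Robert Darwin, English biologist and founder of the theory of evolution.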


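Language is a complex talent, and linguistics is accordingly a complex discipline. It includes the study of language itself: grammar [2], phonology [3], semantics [4], and pragmatics [5].

[2] Grammar: the rules for combining words, phrases, and sentences.

[3] Phonology: the study of sound systems.

[4] Semantics: the study of meaning.

[5] Pragmatics: the study of how language is used in conversation.


Beyond language itself, scientists interested in linguistics also study psycholinguistics [6], language acquisition [7], and neurolinguistics [8].

[6] Psycholinguistics: how language is processed in real time.

[7] Language acquisition: how children come to master a language.

[8] Neurolinguistics: how language is processed in the brain.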




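What Language Is Not

Before we begin, take care not to confuse language with three other things that are closely related to it.


· Writing systems ≠ language


Unlike spoken language, which runs through the whole of human history, writing systems are very recent, only about 5,000 years old. And alphabetic writing, in which one letter stands for one vowel or consonant, appears to have been invented only once in human history, about 3,700 years ago by the Canaanites. As Darwin noted, children have no instinctive tendency to write; they have to learn it through schooling.


· Proper grammar ≠ language


Linguists distinguish two kinds of grammar: descriptive grammar [9] and prescriptive grammar [10]. A little secret of linguistics is that the two are not only different, but a good many prescriptive rules make no sense at all.

[9] Descriptive grammar: the conventions by which people actually speak.

[10] Prescriptive grammar: rules about how people ought to put language together in careful writing.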


Take one of the most famous of these rules, the rule not to split infinitives. According to this rule, Captain Kirk made a grievous grammatical error when he said that the mission of the Enterprise was “to boldly go where no man has gone before.” He should have said, according to these editors, “to go boldly where no man has gone before,” which immediately clashes with the rhythm and structure of ordinary English.



In fact, this prescriptive rule was based on a clumsy analogy with Latin, where you can’t split an infinitive because it’s a single word, as in facere [11], “to do.” Julius Caesar [12] couldn’t have split an infinitive if he wanted to. That rule was translated literally over into English, where it really should not apply.

[11] facere: Latin for “to do.”

[12] Julius Caesar: an outstanding general and statesman of the late Roman Republic.



Another famous prescriptive rule is that one should never use a so-called double negative [13]. Mick Jagger should not have sung “I can’t get no satisfaction”; he really should have sung “I can’t get any satisfaction.” Now this is often promoted as a rule of logical speaking, but “can’t” and “any” is just as much of a double negative as “can’t” and “no.” The only reason that “can’t get any satisfaction” is deemed correct and “can’t get no satisfaction” is deemed ungrammatical is that the dialect of English spoken in the south of England in the 17th century used “can’t … any” rather than “can’t … no.” If the capital of England had been in the north of the country instead of the south, then “can’t get no” would have been correct and “can’t get any” would have been deemed incorrect.

[13] Double negative: also called negative concord.



There’s nothing special about a language that happens to be chosen as the standard for a given country. In fact, if you compare the rules of languages and so-called dialects, each one is complex in different ways. Take, for example, African-American Vernacular English [14]. There is a construction in African-American English where you can say “He be workin’,” which is not an error or bastardization or corruption of Standard English, but in fact conveys a subtle distinction, one that’s different from simply “He workin’.” “He be workin’” means that he is employed, he has a job; “He workin’” means that he happens to be working at the moment that you and I are speaking.

[14] African-American Vernacular English is also called Black English or Ebonics.

This is a tense distinction that African-American English makes and Standard English does not, which shows that dialects are no less subtle or complex than the standard language. There are many more examples like it.




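· Thought ≠ language


Many people say that they think in language, but psychological research shows that many thoughts do not take the form of sentences.


Consider how babies (and other animals) communicate without speech. We know from experiments that nonlinguistic creatures [15] have subtle cognitive abilities: even without the help of language, they can register things such as cause and effect and intentions.

[15] Nonlinguistic creatures: for example, babies who cannot yet talk, or other animals.

(Further reading: 一岁幼童也会推理:逻辑产生或与语言无关 | 前沿)



There are also kinds of thinking that do not involve language. Even in adults who can talk, a great deal of thinking does not take place in words. Visual imagery is one example. Given two drawings of three-dimensional shapes built from cubes, are the shapes the same? You do not solve the problem by describing the cubes in words; you mentally rotate one shape to see whether it matches the other. That is a kind of non-linguistic thinking.



Another major finding of cognitive psychology is that we use tacit knowledge to understand language and remember the gist. When you understand a passage, you do not memorize it word for word; long-term memory stores the gist or content, not the exact wording. For example, I am sure that you remember some of what I have said over the past ten minutes, but if I asked you to reproduce any sentence of mine verbatim, hardly anyone could do it. What is held in memory is far more abstract than the exact sentences; we can call it meaning, or semantics.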


In fact, the exact wording is only the outermost layer. When we understand a sentence, we go through a very rapid, unconscious, nonlinguistic process, and it is this process that gives the words their meaning.


I’ll illustrate this with a classic bit of poetry, the lines from the shampoo bottle: “Wet hair, lather, rinse, repeat.” In understanding that very simple snatch of language, you have to know, for example, that when you repeat, you don’t wet your hair a second time because it’s already wet, and that when you get to the end and see “repeat,” you don’t keep repeating over and over in an infinite loop; “repeat” here means “repeat just once.” This tacit knowledge of what the writers of that bit of language had in mind is necessary to understand language, but it is not itself language.


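Finally, to take a step back: if people could not think without language, where would language have come from? Assuming that language is thought leads to exactly this paradox. After all, English was not designed by some committee of Martians who then came to Earth to hand it to us. Rather, language is a grassroots phenomenon. It is, you might say, the original Wikipedia: it pools the creations of hundreds and thousands of people, including jargon, slang, and new constructions, and these accumulate as people look for new ways of expressing their thoughts. That is how language comes about.




None of this is to deny that language can influence thought. Linguists have long been interested in the linguistic relativity hypothesis [16], the idea that language can affect thought. The status of that hypothesis is still quite controversial, but no one believes that language is the same thing as thought or that our mental life consists of reciting sentences.

[16] The linguistic relativity hypothesis is also known as the Sapir-Whorf hypothesis, after the two linguists who first formulated it.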



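What Language Is

Now that we are clear about what language is not, let us turn to what language is, starting with how it works.



In a nutshell, you can divide language into three parts.


The first part is words. As the basic components of sentences, they are stored in long-term memory, in what we can call the mental lexicon or mental dictionary.


The second part is rules, something like recipes or algorithms, which let us assemble bits of language into more complex stretches of language. The rules include syntax [17], morphology [18], and phonology [19].

[17] Syntax: builds words into phrases and phrases into complete sentences.

[18] Morphology: adds prefixes and suffixes to simple roots to form more complex words.

[19] Phonology: combines vowels and consonants into the smallest words.


The third part is interfaces. All of this linguistic knowledge is connected to the world through interfaces, which let us understand and produce speech and so communicate with one another.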


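· Words


Let us start with words. More than 100 years ago, the Swiss linguist Ferdinand de Saussure called attention to the arbitrariness of the sign, and with it set out the basic principle behind words.


Ferdinand de Saussure, Swiss linguist, founder of structuralism, and a founding figure of modern linguistic theory.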


Take for example the word “duck.” The word “duck” doesn’t look like a duck or walk like a duck or quack like a duck, but I can use it to get you to think the thought of a duck because all of us at some point in our lives have memorized that brute-force association between that sound and that meaning, which means that it has to be stored in memory in some format, in a very simplified form. An entry in the mental lexicon might look something like this: there is a symbol for the word itself, some kind of specification of its sound, and some kind of specification of its meaning.

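One remarkable fact about the mental lexicon is how capacious it is. Using dictionary sampling techniques, in which you test people on the meanings of a sample of words, correct for guessing, and multiply by the size of the dictionary, a typical high-school graduate turns out to know about 60,000 words. That works out to a rate of roughly one new word every two hours, starting from the age of one. Think of all the phone numbers and historical dates you can never remember, and you get a sense of the enormous capacity of long-term memory for the sounds and meanings of words, which are every bit as arbitrary.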
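The “one new word every two hours” figure can be checked with a rough back-of-the-envelope calculation. In the sketch below, only the 60,000-word estimate and the starting age of one come from the lecture; the graduation age of 18 and the 16 waking hours per day are assumptions added here for illustration.

```python
# Rough check of the learning rate implied by a 60,000-word vocabulary.
WORDS_KNOWN = 60_000        # estimate quoted in the lecture
START_AGE = 1               # learning starts at about age one (from the lecture)
GRADUATION_AGE = 18         # assumption: typical high-school graduation age
WAKING_HOURS_PER_DAY = 16   # assumption: hours awake per day

days_of_learning = (GRADUATION_AGE - START_AGE) * 365
words_per_day = WORDS_KNOWN / days_of_learning
hours_per_word = WAKING_HOURS_PER_DAY / words_per_day

print(f"{words_per_day:.1f} words per day")
print(f"one new word every {hours_per_word:.1f} waking hours")
```

Under these assumptions the rate comes out a little under two waking hours per word, in line with the figure given in the lecture.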


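· Rules: grammar


Of course, we do not just utter single words; we combine them into phrases and sentences. That brings us to the second major component of language: grammar.


The modern study of grammar is inseparable from the contributions of one linguist, the famous scholar Noam Chomsky, who has shaped the agenda of the field for the past 60 years.


Noam Chomsky, professor emeritus at the Massachusetts Institute of Technology, often cited as the eighth most cited scholar of all time.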




To begin with, Chomsky noted that the main puzzle that we have to explain in understanding language is creativity or, as linguists often call it, productivity: the ability to produce and understand new sentences.

Except for a small number of clichéd formulas, just about any sentence that you produce or understand is a brand-new combination, produced for the first time perhaps in your life, perhaps even in the history of the species. We have to explain how people are capable of doing it.


It shows that when we know a language, we haven’t just memorized a very long list of sentences, but rather have internalized a grammar or algorithm or recipe for combining elements into brand-new assemblies. For that reason, Chomsky has insisted that linguistics is really properly a branch of psychology and is a window into the human mind.


A second insight is that languages have a syntax which can’t be identified with their meaning. Now, the only quotation that I know of from a linguist that has actually made it into Bartlett’s Familiar Quotations is the following sentence from Chomsky, from 1956: “Colorless green ideas sleep furiously.” Well, what’s the point of that sentence? The point is that it is very close to meaningless. On the other hand, any English speaker can instantly recognize that it conforms to the patterns of English syntax. Compare, for example, “furiously sleep ideas dream colorless,” which is also meaningless, but which we perceive as a word salad [20].

[20] Word salad: speech that lacks coherence and logic in both content and structure, so that not only are the sentences unconnected but fine-grained grammatical structure also breaks down, leaving a pile of unrelated words.


A third insight is that syntax doesn’t consist of a string of word-by-word associations, as in stimulus-response theories in psychology, where producing a word is a response which you then hear and which becomes a stimulus to producing the next word, and so on.


Again, the sentence “colorless green ideas sleep furiously” can help make this point. If you look at the word-by-word transition probabilities in that sentence, for example colorless and then green, how often have you heard colorless and green in succession? Probably zero times. Green and ideas, those two words never occur together; likewise ideas and sleep, sleep and furiously. Every one of the transition probabilities is very close to zero; nonetheless, the sentence as a whole can be perceived as a well-formed English sentence.
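One way to make “transition probability” concrete is to count adjacent word pairs (bigrams) in a body of text. The sketch below is purely illustrative: the tiny corpus is invented here and is not from the lecture, and a real estimate would need a large corpus and some form of smoothing.

```python
from collections import Counter

# Toy corpus, invented for illustration; "." marks sentence boundaries.
corpus = (
    "green leaves fall . new ideas spread . "
    "children sleep soundly . he shouted furiously ."
).split()

# Count how often each word is immediately followed by each other word.
bigram_counts = Counter(zip(corpus, corpus[1:]))

sentence = "colorless green ideas sleep furiously".split()
pairs = list(zip(sentence, sentence[1:]))
counts = [bigram_counts[pair] for pair in pairs]

for pair, count in zip(pairs, counts):
    print(pair, count)
```

In this toy corpus most of the words occur individually, yet none of the adjacent pairs from the sentence has ever been seen, which is the situation the lecture describes: word-to-word statistics alone would not predict that the sentence sounds well formed.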


Language in general has long-distance dependencies. The word in one position in a sentence can dictate the choice of the word several positions downstream. For example, if you begin a sentence with “either,” somewhere down the line there has to be an “or.” If you have an “if,” generally you expect somewhere down the line there to be a “then.” There’s a story about a child who says to his father, “Daddy, what did you bring that book that I don’t want to be read to out of up for?”, where you have a set of nested or embedded long-distance dependencies.


Indeed, one of the applications of linguistics to the study of good prose style is that sentences can be rendered difficult to understand if they have too many long-distance dependencies, because that can put a strain on the short-term memory of the reader or listener while trying to understand them.


Rather than a set of word-by-word associations, sentences are assembled in a hierarchical structure that looks like an upside-down tree. Let me give you an example of how that works in the case of English. One of the basic rules of English is that a sentence consists of a noun phrase, the subject, followed by a verb phrase, the predicate. A second rule in turn expands the verb phrase: a verb phrase consists of a verb followed by a noun phrase, the object, followed by a sentence, the complement, as in “I told him that it was sunny outside.”
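The hierarchical structure described here can be sketched as a nested data structure. This is a minimal illustration, not a grammar of English: the tree is built by hand, and the category labels other than S, NP, VP, and V (namely C, A, and Adv) are assumptions made for the example.

```python
# A hand-built phrase-structure tree for "I told him that it was sunny outside".
# Each node is a tuple (label, child, child, ...); a bare string is a word.
# Labels C, A and Adv are illustrative assumptions, not from the lecture.
TREE = (
    "S",
    ("NP", "I"),
    ("VP",
        ("V", "told"),
        ("NP", "him"),
        ("S",
            ("C", "that"),
            ("NP", "it"),
            ("VP", ("V", "was"), ("A", "sunny"), ("Adv", "outside")),
        ),
    ),
)


def leaves(node):
    """Return the words of a tree, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the category label
        words.extend(leaves(child))
    return words


def depth(node):
    """Return the number of phrase levels above the deepest word."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1:])


sentence = " ".join(leaves(TREE))
print(sentence)
print("levels of structure:", depth(TREE))
```

Reading the leaves from left to right recovers the flat word string, while the nesting records that a whole sentence sits inside the verb phrase of another sentence.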



Now, why do linguists insist that language must be composed out of phrase structure rules?


Well, for one thing, that helps explain the main phenomenon that we want to explain, namely the open-ended creativity of language.

It allows us to express unfamiliar meanings. There’s a cliché in journalism, for example, that when a dog bites a man, that isn’t news, but when a man bites a dog, that is news. The beauty of grammar is that it allows us to convey news by assembling familiar words in brand-new combinations.


Also, because of the way phrase structure rules work, they produce a vast number of possible combinations. Moreover, the number of different thoughts that we can express through the combinatorial power of grammar is not just humongous but, in a technical sense, infinite. Now of course, no one lives an infinite number of years and therefore can show off their ability to understand an infinite number of sentences, but you can make the point in the same way that a mathematician can say that someone who understands the rules of arithmetic knows that there are an infinite number of numbers: if anyone ever claimed to have found the largest one, you can always come up with one that’s even bigger by adding one to it. And you can do the same thing with language.



As a matter of fact, there has been a claim that there is a world’s longest sentence. Who would make such a claim? Well, who else? The Guinness Book of World Records. You can look it up. There is an entry for the World’s Longest Sentence. It is 1,300 words long, and it comes from a novel by William Faulkner. Now I won’t read all 1,300 words, but I’ll just tell you how it begins.

“They both bore it as though in deliberate flatulent exaltation…” and it runs on from there.


But I’m here to tell you that in fact, this is not the world’s longest sentence. And I’ve been tempted to obtain immortality in Guinness by submitting the following record breaker: “Faulkner wrote, they both bore it as though in deliberate flatulent exaltation…”


But sadly, this would not be immortality after all but only the proverbial 15 minutes of fame, because based on what you now know, you could submit a record breaker for the record breaker, namely “Guinness noted that Faulkner wrote…”, or “Pinker mentioned that Guinness noted that Faulkner wrote…”, or “Who cares that Pinker mentioned that Guinness noted that Faulkner wrote…”
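The record-breaking trick is an instance of recursion: any sentence can be embedded inside a longer one. The sketch below illustrates this with a hypothetical helper, `embed`, and uses only the opening words of the Faulkner sentence quoted in the lecture rather than the full 1,300 words.

```python
def embed(sentence: str, frame: str) -> str:
    """Wrap a sentence inside a longer one by prefixing a framing clause."""
    return f"{frame} {sentence}"


# Opening words only; the full sentence is not reproduced here.
core = "they both bore it as though in deliberate flatulent exaltation"
frames = ["Faulkner wrote,", "Guinness noted that", "Pinker mentioned that"]

sentence = core
lengths = [len(sentence.split())]
for frame in frames:
    sentence = embed(sentence, frame)
    lengths.append(len(sentence.split()))

print(sentence)
print(lengths)
```

Each application of `embed` yields a strictly longer sentence, and nothing in the rule ever stops it from applying again, which is why no sentence can be the longest one.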


The same sentence can mean entirely different things depending on how you group its words. This is best illustrated by ambiguous sentences, where the same string of words can be grouped into phrases in different ways, each of which has a different meaning.


Take, for example, the following wonderfully ambiguous sentence that appeared in TV Guide.

“On tonight’s program, Conan will discuss sex with Dr. Ruth.” Now this has a perfectly innocent meaning in which the verb “discuss” involves two things, namely the topic of discussion, “sex,” and the person with whom it’s being discussed, in this case Dr. Ruth. But it has a somewhat naughtier meaning if you rearrange the words into phrases according to a different structure, in which case “sex with Dr. Ruth” is the topic of conversation, and that’s what’s being discussed.



Now, phrase structure not only can account for our ability to produce so many sentences, but it’s also necessary for us to understand what they mean. The geometry of branches in a phrase structure is essential to figuring out who did what to whom.


Another important contribution of Chomsky to the science of language is the focus on language acquisition by children. Now, children can’t simply memorize sentences, because knowledge of language isn’t just one long list of memorized sentences; somehow they must distill or abstract out the rules that go into assembling sentences, based on what they hear coming out of their parents’ mouths when they are little. And the talent of using rules to produce combinations is in evidence from the moment that kids begin to speak.


At the two-word stage, which you typically see in children who are 18 months or a bit older, kids are producing the smallest sentences that deserve to be counted as sentences, namely two words long. But already it’s clear that they are putting them together using rules in their own minds. To take an example, a child might say “more outside,” meaning take them outside or let them stay outside. Now, adults don’t say “more outside.”

So it’s not a phrase that the child simply memorized by rote; it shows that already children are using these rules to put together new combinations. Another example: a child having jam washed from his fingers said to his mother, “all gone sticky.” Again, not a phrase that you could ever have copied from a parent, but one that shows the child producing new combinations.



An easy way of showing that children assimilate rules of grammar unconsciously from the moment they begin to speak is their use of the past-tense rule. For example, children go through a long stage in which they make errors like “We holded the baby rabbits” or “He teared the paper and then he sticked it,” cases in which they overgeneralize the regular rule for forming the past tense, “add -ed,” to irregular verbs like “hold,” “stick,” or “tear.”


It’s also easy to get children to flaunt this ability to apply rules productively in a laboratory demonstration called the Wug Test. You bring a kid into a lab. You show them a picture of a little bird and you say, “This is a wug.” And you show them another picture and you say, “Well, now there are two of them. There are two…” and children will fill in the gap by saying “wugs.” Again, this is a form they could not have memorized, because it was invented for the experiment, but it shows that they have productive mastery of the regular plural rule in English.

(Further reading: 人生的第一句话怎样说出来 | 36个月)



And famously, Chomsky claimed that children solve the problem of language acquisition by having the general design of language already wired into them in the form of a universal grammar [21], a spec sheet for what the rules of any language have to look like.

[21] Universal grammar: a specification sheet that applies to any language.


What is the evidence that children are born with a universal grammar?


Well, surprisingly, Chomsky didn’t propose this by actually studying kids in the lab or kids in the home, but through a more abstract argument called “the poverty of the input.”


Namely, if you look at what goes into the ears of a child and look at the talent they end up with as adults, there is a big chasm between them that can only be filled in by assuming that the child has a lot of knowledge of the way that language works already built in.


Here’s how the argument works.

One of the things that children have to learn when they learn English is how to form a question.


Now, children will get evidence from their parents’ speech as to how the question rule works, such as sentences like “The man is here” and the corresponding question, “Is the man here?”


Now, logically speaking, a child getting that kind of input could posit two different kinds of rules. There’s a simple word-by-word linear rule: in this case, find the first “is” in the sentence and move it to the front.

“The man is here.” → “Is the man here?”


Now there’s a more complex rule that the child could posit, called a structure-dependent rule, one that looks at the geometry of the phrase structure tree. In this case, the rule would be: find the first “is” after the subject noun phrase and move that to the front of the sentence.



Now, what’s the difference between the simple word-by-word rule and the more complex structure-dependent rule?


Well, you can see the difference when it comes to forming the question from a slightly more complex sentence like “The man who is tall is in the room.”

Now, the simple word-by-word rule will give you “Is the man who tall is in the room?”

Now, of course that is word salad and makes no sense. It has to be “Is the man who is tall in the room?”, showing us that it is the structure-dependent rule, not the word-by-word rule, that is correct.
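The two candidate rules can be written out as small functions to show where they diverge. This is a sketch under a simplifying assumption: the split between the subject noun phrase and the rest of the sentence is supplied by hand, whereas a real system would need a parser to find it. The function names are hypothetical.

```python
def linear_rule(words):
    """Word-by-word rule: move the first "is" in the string to the front."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]


def structure_rule(subject_np, predicate):
    """Structure-dependent rule: move the "is" that follows the subject NP.

    The subject/predicate split is given by hand in this sketch.
    """
    if predicate[0] != "is":
        raise ValueError("expected the predicate to start with 'is'")
    return ["is"] + subject_np + predicate[1:]


examples = [
    ("the man".split(), "is here".split()),
    ("the man who is tall".split(), "is in the room".split()),
]

for subject_np, predicate in examples:
    words = subject_np + predicate
    print("linear:   ", " ".join(linear_rule(words)))
    print("structure:", " ".join(structure_rule(subject_np, predicate)))
```

On the simple sentence both rules produce the same question, so that input alone cannot tell a learner which rule is right; only the sentence with a relative clause separates them.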


But how is the child supposed to learn that? How did all of us end up with the correct structure-dependent version of the rule rather than the far simpler word-by-word version?


Well, Chomsky argues, if you were actually to look at the kind of language that all of us hear, it’s actually quite rare to hear a sentence like “Is the man who is tall in the room?”, the kind of input that would logically inform you that the word-by-word rule is wrong and the structure-dependent rule is right. Nonetheless, we all grow up into adults who unconsciously use the structure-dependent rule rather than the word-by-word rule.


Moreover, children don’t make errors like “Is the man who tall is in the room?”; as soon as they begin to form complex questions, they use the structure-dependent rule. And that, Chomsky argues, is evidence that structure-dependent rules are part of the definition of universal grammar that children are born with.



Now, though Chomsky has been fantastically influential in the science of language, that does not mean that all language scientists agree with him. And there have been a number of critiques of Chomsky over the years.


For one thing, the critics point out, Chomsky hasn’t really shown principles of universal grammar that are specific to language itself, as opposed to general ways in which the human mind works across multiple domains: language and vision and control of motion and memory and so on. We don’t really know that universal grammar is specific to language, according to this critique.


Secondly, Chomsky and the linguists working with him have not examined all 6,000 of the world’s languages and shown that the principles of universal grammar apply to all 6,000. They’ve posited it based on a small number of languages and the logic of the poverty of the input, but haven’t actually come through with the data that would be necessary to prove that universal grammar is really universal.

(Further reading: 重磅 | Yann LeCun推荐:新证据出现,乔姆斯基的普遍语法理论正被颠覆)


Finally, the critics argue, Chomsky has not shown that more general-purpose learning models, such as neural network models, are incapable of learning language together with all the other things that children learn, and therefore has not proven that there has to be specific knowledge of how grammar works in order for the child to learn grammar.

(Further reading: 神经网络模型Neural network models | 数据常青藤, http://www.dataivy.cn/blog/神经网络模型neural-network-models/)



Another component of language governs the sound pattern of language, the ways that vowels and consonants can be assembled into the minimal units that go into words.


· Rules: phonology


Phonology, as this branch of linguistics is called, consists of formation rules that capture what is a possible word in a language according to the way that it sounds.


To give you an example, the sequence “bluk” is not an English word, but you get a sense that it could be an English word, that someone could coin a new term of English that we pronounce “bluk.” But when you hear the sound “crachts,” you instantly know not only that it isn’t an English word but that it really couldn’t be one. “Crachts,” by the way, comes from Yiddish, and it means something like to sigh or to moan. The reason that we recognize that it’s not English is that it has sounds and sequences of sounds that aren’t part of the formation rules of English phonology.


But together with the rules that define the basic words of a language, there are also phonological rules that make adjustments to the sounds, depending on what other words the word appears with.


例如,我们很少意识到,在英语中,过去时后缀 ed 实际上有三种不同的发音方式。

在说 He walked(他走) 时,我们把 ed 发音为 t ;当我们说 jogged(慢跑) 时,我们把它发音为 d ;在说 patted(拍) 时,动词与后缀之间插入了一个元音——同一个后缀 ed 可以根据英语音系规则调整发音。

Very few of us realize, for example, in English, that the past tense suffix “ed” is actually pronounced in three different ways. When we say, “He walked,” we pronounce the “ed” like a “t,” walked. When we say “jogged,” we pronounce it as a “d,” jogged. And when we say “patted,” we stick in a vowel, pat-ted, showing that the same suffix, “ed,” can be readjusted in its pronunciation according to the rules of English phonology.
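The three pronunciations follow a pattern that can be written down as a small rule. The sketch below is a minimal illustration, not a full account of English phonology: it assumes the verb's final sound is supplied as a rough symbol, and the list of voiceless sounds is a simplified, hypothetical inventory.

```python
# Simplified set of voiceless final sounds (assumption; not exhaustive).
VOICELESS = {"p", "k", "f", "s", "sh", "ch", "th"}


def past_tense_suffix(final_sound):
    """Pick the pronunciation of "-ed" from the verb's final sound."""
    if final_sound in {"t", "d"}:
        return "id"  # insert a vowel: pat -> pat-ted
    if final_sound in VOICELESS:
        return "t"   # walk -> walked, pronounced with a "t"
    return "d"       # jog -> jogged, pronounced with a "d"


# Final sounds of the three verbs used in the lecture.
for verb, final in {"walk": "k", "jog": "g", "pat": "t"}.items():
    print(verb, "->", past_tense_suffix(final))
```

The order of the checks matters: the vowel-insertion case has to be tested first, because "t" would otherwise be treated like any other voiceless sound.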



在二语习得中,学习者会把母语的音系规则带入外语,口音由此产生。

Now, when someone acquires English as a foreign language, or acquires a foreign language in general, they carry over the rules of phonology of their first language and apply them to their second language. We have a word for it; we call it an “accent.”


而当语言使用者特别关注这些音系规则、语言使用超越了表意功能时,语言的声音之美得以彰显,我们就有了诗歌和修辞。

When a language user deliberately manipulates the rules of phonology, that is, when they don’t just speak in order to convey content but pay attention to what phonological structures are being used, we call it poetry and rhetoric.


到目前为止,我一直在谈论语言知识,以及定义可能的语言序列规则。但是这些序列必须在语言理解过程中输入大脑,并在语言产生过程中输出——这就把我们带到了“语言接口”的问题上。

So far, I’ve been talking about knowledge of language, the rules that go into defining what are possible sequences of language. But those sequences have to get into the brain during speech comprehension and they have to get out during speech production. And that takes us to the topic of language interfaces.


· 接口(interface)——语音的产生


让我们从语音的产生开始。

And let’s start with production.



这是一张人体剖面图,我们可以在横截面上看到声道。让我们用它来说明人是如何把语言知识转化为一连串声音传播出去的。

This diagram here is literally a human cadaver that has been sawn in half. An anatomist took a saw and [sound], allowing us to see in cross section the human vocal tract. And that can illustrate how we get our knowledge of language out into the world as a sequence of sounds.


我们每个人的气管顶部都有一个复杂的结构,叫做喉(俗称声匣),就在喉结后面。从肺里出来的空气必须经过两片软骨瓣,它们振动并产生一个丰富的、嗡嗡作响的、充满谐波的声源;这种振动的声音在传播到外界之前,必须依次通过声道的一连串腔室:舌头后面的咽喉、舌头上方的空腔、嘴唇形成的空腔;当你阻挡气流通过嘴时,它还可以从鼻腔出来。

Now, each of us has at the top of our windpipe, or trachea, a complex structure called the larynx, or voice box; it’s behind your Adam’s apple. The air coming out of your lungs has to go past two cartilaginous flaps that vibrate and produce a rich, buzzy sound source, full of harmonics. Before that vibrating sound gets out to the world, it has to pass through a gauntlet of chambers of the vocal tract: the throat behind the tongue, the cavity above the tongue, the cavity formed by the lips, and, when you block off airflow through the mouth, it can come out through the nose.



每一个空腔都有特定的形状,根据物理定律,它会放大声源中的一些谐波,并抑制另一些谐波。当我们移动舌头时,我们可以改变这些空腔的形状。当我们前后移动舌头时,例如在发 eh 、 aa 、 eh 、 aa 时,我们改变了舌头后面的腔体形状,继而改变了被放大或抑制的频率,听者就会把它们听成两个不同的元音。

Now, each one of those cavities has a shape that, thanks to the laws of physics, will amplify some of the harmonics in that buzzy sound source and suppress others. We can change the shape of those cavities when we move our tongue around. When we move our tongue forward and backward, for example, as in “eh,” “aa,” “eh,” “aa,” we change the shape of the cavity behind the tongue, change the frequencies that are amplified or suppressed, and the listener hears them as two different vowels.
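To make the physics concrete, here is a minimal sketch that treats a cavity as a uniform tube closed at one end and open at the other. This is a textbook idealization that the lecture itself does not spell out, and the speed of sound and the tube lengths are illustrative assumptions. Under that idealization the tube resonates at F_n = (2n - 1)c / 4L, so changing the length L shifts every amplified frequency.

```python
SPEED_OF_SOUND_CM_PER_S = 35_000.0  # assumed value for warm, moist air


def tube_resonances(length_cm, count=3):
    """Resonant frequencies (Hz) of a uniform tube closed at one end."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


# Two hypothetical tube lengths: shortening the tube raises every resonance.
for length_cm in (17.5, 14.0):
    print(length_cm, [round(f) for f in tube_resonances(length_cm)])
```

A real vocal tract is not a uniform tube, which is exactly why moving the tongue changes the vowel: each tongue position reshapes the cavities and therefore moves the resonances.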



同样,当我们抬起或放下舌头时,我们改变了舌头上方共振腔的形状,如 eh 、 ah 、 eh 、 ah ;谐波组合的变化又一次被感知为元音的变化。当我们先阻断气流再将其释放,如 t 、 ca 、 ba 时,我们听到的就是辅音而不是元音;而当我们只是限制气流,如 f 、 ss 时,则会产生一种混沌的噪音。每一个由不同发音器官塑造出来的声音,都会被大脑感知为性质不同的元音或辅音。

Likewise, when we raise or lower the tongue, we change the shape of the resonant cavity above the tongue, as in, say, “eh,” “ah,” “eh,” “ah.” Once again, the change in the mixture of harmonics is perceived as a change in the nature of the vowel. When we stop the flow of air and then release it, as in “t,” “ca,” “ba,” then we hear a consonant rather than a vowel, or even when we restrict the flow of air, as in “f,” “ss,” producing a chaotic, noisy sound. Each one of those sounds that gets sculpted by different articulators is perceived by the brain as a qualitatively different vowel or consonant.


人类声道的一个有趣的特点是:它同时容纳了呼吸和吞咽等功能,是一个“共享结构”。

Now, an interesting peculiarity of the human vocal tract is that it obviously co-opts structures that evolved for different purposes, for breathing and for swallowing and so on.


这个有趣的事实最早由达尔文指出:在进化过程中,喉在咽喉中的位置下降了,使得从口腔经食道进入胃的每一粒食物都必须越过喉的开口,有一定概率被吸入气管,从而带来窒息死亡的危险。事实上,在海姆利克急救法(the Heimlich Maneuver)发明之前,每年有几千人因为人类声道的这一“缺陷”而死于窒息。

And it’s an interesting fact, first noted by Darwin, that the larynx over the course of evolution has descended in the throat, so that every particle of food going from the mouth through the esophagus to the stomach has to pass over the opening into the larynx, with some probability of being inhaled, leading to the danger of death by choking. And in fact, until the invention of the Heimlich Maneuver, several thousand people every year died of choking because of this maladaptive feature of the human vocal tract.


为什么我们会进化出一种使我们容易窒息的口腔和喉咙?

Why did we evolve a mouth and throat that leaves us vulnerable to choking?


一个看似合理的假设是,这是在进化过程中达成的一种妥协,来允许我们说话。

Well, a plausible hypothesis is that it’s a compromise that was made in the course of evolution to allow us to speak.


通过让舌头前后、上下移动,增加改变共振腔的各种可能性,我们扩大了发声范围,提高了语言效率,但代价是窒息风险的增加。这表明语言可能具有某种生存优势,足以弥补窒息带来的劣势。

By giving range to a variety of possibilities for altering the resonant cavities, for moving the tongue back and forth and up and down, we expanded the range of speech sounds we could make and improved the efficiency of language, but suffered the compromise of an increased risk of choking, showing that language presumably had some survival advantage that compensated for the disadvantage in choking.


· 接口(interface)——语言理解


那么,我们该如何解释信息流动的另一个方向,即从外界进入大脑——语言理解的过程呢?

What about the flow of information in the other direction, that is from the world into the brain, the process of speech comprehension?


语言理解是一个非常复杂的计算过程。当我们与电话里的语音信箱菜单交流,或者在电脑上使用听写功能时,这一过程的复杂程度就更加明显了。

Speech comprehension turns out to be an extraordinarily complex computational process, which we're reminded of every time we interact with a voicemail menu on a telephone or use dictation on our computers.


比如,有个作家想利用最先进的语音文字转换系统录入 book tour(新书宣传) 这几个字,然而屏幕上总是显示 back to work(返回工作) ;想要录入 I truly couldn’t see(我真的看不见) ,屏幕上却显示, a cruelly good MC(一个残酷的好主持) 。更尴尬的是,当他想给父母写一封信,录入 Dear mom and dad(亲爱的爸爸妈妈) ,而屏幕上的反馈却是 The man is dead(那个人死了) 。

For example, one writer, using a state-of-the-art speech-to-text system, dictated the following words into his computer. He dictated “book tour,” and it came out on the screen as “back to work.” Another example, he said, “I truly couldn’t see,” and it came out on the screen as, “a cruelly good MC.” Even more disconcertingly, he started a letter to his parents by saying, “Dear mom and dad,” and what came out on the screen, “The man is dead.”


虽然听写系统越来越好,但是还无法取代人类速记员;问题是,究竟是什么让语音理解对于一个人来说如此容易,但对于计算机来说却如此困难?

有两个主要的原因。

Now, dictation systems have gotten better and better, but they still have a way to go before they can duplicate a human stenographer. What is it about the problem of speech understanding that makes it so easy for a human, but so hard for a computer?

Well, there are two main contributors.


其中一个原因是,每一个音素在不同的语音条件下的实际发音是不同的——这取决于它们之前和之后都是什么。这种现象有时被称为“协同发音”。

One of them is the fact that each phoneme, each vowel or consonant, actually comes out very differently depending on what comes before and what comes after, a phenomenon sometimes called co-articulation.


举个例子。Cape Cod(科德角)这个地名里有两个 c 音,两者都用字母 C(硬音 C)来表示;尽管如此,它们的发音部位实际上并不一样:一个 c 产生于口腔后部,另一个 c 则靠前得多。我们没有注意到自己在用两种不同的方式发 c 音,这取决于它后面跟的是 a 还是 ah ;但这种差异改变了口中共振腔的形状,从而产生完全不同的波形。

Let me give you an example.

The place called Cape Cod has two “c” sounds. Each of them symbolized by the letter “C,” the hard “C.” Nonetheless, when you pay attention to the way you pronounce them, you notice that in fact, you pronounce them in very different parts of the mouth.

Try it. Cape Cod, Cape Cod… “c,” “c”.

In one case, the “c” is produced way back in the mouth; in the other it’s produced much farther forward. We don’t notice that we pronounce “c” in two different ways depending on whether it comes before an “a” or an “ah,” but that difference forms a difference in the shape of the resonant cavity in our mouth, which produces a very different waveform.


除非一台计算机被专门编程来考虑这种变化,否则它会把这两个 c 当作两种不同的声音;客观上说, c-eh 和 c-oa 确实是不同的声音,只是我们的大脑把它们归为了一类。

And unless a computer is specifically programmed to take that variability into account, it will perceive those two different “c’s” as the different sounds that, objectively speaking, they really are: “c-eh,” “c-oa.”

They really are different sounds, but our brain lumps them together.


造成语音识别困难的另一个原因是,语音流没有分节。

The other reason that speech recognition is such a difficult problem is because of the absence of segmentation.


我们有一种错觉,以为听到的话语是由一系列与单词对应的声音组成的。但是,如果你实际观察一下示波器上句子的波形,就会发现单词之间并没有像印刷文字中的空格那样的停顿,而是一条连续的丝带,一个词的结尾直接连着下一个词的开头。

Now, we have an illusion when we listen to speech that it consists of a sequence of sounds corresponding to words. But if you actually were to look at the waveform of a sentence on an oscilloscope, there would not be little silences between the words the way there are little bits of white space in printed words on a page, but rather a continuous ribbon in which the end of one word leads right to the beginning of the next.



这是当我们在听外语演讲的时候意识到的东西——我们不知道一个词在哪里结束,另一个词从哪里开始。使用熟悉的语言时,我们之所以可以检测单词的边界,仅仅是因为在我们的心理词典中,每个词都有对应的声音,提示我们词的结尾在哪。

It’s something that we’re aware of when we listen to speech in a foreign language, when we have no idea where one word ends and the other one begins. In our own language, we detect the word boundaries simply because in our mental lexicon, we have stretches of sound that correspond to one word that tell us where it ends.


但你不能从波形本身得到这种信息。事实上,一种文字游戏就利用了 “语音波形中不存在文字边界” 这个事实。让我们看一段看起来没有意义的歌词: Mairzy doats and dozy doats and liddle lamzy divey kiddley divey do, wooden shoe。 

But you can’t get that information from the waveform itself. In fact, there’s a whole genre of wordplay that takes advantage of the fact that word boundaries are not physically present in the speech wave. Novelty songs like “Mairzy doats and dozy doats and liddle lamzy divey kiddley divey do, wooden shoe.”


如果我们读出来会发现,其实每个词都是英语词,连成的句子也符合语法:Mares eat oats and does eat oats and little lambs eat ivy, a kid'll eat ivy too, wouldn’t you(母马吃燕麦,雌鹿吃燕麦,小羊羔吃常春藤,小山羊也会吃常春藤,不是吗)?当这段歌词被连起来说的时候,词之间的界限被消除了,所以同样的声音序列会被当成词语杂拌。

Now, it turns out that this is actually a grammatical sequence of words in English… Mares eat oats and does eat oats and little lambs eat ivy, a kid'll eat ivy too, wouldn’t you?

When it is spoken or sung normally, the boundaries between words are obliterated and so the same sequence of sounds can be perceived either as nonsense or if you know what they’re meant to convey, as sentences.



还有这样的童谣: Fuzzy Wuzzy was a bear, Fuzzy Wuzzy had no hair. Fuzzy Wuzzy wasn’t very fuzzy, was he(毛茸茸的伍兹是一只熊,毛茸茸的伍兹没有毛。毛茸茸的伍兹并不怎么毛茸茸,是吗)?

Another example familiar to most children, Fuzzy Wuzzy was a bear, Fuzzy Wuzzy had no hair. Fuzzy Wuzzy wasn’t very fuzzy, was he?



还有著名的打油诗: I scream, you scream, we all scream for ice cream(我尖叫,你尖叫,我们都为冰淇淋尖叫) 。只有念出来,才能发现它们其实都巧妙地模糊了词之间的边界。

And the famous doggerel, I scream, you scream, we all scream for ice cream.
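The role of the mental lexicon in finding word boundaries can be sketched as a small search problem. The example below is only an illustration under stated assumptions: the sound stream is written in a made-up rough spelling (`ayskriym`), and the four-entry `LEXICON` is hypothetical. It enumerates every way of cutting the unbroken stream into stretches that match a known word.

```python
from functools import lru_cache

# Hypothetical sound-to-word lexicon; the spellings are rough stand-ins for phonemes.
LEXICON = {"ay": "I", "ays": "ice", "skriym": "scream", "kriym": "cream"}


def segmentations(stream):
    """Return every way to split an unbroken sound stream into known words."""

    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(stream):
            return ((),)  # one way to parse nothing: the empty parse
        parses = []
        for end in range(start + 1, len(stream) + 1):
            word = LEXICON.get(stream[start:end])
            if word is not None:
                # Keep this word and combine it with every parse of the remainder.
                parses.extend((word,) + rest for rest in split_from(end))
        return tuple(parses)

    return [list(parse) for parse in split_from(0)]


print(segmentations("ayskriym"))
```

With this lexicon the same stream comes back both as "I scream" and as "ice cream", which is the ambiguity the rhyme plays on. Nothing in the stream itself marks the boundary; only the lexicon supplies candidate cuts.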



平时说话的时候,我们很少意识到句子的歧义。这是因为在一个语境中,我们能毫不费力地推导出话语的意图。但是一台可怜的计算机并没有配备我们所有的常识,而仅仅是通过单词与规则运作,往往会因为各种不同的可能性而一脸懵逼。

We are generally unaware of how ambiguous language is. In context, we effortlessly and unconsciously derive the intended meaning of a sentence, but a poor computer, not equipped with all of our common sense and human abilities and just going by the words and the rules, is often flabbergasted by all the different possibilities.


就拿一个简单的句子来说,像 Mary had a little lamb(玛丽有一只小羊羔) 这句话,你可能会认为这是一个非常简单、没有歧义的句子。

Take a sentence as simple as “Mary had a little lamb”; you might think that that’s a perfectly simple, unambiguous sentence.



现在,假设这句话后面接上 with mint sauce(配薄荷酱) 。你马上就会意识到, have 实际上是一个歧义很多的词:它既可以表示“拥有”,也可以表示“吃”。

But now imagine that it was continued with “with mint sauce.” You realize that “have” is actually a highly ambiguous word.



因此,计算机翻译常常会带来可笑的错误。据说,第一批计算机系统从英语翻译到俄语,然后再翻译回来,把 The spirit is willing, but the flesh is weak(心有余而力不足) 这样的句子翻译成了 The vodka is agreeable, but the meat is rotten(伏特加很好喝,但肉烂了) 。

As a result, the computer translations can often deliver comically incorrect results. According to legend, one of the first computer systems that was designed to translate from English to Russian and back again did the following given the sentence, “The spirit is willing, but the flesh is weak,” it translated it back as “The vodka is agreeable, but the meat is rotten.”


· 接口(interface)——语用学(Pragmatics)


那么,为什么人比计算机更善于理解语言呢?我们拥有的究竟是哪些知识,如此难以编进机器?

So why do people understand language so much better than computers? What is the knowledge that we have that has been so hard to program into our machines?


语言和头脑的其他部分之间还有第三种“接口”,这就是语言学分支——语用学的主题,即人们如何利用“他们对世界的了解”以及“他们对其他发言者如何交流的期望”来理解语言。

Well, there’s a third interface between language and the rest of the mind, and that is the subject matter of the branch of linguistics called Pragmatics, namely, how people understand language in context using their knowledge of the world and their expectation about how other speakers communicate.


语用学最重要的原则叫做“合作原则”:假定对话的另一方正在与你合作,力求真实而清晰地把意思传达出来。

The most important principle of Pragmatics is called “the cooperative principle,” namely; assume that your conversational partner is working with you to try to get a meaning across truthfully and clearly.


我们应用语用学的知识,就像运用句法和音系知识一样,虽然看起来毫不费力,但其实涉及到许多复杂的计算。

例如,如果我说,“如果你能把鳄梨酱递给我就太棒了。”你明白这是个礼貌的要求,意思是请你给我鳄梨酱;你不会把它理解成对一件假想的事情的思考,你只会认为这个人是想要什么,他用这一系列词语是要礼貌地传达请求。

And our knowledge of Pragmatics, like our knowledge of syntax and phonology and so on, is deployed effortlessly, but involves many intricate computations.

For example, if I were to say, “If you could pass the guacamole, that would be awesome,” you understand that as a polite request meaning, give me the guacamole. You don’t interpret it literally as a rumination about a hypothetical affair; you just assume that the person wanted something and was using that string of words to convey the request politely.



喜剧常常把机器人缺乏语用能力当作笑料。比如老情景喜剧《糊涂侦探》(Get Smart)里有一个名叫嗨米(Hymie,“Hi,me”的谐音)的机器人,剧中一个反复出现的笑话是,麦克斯韦·斯马特会对嗨米说:“Hymie, can you give me a hand(嗨米,你能帮我一把吗)?”

Often comedies will use the absence of pragmatics in robots as a source of humor. As in the old “Get Smart” situation comedy, which had a robot named Hymie, and a recurring joke in the series would be that Maxwell Smart would say to Hymie, “Hymie, can you give me a hand?”


然后嗨米就会过来,把他的手拿下来交给麦克斯韦·斯马特,不明白 give me a hand(帮把手) 在语境中意思是帮助我,而不是真的把手转交给我。

And then Hymie would go, {sound}, remove his hand and pass it over to Maxwell Smart not understanding that “give me a hand,” in context means, help me rather than literally transfer the hand over to me.



请你考虑这样一段对话。

女朋友玛莎对男朋友约翰说:“我要离开你了。”

约翰说:“那男的是谁?”

Or take the following example of Pragmatics in action.

Consider the following dialogue, Martha says, “I’m leaving you.”

John says, “Who is he?”


要想理解这段对话,就需要找出代词的先行词,即在这里 he(那男的)指的是谁。任何一个合格的英语使用者都清楚这个“他”是谁:大概是约翰的情敌,尽管对话中从未明确指出这一点。这个例子说明,我们在理解语言时调用了大量关于人类行为、人类互动和人际关系的知识;即使是解决“代词指的是谁”这样的机械问题,也常常要用到这些背景知识。而要把这些知识编进计算机,说它极其困难都算轻的。

Now, understanding language requires finding the antecedents of pronouns, in this case who the “he” refers to, and any competent English speaker knows exactly who the “he” is, presumably John’s romantic rival, even though it was never stated explicitly in any part of the dialogue. This shows how we bring to bear on language understanding a vast store of knowledge about human behavior, human interactions, human relationships. And we often have to use that background knowledge even to solve mechanical problems like, who does a pronoun like “he” refer to. It’s that knowledge that’s extraordinarily difficult, to say the least, to program into a computer.




结语

语言是自然界的奇迹,它让我们用有限的思维工具来交流无限的思想;这个思维工具不仅包含了大量的词汇,还包括了把词汇组合排列的强大心理语法。语言不应与书写、语法规范/文体规则、或思维本身相混淆。


现代语言学以语言学家诺姆·乔姆斯基的问题为指导(虽然并不总是以他的结论作为答案),即语言为什么会有无限创造力:

 · 把词与词连起来组织成句的抽象认知结构应该是什么样子的?儿童如何习得这个结构? 

 · 不同的语言有哪些普遍性?语言的普遍性又如何解释人类的心智?


语言的研究有许多实际应用,包括计算机的理解和发言、语言障碍的诊断和治疗,阅读、写作和外语的教学,以及法律、政治和文学语言的解释。


但对于像我这样的人来说,语言永远是迷人的,因为它要解决的是人类认知最基本的问题——语言实际上处于思想、社会关系、生理、进化等等一系列人类最根本问题的交叉点。



语言是人类最独特的天赋,它也是窥探人性的窗口;最重要的是,语言的巨大表现力是自然界的奇迹之一。

谢谢。



讲座链接:

https://www.youtube.com/watch?v=Q-B_ONJIEcE&t=1840s

文中图片来自讲座视频截图和网络


END



推荐阅读

The Better Angels of Our Nature,Steven Pinker,2011

《人性中的善良天使——一部人类新史》,史蒂芬·平克

“扎克伯格认为,Pinker的研究能提供一种改变人生的观点,他探究了暴力尽管被24小时新闻和社交媒体放大,是如何随着时间推移而减少的。值得注意的是,这本书也是比尔·盖茨认为他读过的最重要的书之一。”


The Language Instinct,Steven Pinker,1995

《语言本能》,史蒂芬·平克著,欧阳明亮译,2015


The Blank Slate,Steven Pinker,2002

《白板》,史蒂芬·平克著,袁冬华译,2016


The Stuff of Thought,Steven Pinker,2007

《思想本质》 ,史蒂芬·平克著,张旭红、梅德明译,2015


How the Mind Works,Steven Pinker,2009

《心智探奇》,史蒂芬·平克著,郝耀伟译,2016


“1990 年,平克和他的学生保罗•布鲁姆(Paul Bloom,现耶鲁大学心理学教授)联名发表论文《自然语言和自然选择》(Natural Language and Natural Selection),在学术界引起巨大反响。在这篇论文启发下,平克出版了《语言本能》,该书迅速走俏,成为当时的畅销书,随后还入选《美国科学家》(American Scientist)评出的 “20 世纪百本最佳科学书籍”。凭借此书的成功,平克得以拓展研究范围,重新开始思索更宽泛的人性问题。2016年年底,他该系列的最后一本《白板》,也终于引进国内翻译完成。”









推荐人介绍

清华大学外文系本科,目前马里兰大学语言学博士在读。

研究儿童语言习得,方向主要为理论语义学和语义习得。

院系个人主页:http://ling.umd.edu/people/person/yuan-yang/





文末彩蛋

配乐理由

感谢中国音乐学院作曲系,独立音乐人黄笛的科普


根据乔姆斯基的说法,语言的本质就是“递归”(recursion)。“递归”指的是相同的规则可以在一个结构里重复使用。例如:小明说小红告诉他小张昨天看见小刘和小李讨论小王在小赵家里陪小陈和小杨聊关于小徐说小万觉得...

而巴赫的曲子全是 recursion ,因此语言学学生们常开玩笑说,学习语言学就要听巴赫。巴赫是巴洛克时期作曲家,那个时候的音乐属于复调音乐(polyphonic music)。复调音乐就是由一个主题衍生出多个声部,不断的通过模进、移位、倒影、逆行等变化这个主题,但是翻过来覆过去,始终都还是这个原始的主题。后来有人尝试把巴赫一首曲子的各个声部链接起来,发现是一个完美的循环。从另一个角度讲,巴赫的主题会在各个声部轮流出现,而且几乎不间断,就很像每个声部在进行对话,你一句我一句,中间时不时还会有人插话,扯跑题再扯回来。

再给大家推荐两首代表性的曲子:Cello Suite No. 1 in G major (BWV 1007): I. Prelude;Air on the G String
