
Journal Update | SSCI Journal Computational Linguistics, 2023 Issues 1–4

语言学心得 (Linguistics Insights)

2024-09-03

COMPUTATIONAL LINGUISTICS

Volume 49, Issues 1–4, 2023

COMPUTATIONAL LINGUISTICS (SSCI Q1; 2022 impact factor: 9.3) published 27 papers across its four 2023 issues: 19 research articles, 5 book reviews, and 3 survey papers. The research articles cover Transformers, multitask learning, machine translation, natural language generation, emotion analysis, data annotation, natural language inference, cross-lingual syntax, language variation, typology, and language evolution. Please feel free to share this digest!

Previously:

Journal Update | SSCI Journal Computational Linguistics, 2022 Volume 48, Issues 1–4

Table of Contents


ISSUE 1

Articles

■ Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction by Enrica Troiano, Laura Oberländer, Roman Klinger, Pages 1–72.

■ Transformers and the Representation of Biomedical Background Knowledge by Oskar Wysocki, Zili Zhou, Paul O’Regan, Deborah Ferreira, Magdalena Wysocka, Pages 73–115.

■ It Takes Two Flints to Make a Fire: Multitask Learning of Neural Relation and Explanation Classifiers by Zheng Tang, Mihai Surdeanu, Pages 117–156.

■ Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future by Jan-Christoph Klie, Bonnie Webber, Iryna Gurevych, Pages 157–198.

■ Curing the SICK and Other NLI Maladies by Aikaterini-Lida Kalouli, Hai Hu, Alexander F. Webb, Lawrence S. Moss, Valeria de Paiva, Pages 199–243.


Book Reviews

■ Finite-State Text Processing by Aniello De Santo, Pages 245–247.

■ Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science by Richard Futrell, Pages 249–251.

■ Pretrained Transformers for Text Ranking: BERT and Beyond by Suzan Verberne, Pages 253–255.

■ Conversational AI: Dialogue Systems, Conversational Agents, and Chatbots by Michael McTear, reviewed by Olga Seminck, Pages 257–259.


ISSUE 2

Articles

■ Gradual Modifications and Abrupt Replacements: Two Stochastic Lexical Ingredients of Language Evolution by Michele Pasquini, Maurizio Serva, Davide Vergni, Pages 301–323.

■ Data-driven Cross-lingual Syntax: An Agreement Study with Massively Multilingual Models by Andrea Gregor de Varda, Marco Marelli, Pages 261–299.

■ Onception: Active Learning with Expert Advice for Real World Machine Translation by Vânia Mendonça, Ricardo Rei, Luísa Coheur, Alberto Sardinha, Pages 325–372.

■ Reflection of Demographic Background on Word Usage by Aparna Garimella, Carmen Banea, Rada Mihalcea, Pages 373–394.

■ Certified Robustness to Text Adversarial Attacks by Randomized [MASK] by Jiehang Zeng, Jianhan Xu, Xiaoqing Zheng, Xuanjing Huang, Pages 395–427.

■ The Analysis of Synonymy and Antonymy in Discourse Relations: An Interpretable Modeling Approach by Asela Reig Alamillo, David Torres Moreno, Eliseo Morales González, Mauricio Toledo Acosta, Antoine Taroni, Pages 429–464.


Survey Papers

■ From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation by Marianna Apidianaki, Pages 465–523.


ISSUE 3

Articles

■ Comparing Selective Masking Methods for Depression Detection in Social Media by Chanapa Pananookooln, Jakrapop Akaranee, Chaklam Silpasuwanchai, Pages 525–553.

■ Cross-Lingual Transfer with Language-Specific Subnetworks for Low-Resource Dependency Parsing by Rochelle Choenni, Dan Garrette, Ekaterina Shutova, Pages 613–641.

■ Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model by Chris van der Lee, Thiago Castro Ferreira, Chris Emmery, Travis J. Wiltshire, Emiel Krahmer, Pages 555–592.


Book Review

■ Statistical Methods for Annotation Analysis by Rodrigo Wilkens, Pages 593–633.


Survey Papers

■ Grammatical Error Correction: A Survey of the State of the Art by Christopher Bryant, Zheng Yuan, Muhammad Reza Qorib, Hannan Cao, Hwee Tou Ng, Ted Briscoe, Pages 643–701.

■ Machine Learning for Ancient Languages: A Survey by Thea Sommerschield, Yannis Assael, John Pavlopoulos, Vanessa Stefanak, Andrew Senior, Chris Dyer, John Bodel, Jonathan Prag, Ion Androutsopoulos, Nando de Freitas, Pages 703–747.



ISSUE 4

Articles

■ Measuring Attribution in Natural Language Generation Models by Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter, Pages 777–840.

■ Capturing Fine-Grained Regional Differences in Language Use through Voting Precinct Embeddings by Alex Rosenfeld, Lars Hinrichs, Pages 883–942.

■ Generation and Polynomial Parsing of Graph Languages with Non-Structural Reentrancies by Johanna Björklund, Frank Drewes, Anna Jonsson, Pages 841–882.

■ Languages Through the Looking Glass of BPE Compression by Ximena Gutierrez-Vasques, Christian Bentz, Tanja Samardžić, Pages 943–1001.

■ Language Embeddings Sometimes Contain Typological Generalizations by Robert Östling, Murathan Kurfalı, Pages 1003–1051.


Abstracts

Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction

Enrica Troiano, Institut für Maschinelle Sprachverarbeitung, University of Stuttgart 

Laura Oberländer, Institut für Maschinelle Sprachverarbeitung, University of Stuttgart 

Roman Klinger, Institut für Maschinelle Sprachverarbeitung, University of Stuttgart

Abstract The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, appealing to an empathetic, intersubjective understanding of events, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims at explaining the link between events and emotions. Appraisals can be formalized as variables that measure a cognitive evaluation by people living through an event that they consider relevant. They include the assessment if an event is novel, if the person considers themselves to be responsible, if it is in line with their own goals, and so forth. Such appraisals explain which emotions are developed based on an event, for example, that a novel situation can induce surprise or one with uncertain consequences could evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text with the goal of understanding if appraisal concepts can reliably be reconstructed by annotators, if they can be predicted by text classifiers, and if appraisal concepts help to identify emotion categories. To achieve that, we compile a corpus by asking people to textually describe events that triggered particular emotions and to disclose their appraisals. Then, we ask readers to reconstruct emotions and appraisals from the text. This set-up allows us to measure if emotions and appraisals can be recovered purely from text and provides a human baseline to judge a model’s performance measures. Our comparison of text classification methods to human annotators shows that both can reliably detect emotions and appraisals with similar performance. Therefore, appraisals constitute an alternative computational emotion analysis paradigm and further improve the categorization of emotions in text with joint models.
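For readers who want a concrete picture, here is a minimal sketch of the prediction setting the paper studies: mapping an event description to several appraisal variables at once. The toy texts, the three appraisal dimensions, and the model choice are all illustrative assumptions, not the authors' setup.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

texts = [
    "A stranger suddenly knocked on my window at night.",
    "I forgot my best friend's birthday.",
    "I finally passed the exam I had studied months for.",
    "My flight was cancelled without any warning.",
]
# One binary label per appraisal dimension: [novelty, own_responsibility, goal_conducive]
labels = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]

clf = make_pipeline(TfidfVectorizer(),
                    MultiOutputClassifier(LogisticRegression(max_iter=1000)))
clf.fit(texts, labels)
print(clf.predict(["I unexpectedly won a prize I had worked hard for."]))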


Transformers and the Representation of Biomedical Background Knowledge

Oskar Wysocki, Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester

Zili Zhou, Department of Computer Science, University of Manchester

Paul O’Regan, Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester

Abstract Specialized transformers-based models (such as BioBERT and BioMegatron) are adapted for the biomedical domain based on publicly available biomedical corpora. As such, they have the potential to encode large-scale biological knowledge. We investigate the encoding and representation of biological knowledge in these models, and its potential utility to support inference in cancer precision medicine—namely, the interpretation of the clinical significance of genomic alterations. We compare the performance of different transformer baselines; we use probing to determine the consistency of encodings for distinct entities; and we use clustering methods to compare and contrast the internal properties of the embeddings for genes, variants, drugs, and diseases. We show that these models do indeed encode biological knowledge, although some of this is lost in fine-tuning for specific tasks. Finally, we analyze how the models behave with regard to biases and imbalances in the dataset.
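A rough sketch of the probing-by-clustering setup described above: pull contextual embeddings for biomedical entity names out of a domain-adapted transformer and see whether entity types separate into clusters. The checkpoint name is an assumption (any BERT-style biomedical model would do), and the entity list is illustrative. Requires transformers, torch, and scikit-learn.

import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

name = "dmis-lab/biobert-v1.1"  # assumed publicly available checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

entities = ["BRCA1", "EGFR", "imatinib", "gefitinib", "melanoma", "glioblastoma"]
with torch.no_grad():
    enc = tok(entities, padding=True, return_tensors="pt")
    out = model(**enc).last_hidden_state          # (batch, seq, hidden)
    mask = enc["attention_mask"].unsqueeze(-1)    # ignore padding tokens
    vecs = (out * mask).sum(1) / mask.sum(1)      # mean-pooled entity embeddings

# Do genes, drugs, and diseases fall into separate clusters?
print(KMeans(n_clusters=3, n_init=10).fit_predict(vecs.numpy()))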


It Takes Two Flints to Make a Fire: Multitask Learning of Neural Relation and Explanation Classifiers

Zheng Tang, University of Arizona, Department of Computer Science

Mihai Surdeanu, University of Arizona, Department of Computer Science

Abstract We propose an explainable approach for relation extraction that mitigates the tension between generalization and explainability by jointly training for the two goals. Our approach uses a multi-task learning architecture, which jointly trains a classifier for relation extraction, and a sequence model that labels words in the context of the relations that explain the decisions of the relation classifier. We also convert the model outputs to rules to bring global explanations to this approach. This sequence model is trained using a hybrid strategy: supervised, when supervision from pre-existing patterns is available, and semi-supervised otherwise. In the latter situation, we treat the sequence model’s labels as latent variables, and learn the best assignment that maximizes the performance of the relation classifier. We evaluate the proposed approach on two datasets and show that the sequence model provides labels that serve as accurate explanations for the relation classifier’s decisions, and, importantly, that the joint training generally improves the performance of the relation classifier. We also evaluate the performance of the generated rules and show that the new rules are a great add-on to the manual rules and bring the rule-based system much closer to the neural models.
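The multitask architecture can be pictured as a shared encoder feeding two heads, one predicting the relation label and one tagging the words that explain it, trained with a combined loss. The sketch below is a simplified stand-in for the authors' model; the sizes, the LSTM encoder, and the 0.5 loss weight are placeholder assumptions.

import torch
import torch.nn as nn

class RelationWithExplanation(nn.Module):
    def __init__(self, vocab=10000, dim=128, n_rel=5, n_tag=3):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.rel_head = nn.Linear(2 * dim, n_rel)   # sentence-level relation
        self.tag_head = nn.Linear(2 * dim, n_tag)   # per-token explanation tags

    def forward(self, ids):
        h, _ = self.enc(self.emb(ids))              # (B, T, 2*dim)
        return self.rel_head(h.mean(1)), self.tag_head(h)

model = RelationWithExplanation()
ids = torch.randint(0, 10000, (2, 12))
rel_logits, tag_logits = model(ids)
rel_gold = torch.tensor([1, 3])
tag_gold = torch.randint(0, 3, (2, 12))
# Joint loss: relation classification plus explanation tagging
loss = nn.functional.cross_entropy(rel_logits, rel_gold) + \
       0.5 * nn.functional.cross_entropy(tag_logits.reshape(-1, 3), tag_gold.reshape(-1))
loss.backward()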


Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

Jan-Christoph Klie, Ubiquitous Knowledge Processing Lab, Department of Computer Science, Technical University of Darmstadt

Bonnie Webber, School of Informatics, University of Edinburgh

Iryna Gurevych, UKP Lab / TU Darmstadt

Abstract Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that several popular datasets contain a surprising number of annotation errors or inconsistencies. To alleviate this issue, many methods for annotation error detection have been devised over the years. While researchers show that their approaches work well on their newly introduced datasets, they rarely compare their methods to previous work or on the same datasets. This raises strong concerns about methods’ general performance and makes it difficult to assess their strengths and weaknesses. We therefore reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets for text classification as well as token and span labeling. In addition, we define a uniform evaluation setup including a new formalization of the annotation error detection task, evaluation protocol, and general best practices. To facilitate future research and reproducibility, we release our datasets and implementations in an easy-to-use and open source software package.
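One of the simplest method families the paper evaluates flags items on which a cross-validated model confidently disagrees with the gold label. A minimal sketch under that assumption, with toy data and an arbitrary 0.55 confidence threshold:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline

texts = ["great movie", "terrible plot", "loved it", "awful acting",
         "wonderful film", "boring and bad", "fantastic cast", "great movie"]
gold = np.array([1, 0, 1, 0, 1, 0, 1, 0])   # the last label is (deliberately) wrong

pipe = make_pipeline(TfidfVectorizer(), LogisticRegression())
proba = cross_val_predict(pipe, texts, gold, cv=4, method="predict_proba")
pred = proba.argmax(1)
confident = proba.max(1) > 0.55              # confidence gate (assumption)
for i in np.where((pred != gold) & confident)[0]:
    print(f"possible annotation error: {texts[i]!r} gold={gold[i]} pred={pred[i]}")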


Curing the SICK and Other NLI Maladies

Aikaterini-Lida Kalouli, Center for Information and Language Processing (CIS), LMU Munich

Hai Hu, Shanghai Jiao Tong University, School of Foreign Languages

Alexander F. Webb, Indiana University Bloomington, Department of Philosophy

Abstract Against the backdrop of the ever-improving Natural Language Inference (NLI) models, recent efforts have focused on the suitability of the current NLI datasets and on the feasibility of the NLI task as it is currently approached. Many of the recent studies have exposed the inherent human disagreements of the inference task and have proposed a shift from categorical labels to human subjective probability assessments, capturing human uncertainty. In this work, we show how neither the current task formulation nor the proposed uncertainty gradient are entirely suitable for solving the NLI challenges. Instead, we propose an ordered sense space annotation, which distinguishes between logical and common-sense inference. One end of the space captures non-sensical inferences, while the other end represents strictly logical scenarios. In the middle of the space, we find a continuum of common-sense, namely, the subjective and graded opinion of a “person on the street.” To arrive at the proposed annotation scheme, we perform a careful investigation of the SICK corpus and we create a taxonomy of annotation issues and guidelines. We re-annotate the corpus with the proposed annotation scheme, utilizing four symbolic inference systems, and then perform a thorough evaluation of the scheme by fine-tuning and testing commonly used pre-trained language models on the re-annotated SICK within various settings. We also pioneer a crowd annotation of a small portion of the MultiNLI corpus, showcasing that it is possible to adapt our scheme for annotation by non-experts on another NLI corpus. Our work shows the efficiency and benefits of the proposed mechanism and opens the way for a careful NLI task refinement.


Gradual Modifications and Abrupt Replacements: Two Stochastic Lexical Ingredients of Language Evolution

Michele Pasquini, Istituto per le Applicazioni del Calcolo “Mauro Picone”, CNR, Rome, Italy

Maurizio Serva, Dipartimento di Ingegneria e Scienze dell’Informazione e Matematica, Università dell’Aquila, L’Aquila, Italy

Davide Vergni, Istituto per le Applicazioni del Calcolo “Mauro Picone”, CNR, Rome, Italy

Abstract The evolution of the vocabulary of a language is characterized by two different random processes: abrupt lexical replacements, when a complete new word emerges to represent a given concept (which was at the basis of the Swadesh foundation of glottochronology in the 1950s), and gradual lexical modifications that progressively alter words over the centuries, considered here in detail for the first time. The main discriminant between these two processes is their impact on cognacy within a family of languages or dialects, since the former modifies the subsets of cognate terms and the latter does not. The automated cognate detection, which is here performed following a new approach inspired by graph theory, is a key preliminary step that allows us to later measure the effects of the slow modification process. We test our dual approach on the family of Malagasy dialects using a cladistic analysis, which provides strong evidence that lexical replacements and gradual lexical modifications are two random processes that separately drive the evolution of languages.
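The automated cognate detection step can be pictured as a graph problem: link word forms whose similarity passes a threshold and read candidate cognate sets off the connected components. The sketch below uses plain string similarity and an assumed 0.5 threshold; the paper's actual criterion is more refined.

import difflib
import networkx as nx

# reflexes of "night" and "water" in a few languages (illustrative word list)
words = ["nacht", "night", "notte", "noche", "nuit",
         "wasser", "water", "acqua", "agua", "eau"]

def sim(a, b):
    return difflib.SequenceMatcher(None, a, b).ratio()

g = nx.Graph()
g.add_nodes_from(words)
for i, a in enumerate(words):
    for b in words[i + 1:]:
        if sim(a, b) >= 0.5:          # similarity threshold (assumption)
            g.add_edge(a, b)

# each connected component is a candidate cognate set
for component in nx.connected_components(g):
    print(sorted(component))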


Data-driven Cross-lingual Syntax: An Agreement Study with Massively Multilingual Models

Andrea Gregor de Varda, University of Milano-Bicocca

Marco Marelli, University of Milano-Bicocca

Abstract Massively multilingual models such as mBERT and XLM-R are increasingly valued in Natural Language Processing research and applications, due to their ability to tackle the uneven distribution of resources available for different languages. The models’ ability to process multiple languages relying on a shared set of parameters raises the question of whether the grammatical knowledge they extracted during pre-training can be considered as a data-driven cross-lingual grammar. The present work studies the inner workings of mBERT and XLM-R in order to test the cross-lingual consistency of the individual neural units that respond to a precise syntactic phenomenon, that is, number agreement, in five languages (English, German, French, Hebrew, Russian). We found that there is a significant overlap in the latent dimensions that encode agreement across the languages we considered. This overlap is larger (a) for long- vis-à-vis short-distance agreement and (b) when considering XLM-R as compared to mBERT, and peaks in the intermediate layers of the network. We further show that a small set of syntax-sensitive neurons can capture agreement violations across languages; however, their contribution is not decisive in agreement processing.
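The dimension-overlap analysis can be pictured as follows: per language, rank hidden dimensions by how differently they respond to grammatical versus ungrammatical agreement, then intersect the top dimensions across languages. In this toy version the activations are random placeholders; in the paper they come from mBERT/XLM-R run on minimal agreement pairs.

import numpy as np

rng = np.random.default_rng(0)
langs = ["en", "de", "fr", "he", "ru"]
hidden = 768

def top_agreement_dims(n_pairs=500, k=50):
    good = rng.normal(size=(n_pairs, hidden))   # activations, grammatical sentences
    bad = rng.normal(size=(n_pairs, hidden))    # activations, agreement violations
    effect = np.abs(good.mean(0) - bad.mean(0))
    return set(np.argsort(effect)[-k:])         # k most agreement-sensitive dims

tops = {lang: top_agreement_dims() for lang in langs}
shared = set.intersection(*tops.values())
print(f"dimensions shared across all 5 languages: {len(shared)}")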


Onception: Active Learning with Expert Advice for Real World Machine Translation

Vânia Mendonça, INESC-ID, Instituto Superior Técnico

Ricardo Rei, INESC-ID, Instituto Superior Técnico, Unbabel AI

Luísa Coheur, INESC-ID, Instituto Superior Técnico

Abstract Active learning can play an important role in low-resource settings (i.e., where annotated data is scarce), by selecting which instances may be more worthy to annotate. Most active learning approaches for Machine Translation assume the existence of a pool of sentences in a source language, and rely on human annotators to provide translations or post-edits, which can still be costly. In this article, we apply active learning to a real-world human-in-the-loop scenario in which we assume that: (1) the source sentences may not be readily available, but instead arrive in a stream; (2) the automatic translations receive feedback in the form of a rating, instead of a correct/edited translation, since the human-in-the-loop might be a user looking for a translation, but not be able to provide one. To tackle the challenge of deciding whether each incoming source–translation pair is worth querying for human feedback, we resort to a number of stream-based active learning query strategies. Moreover, because we do not know in advance which query strategy will be the most adequate for a certain language pair and set of Machine Translation models, we propose to dynamically combine multiple strategies using prediction with expert advice. Our experiments on different language pairs and feedback settings show that using active learning allows us to converge on the best Machine Translation systems with fewer human interactions. Furthermore, combining multiple strategies using prediction with expert advice outperforms several individual active learning strategies with even fewer interactions, particularly in partial feedback settings.
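Prediction with expert advice, the mechanism used here to combine query strategies, boils down to an exponentially weighted forecaster: strategies that earn reward gain weight and are picked more often. A simplified sketch, with stand-in strategy names and a random reward in place of the human ratings the paper uses:

import math
import random

strategies = ["uncertainty", "diversity", "random"]
weights = {s: 1.0 for s in strategies}
eta = 0.5  # learning rate (assumption)

def pick():
    # sample a strategy proportionally to its current weight
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for s, w in weights.items():
        acc += w
        if r <= acc:
            return s
    return strategies[-1]

for step in range(100):
    s = pick()
    # Stand-in reward in [0, 1]; in the real setting this is user feedback
    # on the translation obtained after querying with strategy s.
    reward = random.random()
    weights[s] *= math.exp(eta * reward)   # good strategies gain weight

print(max(weights, key=weights.get))       # currently best-trusted strategy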


Reflection of Demographic Background on Word Usage

Aparna Garimella, Adobe Research, Adobe Big Data Experience Lab

Carmen Banea, University of Michigan, Computer Science and Engineering

Rada Mihalcea, University of Michigan, Computer Science and Engineering

Abstract The availability of personal writings in electronic format provides researchers in the fields of linguistics, psychology, and computational linguistics with an unprecedented chance to study, on a large scale, the relationship between language use and the demographic background of writers, allowing us to better understand people across different demographics. In this article, we analyze the relation between language and demographics by developing cross-demographic word models to identify words with usage bias, or words that are used in significantly different ways by speakers of different demographics. Focusing on three demographic categories, namely, location, gender, and industry, we identify words with significant usage differences in each category and investigate various approaches of encoding a word’s usage, allowing us to identify language aspects that contribute to the differences. Our word models using topic-based features achieve at least 20% improvement in accuracy over the baseline for all demographic categories, even for scenarios with classification into 15 categories, illustrating the usefulness of topic-based features in identifying word usage differences. Further, we note that for location and industry, topics extracted from immediate context are the best predictors of word usages, hinting at the importance of word meaning and its grammatical function for these demographics, while for gender, topics obtained from longer contexts are better predictors for word usage.


Certified Robustness to Text Adversarial Attacks by Randomized [MASK]

Jiehang Zeng, Fudan University, School of Computer Science

Jianhan Xu, Fudan University, School of Computer Science

Xiaoqing Zheng, Fudan University, School of Computer Science

Abstract Very recently, a few certified defense methods have been developed to provably guarantee the robustness of a text classifier to adversarial synonym substitutions. However, all the existing certified defense methods assume that the defenders have been informed of how the adversaries generate synonyms, which is not a realistic scenario. In this study, we propose a certifiably robust defense method by randomly masking a certain proportion of the words in an input text, in which the above unrealistic assumption is no longer necessary. The proposed method can defend against not only word substitution-based attacks, but also character-level perturbations. We can certify the classifications of over 50% of texts to be robust to any perturbation of five words on AGNEWS, and of two words on the SST2 dataset. The experimental results show that our randomized smoothing method significantly outperforms recently proposed defense methods across multiple datasets under different attack algorithms.
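The core of the defense is randomized smoothing over masked copies of the input: classify many randomly masked versions and take a majority vote, whose margin is what certification then bounds against small word-level perturbations. A minimal illustration, with a trivial keyword classifier standing in for a trained model:

import random
from collections import Counter

def base_classifier(tokens):
    # stand-in for a trained text classifier
    return "positive" if "good" in tokens else "negative"

def smoothed_classify(sentence, mask_rate=0.3, n_samples=200, seed=0):
    rng = random.Random(seed)
    tokens = sentence.split()
    votes = Counter()
    for _ in range(n_samples):
        masked = [t if rng.random() > mask_rate else "[MASK]" for t in tokens]
        votes[base_classifier(masked)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples   # the vote margin drives certification

print(smoothed_classify("the movie was surprisingly good overall"))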


The Analysis of Synonymy and Antonymy in Discourse Relations: An Interpretable Modeling Approach

Asela Reig Alamillo, Universidad Autónoma del Estado de Morelos, Centro de Investigación en Ciencias Cognitivas

David Torres Moreno, Universidad Autónoma del Estado de Morelos, Centro de Investigación en Ciencias Cognitivas

Eliseo Morales González, Universidad Autónoma del Estado de Morelos, Centro de Investigación en Ciencias Cognitivas

Abstract The idea that discourse relations are interpreted both by explicit content and by shared knowledge between producer and interpreter is pervasive in discourse and linguistic studies. How much weight should be ascribed in this process to the lexical semantics of the arguments is, however, uncertain. We propose a computational approach to analyze contrast and concession relations in the PDTB corpus. Our work sheds light on the question of how much lexical relations contribute to the signaling of such explicit and implicit relations, as well as on the contribution of different parts of speech to these semantic relations. This study contributes to bridging the gap between corpus and computational linguistics by proposing transparent and explainable computational models of discourse relations based on the synonymy and antonymy of their arguments.
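One interpretable feature of the kind the paper builds on can be computed directly from WordNet: count antonym pairs across the two arguments of a discourse relation. A small sketch (requires nltk with the WordNet data installed, e.g. via nltk.download("wordnet")):

from nltk.corpus import wordnet as wn

def antonym_pairs(arg1_words, arg2_words):
    pairs = []
    for w1 in arg1_words:
        antonyms = {ant.name() for syn in wn.synsets(w1)
                    for lemma in syn.lemmas()
                    for ant in lemma.antonyms()}
        for w2 in arg2_words:
            if w2 in antonyms:
                pairs.append((w1, w2))
    return pairs

# a contrast relation carrying a lexical antonymy signal:
arg1 = "the service was fast".split()
arg2 = "the kitchen was slow".split()
print(antonym_pairs(arg1, arg2))   # expect [('fast', 'slow')]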


From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation

Marianna Apidianaki, University of Pennsylvania, Department of Computer and Information Science

Abstract Vector-based word representation paradigms situate lexical meaning at different levels of abstraction. Distributional and static embedding models generate a single vector per word type, which is an aggregate across the instances of the word in a corpus. Contextual language models, on the contrary, directly capture the meaning of individual word instances. The goal of this survey is to provide an overview of word meaning representation methods, and of the strategies that have been proposed for improving the quality of the generated vectors. These often involve injecting external knowledge about lexical semantic relationships, or refining the vectors to describe different senses. The survey also covers recent approaches for obtaining word type-level representations from token-level ones, and for combining static and contextualized representations. Special focus is given to probing and interpretation studies aimed at discovering the lexical semantic knowledge that is encoded in contextualized representations. The challenges posed by this exploration have motivated interest in deriving static embeddings from contextualized ones, and in methods aimed at improving the similarity estimates that can be drawn from the space of contextual language models.
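One strategy the survey covers can be shown in a few lines: derive a static, type-level vector for a word by averaging its contextual token embeddings over several occurrences. Note how this aggregates over senses ("bank" below), which is exactly the tension the survey discusses. The model and contexts are illustrative choices.

import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def type_vector(word, contexts):
    vecs = []
    target = tok.convert_tokens_to_ids(word)
    for sent in contexts:
        enc = tok(sent, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]   # (seq, hidden)
        for pos, tid in enumerate(enc["input_ids"][0].tolist()):
            if tid == target:
                vecs.append(hidden[pos])                 # token-level embedding
    return torch.stack(vecs).mean(0)                     # type-level average

v = type_vector("bank", ["she sat by the bank of the river",
                         "he deposited cash at the bank"])
print(v.shape)   # torch.Size([768])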



Comparing Selective Masking Methods for Depression Detection in Social Media

Chanapa Pananookooln, Asian Institute of Technology, Department of Information and Communications Technologies, School of Engineering and Technology

Jakrapop Akaranee, Asian Institute of Technology, Department of Information and Communications Technologies, School of Engineering and Technology

Chaklam Silpasuwanchai, Asian Institute of Technology, Department of Information and Communications Technologies, School of Engineering and Technology


Abstract Identifying those at risk for depression is a crucial issue and social media provides an excellent platform for examining the linguistic patterns of depressed individuals. A significant challenge in depression classification problems is ensuring that prediction models are not overly dependent on topic keywords (i.e., depression keywords) such that they fail to predict when such keywords are unavailable. One promising approach is masking—that is, by selectively masking various words and asking the model to predict the masked words, the model is forced to learn the inherent language patterns of depression. This study evaluates seven masking techniques. Moreover, predicting the masked words during the pre-training or fine-tuning phase was also examined. Last, six class-imbalance ratios were compared to determine the robustness of masked-word selection methods. Key findings demonstrate that selective masking outperforms random masking in terms of F1-score. The most accurate and robust models are identified. Our research also indicates that reconstructing the masked words during the pre-training phase is more advantageous than during the fine-tuning phase. Further implications are discussed. This is the first study to comprehensively compare masked-word selection methods, which has broad implications for the field of depression classification and general NLP. Our code can be found at: https://github.com/chanapapan/Depression-Detection.
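To make "selective masking" concrete, here is one plausible selection heuristic: mask the tokens with the highest TF-IDF weight, so the model must reconstruct contentful words rather than arbitrary ones. This is an illustration only, not necessarily one of the seven techniques the paper compares.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["i feel empty and tired all the time",
        "had a great run this morning with friends"]
vec = TfidfVectorizer()
tfidf = vec.fit_transform(docs)
vocab_index = vec.vocabulary_

def selective_mask(doc, doc_id, k=2):
    tokens = doc.split()
    scores = [tfidf[doc_id, vocab_index[t]] if t in vocab_index else 0.0
              for t in tokens]
    top = set(np.argsort(scores)[-k:])     # positions of the k highest-TF-IDF tokens
    return ["[MASK]" if i in top else t for i, t in enumerate(tokens)]

print(" ".join(selective_mask(docs[0], 0)))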



Cross-Lingual Transfer with Language-Specific Subnetworks for Low-Resource Dependency Parsing


Rochelle Choenni, University of Amsterdam, The Institute for Logic, Language and Computation (ILLC).

Dan Garrette, Google Research.

Ekaterina Shutova, University of Amsterdam, The Institute for Logic, Language and Computation (ILLC).


Abstract Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this article, we propose novel methods for using language-specific subnetworks, which control cross-lingual parameter sharing, to reduce conflicts and increase positive transfer during fine-tuning. We introduce dynamic subnetworks, which are jointly updated with the model, and we combine our methods with meta-learning, an established, but complementary, technique for improving cross-lingual transfer. Finally, we provide extensive analyses of how each of our methods affects the models.
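Language-specific subnetworks can be pictured as binary masks over the model parameters: a batch from language L updates only the parameters inside L's mask, so conflicting updates from other languages are kept out. A schematic sketch with random placeholder masks (the paper learns or derives them, and uses a full encoder rather than one layer):

import torch
import torch.nn as nn

model = nn.Linear(16, 4)                       # stand-in for a big encoder
languages = ["fi", "mt", "wo"]
masks = {lang: {name: (torch.rand_like(p) < 0.5).float()
                for name, p in model.named_parameters()}
         for lang in languages}

def step(batch_x, batch_y, lang, lr=0.1):
    loss = nn.functional.cross_entropy(model(batch_x), batch_y)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            p -= lr * p.grad * masks[lang][name]   # update only L's subnetwork

step(torch.randn(8, 16), torch.randint(0, 4, (8,)), "fi")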


Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model

Chris van der Lee, Tilburg University, Tilburg Center for Cognition and Communication

Thiago Castro Ferreira, Universidade Federal de Minas Gerais, Faculdade de Letras.

Chris Emmery, Tilburg University, Department of Cognitive Science and Artificial Intelligence.

Travis J. Wiltshire, Tilburg University, Department of Cognitive Science and Artificial Intelligence.

Emiel Krahmer, Tilburg University, Tilburg Center for Cognition and Communication.


Abstract This study discusses the effect of semi-supervised learning in combination with pretrained language models for data-to-text generation. It is not known whether semi-supervised learning is still helpful when a large-scale language model is also used. This study aims to answer this question by comparing a data-to-text system only supplemented with a language model, to two data-to-text systems that are additionally enriched by a data augmentation or a pseudo-labeling semi-supervised learning approach. Results show that semi-supervised learning results in higher scores on diversity metrics. In terms of output quality, extending the training set of a data-to-text system with a language model using the pseudo-labeling approach did increase text quality scores, but the data augmentation approach yielded similar scores to the system without training set extension. These results indicate that semi-supervised learning approaches can bolster output quality and diversity, even when a language model is also present.
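The pseudo-labeling strategy compared here follows a standard loop: train on the labeled set, label the unlabeled pool with the model's confident predictions, and retrain on the union. A minimal sketch with a toy classifier; the 0.55 confidence gate is an assumption, and the paper applies the idea to data-to-text generation rather than classification.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

labeled = ["score was three nil", "rain expected tomorrow"]
y = np.array([0, 1])                     # 0 = sports, 1 = weather
unlabeled = ["striker scored twice", "sunny skies ahead", "team won the match"]

pipe = make_pipeline(CountVectorizer(), LogisticRegression())
pipe.fit(labeled, y)

proba = pipe.predict_proba(unlabeled)
confident = proba.max(1) > 0.55          # keep only confident pseudo-labels
pseudo_x = [d for d, keep in zip(unlabeled, confident) if keep]
pseudo_y = proba.argmax(1)[confident]

# retrain on the labeled data plus the confident pseudo-labeled data
pipe.fit(labeled + pseudo_x, np.concatenate([y, pseudo_y]))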


Statistical Methods for Annotation Analysis

Rodrigo Wilkens, Université catholique de Louvain


Abstract A common task in Natural Language Processing (NLP) is the development of datasets/corpora. It is the crucial initial step for initiatives aiming to train and evaluate Machine Learning and AI systems, for example. Often, these resources must be annotated with additional information (e.g., part-of-speech and named entities), which leads to the question of how to obtain these values. One of the most natural and widely used approaches is to ask people (ranging from untrained annotators to domain experts) to identify this information in a given text or document, possibly with more than one annotator per item. However, this is an incomplete solution. It is still necessary to obtain a final annotation per item and to measure agreement among the different annotators (or coders).
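The agreement statistics the reviewed book covers are easy to demo: given two annotators' labels for the same items, Cohen's kappa corrects raw agreement for the agreement expected by chance. The toy labels below are illustrative.

from sklearn.metrics import cohen_kappa_score

annotator_a = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB"]
annotator_b = ["NOUN", "VERB", "ADJ",  "ADJ", "NOUN", "NOUN"]

# 1.0 = perfect agreement, 0.0 = agreement at chance level
print(cohen_kappa_score(annotator_a, annotator_b))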


Grammatical Error Correction: A Survey of the State of the Art


Christopher Bryant, ALTA Institute, Department of Computer Science and Technology, University of Cambridge.

Zheng Yuan, Department of Informatics, King’s College London.

Muhammad Reza Qorib, Department of Informatics, King’s College London.

Hannan Cao, Department of Computer Science, National University of Singapore. 

Hwee Tou Ng, Department of Computer Science, National University of Singapore. 

Ted Briscoe, Department of Computer Science, National University of Singapore. 


Abstract Grammatical Error Correction (GEC) is the task of automatically detecting and correcting errors in text. The task not only includes the correction of grammatical errors, such as missing prepositions and mismatched subject–verb agreement, but also orthographic and semantic errors, such as misspellings and word choice errors, respectively. The field has seen significant progress in the last decade, motivated in part by a series of five shared tasks, which drove the development of rule-based methods, statistical classifiers, statistical machine translation, and finally neural machine translation systems, which represent the current dominant state of the art. In this survey paper, we condense the field into a single article and first outline some of the linguistic challenges of the task, introduce the most popular datasets that are available to researchers (for both English and other languages), and summarize the various methods and techniques that have been developed with a particular focus on artificial error generation. We next describe the many different approaches to evaluation as well as concerns surrounding metric reliability, especially in relation to subjective human judgments, before concluding with an overview of recent progress and suggestions for future work and remaining challenges. We hope that this survey will serve as a comprehensive resource for researchers who are new to the field or who want to be kept apprised of recent developments.
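For context, GEC systems are conventionally scored with F0.5 over extracted edits, weighting precision twice as heavily as recall because a bad correction is worse than a missed one. Below is a minimal sketch of edit extraction with difflib and the F0.5 formula; real evaluations use dedicated tools such as ERRANT with linguistically informed alignment.

import difflib

def edits(src, hyp):
    sm = difflib.SequenceMatcher(None, src.split(), hyp.split())
    return {(op, i1, i2, " ".join(hyp.split()[j1:j2]))
            for op, i1, i2, j1, j2 in sm.get_opcodes() if op != "equal"}

src = "he go to school yesterday"
hyp = "he went to school yesterday"       # system output
ref = "he went to the school yesterday"   # gold correction

hyp_edits, ref_edits = edits(src, hyp), edits(src, ref)
tp = len(hyp_edits & ref_edits)
p = tp / len(hyp_edits) if hyp_edits else 1.0
r = tp / len(ref_edits) if ref_edits else 1.0
# F0.5 = (1 + 0.5^2) * P * R / (0.5^2 * P + R)
f05 = (1.25 * p * r) / (0.25 * p + r) if (p + r) else 0.0
print(f"P={p:.2f} R={r:.2f} F0.5={f05:.2f}")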


Machine Learning for Ancient Languages: A Survey 

Thea Sommerschield, Ca’ Foscari University of Venice, Department of Humanities. 

Yannis Assael, Google DeepMind

John Pavlopoulos, Athens University of Economics and Business

Vanessa Stefanak, Google DeepMind

Andrew Senior, Google DeepMind

Chris Dyer, Google DeepMind

John Bodel, Brown University Classics Faculty

Jonathan Prag, University of Oxford, Faculty of Classics

Ion Androutsopoulos, Athens University of Economics and Business, Department of Informatics

Nando de Freitas, Google DeepMind


Abstract Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions, to determining the authorship of works of literature. Technological aids have long supported the study of ancient texts, but in recent years advances in artificial intelligence and machine learning have enabled analyses on a scale and in a detail that are reshaping the field of humanities, similarly to how microscopes and telescopes have contributed to the realm of science. This article aims to provide a comprehensive survey of published research using machine learning for the study of ancient texts written in any language, script, and medium, spanning over three and a half millennia of civilizations around the ancient world. To analyze the relevant literature, we introduce a taxonomy of tasks inspired by the steps involved in the study of ancient documents: digitization, restoration, attribution, linguistic analysis, textual criticism, translation, and decipherment. This work offers three major contributions: first, mapping the interdisciplinary field carved out by the synergy between the humanities and machine learning; second, highlighting how active collaboration between specialists from both fields is key to producing impactful and compelling scholarship; third, highlighting promising directions for future work in this field. Thus, this work promotes and supports the continued collaborative impetus between the humanities and machine learning.


Measuring Attribution in Natural Language Generation Models

Hannah Rashkin, Google DeepMind

Vitaly Nikolaev, Google DeepMind

Matthew Lamm, Google DeepMind

Lora Aroyo, Google Research

Michael Collins, Google DeepMind

Dipanjan Das, Google DeepMind

Slav Petrov, Google DeepMind

Gaurav Singh Tomar, Google DeepMind

Iulia Turc, Storia AI

David Reitter, Google DeepMind


Abstract Large neural models have brought a new challenge to natural language generation (NLG): It has become imperative to ensure the safety and reliability of the output of models that generate freely. To this end, we present an evaluation framework, Attributable to Identified Sources (AIS), stipulating that NLG output pertaining to the external world is to be verified against an independent, provided source. We define AIS and a two-stage annotation pipeline for allowing annotators to evaluate model output according to annotation guidelines. We successfully validate this approach on generation datasets spanning three tasks (two conversational QA datasets, a summarization dataset, and a table-to-text dataset). We provide full annotation guidelines in the appendices and publicly release the annotated data at https://github.com/google-research-datasets/AIS.


Capturing Fine-Grained Regional Differences in Language Use through Voting Precinct Embeddings

Alex Rosenfeld, Leidos, Innovations Center.

Lars Hinrichs, The University of Texas at Austin, Department of English


Abstract Linguistic variation across a region of interest can be captured by partitioning the region into areas and using social media data to train embeddings that represent language use in those areas. Recent work has focused on larger areas, such as cities or counties, to ensure that enough social media data is available in each area, but larger areas have a limited ability to find fine-grained distinctions, such as intracity differences in language use. We demonstrate that it is possible to embed smaller areas, which can provide higher resolution analyses of language variation. We embed voting precincts, which are tiny, evenly sized political divisions for the administration of elections. The issue with modeling language use in small areas is that the data becomes incredibly sparse, with many areas having scant social media data. We propose a novel embedding approach that alternates training with smoothing, which mitigates these sparsity issues. We focus on linguistic variation across Texas as it is relatively understudied. We develop two novel quantitative evaluations that measure how well the embeddings can be used to capture linguistic variation. The first evaluation measures how well a model can map a dialect given terms specific to that dialect. The second evaluation measures how well a model can map preference of lexical variants. These evaluations show how embedding models could be used directly by sociolinguists and measure how much sociolinguistic information is contained within the embeddings. We complement this second evaluation with a methodology for using embeddings as a kind of genetic code where we identify “genes” that correspond to a sociological variable and connect those “genes” to a linguistic phenomenon thereby connecting sociological phenomena to linguistic ones. Finally, we explore approaches for inferring isoglosses using embeddings.
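The alternating train-and-smooth procedure can be pictured in miniature: after each training pass, pull every area's embedding toward the average of its geographic neighbors, so that data-sparse areas borrow signal from nearby ones. Everything below (the data, the neighbor graph, the 0.3 smoothing weight, and the stand-in gradient step) is a schematic assumption, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
n_areas, dim = 5, 8
emb = rng.normal(size=(n_areas, dim))
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # a chain of areas

def train_step(emb):
    # stand-in for a real gradient update from each area's social media text
    return emb + 0.01 * rng.normal(size=emb.shape)

for epoch in range(10):
    emb = train_step(emb)
    # smoothing pass: average each area with its geographic neighbors
    smoothed = np.stack([emb[neighbors[i]].mean(0) for i in range(n_areas)])
    emb = 0.7 * emb + 0.3 * smoothed     # smoothing weight is an assumption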


Generation and Polynomial Parsing of Graph Languages with Non-Structural Reentrancies 

Johanna Björklund, Department of Computing Science, Umeå University

Frank Drewes, Department of Computing Science, Umeå University

Anna Jonsson, Department of Computing Science, Umeå University


Abstract Graph-based semantic representations are popular in natural language processing, where it is often convenient to model linguistic concepts as nodes and relations as edges between them. Several attempts have been made to find a generative device that is sufficiently powerful to describe languages of semantic graphs, while at the same allowing efficient parsing. We contribute to this line of work by introducing graph extension grammar, a variant of the contextual hyperedge replacement grammars proposed by Hoffmann et al. Contextual hyperedge replacement can generate graphs with non-structural reentrancies, a type of node-sharing that is very common in formalisms such as abstract meaning representation, but that context-free types of graph grammars cannot model. To provide our formalism with a way to place reentrancies in a linguistically meaningful way, we endow rules with logical formulas in counting monadic second-order logic. We then present a parsing algorithm and show as our main result that this algorithm runs in polynomial time on graph languages generated by a subclass of our grammars, the so-called local graph extension grammars.


Languages Through the Looking Glass of BPE Compression 

Ximena Gutierrez-Vasques, University of Zürich, URPP Language and Space

Christian Bentz, University of Tübingen, Department of General Linguistics

Tanja Samardžić, University of Zürich, URPP Language and Space




Language Embeddings Sometimes Contain Typological Generalizations

Robert Östling, Stockholm University, Department of Linguistics.

Murathan Kurfalı, Stockholm University, Department of Psychology


Abstract To what extent can neural network models learn generalizations about language structure, and how do we find out what they have learned? We explore these questions by training neural models for a range of natural language processing tasks on a massively multilingual dataset of Bible translations in 1,295 languages. The learned language representations are then compared to existing typological databases as well as to a novel set of quantitative syntactic and morphological features obtained through annotation projection. We conclude that some generalizations are surprisingly close to traditional features from linguistic typology, but that most of our models, as well as those of previous work, do not appear to have made linguistically meaningful generalizations. Careful attention to details in the evaluation turns out to be essential to avoid false positives. Furthermore, to encourage continued work in this field, we release several resources covering most or all of the languages in our data: (1) multiple sets of language representations, (2) multilingual word embeddings, (3) projected and predicted syntactic and morphological features, (4) software to provide linguistically sound evaluations of language representations.
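The paper's evaluation pattern can be demonstrated in a few lines: try to predict a typological feature from learned language vectors with a cross-validated classifier and compare against chance. The vectors and labels below are random placeholders, which also illustrates the paper's warning that a careless setup (for example, one inflated by related languages sharing folds) can produce false positives.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
lang_vectors = rng.normal(size=(200, 64))   # one vector per language (placeholder)
word_order = rng.integers(0, 2, size=200)   # WALS-style binary feature (placeholder)

scores = cross_val_score(LogisticRegression(max_iter=1000),
                         lang_vectors, word_order, cv=5)
print(scores.mean())   # ~0.5 here, since these vectors carry no signal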


About the Journal

Computational Linguistics is the longest-running publication devoted exclusively to the computational and mathematical properties of language and the design and analysis of natural language processing systems. This highly regarded quarterly offers university and industry linguists, computational linguists, artificial intelligence and machine learning investigators, cognitive scientists, speech specialists, and philosophers the latest information about the computational aspects of all the facets of research on language.




Official website:

https://direct.mit.edu/coli

Source: the official website of COMPUTATIONAL LINGUISTICS






Welcome to join the "语言学心得交流分享群" (discussion and sharing group) and the "语言学考博/考研/保研交流群" (PhD and graduate admissions group). Please add "心得君" on WeChat and note "institution + research area/major" when requesting to join.

Editor: 有常

Reviewer: 心得小蔓

For reprints and collaboration, please contact "心得君" on WeChat: xindejun_yyxxd
