We introduce the task of question-in-context rewriting: given the context of a conversation's history, rewrite a context-dependent question into a self-contained question that has the same answer.
Hybrid addressing of M: both a word's semantics and its position in X are encoded into the hidden states of M by a properly trained encoder RNN. In generate mode, the attentive read dominates; it is driven mainly by semantic information and the language model, so reading over M jumps around positions freely. In copy mode, the selective read over M is usually guided by positional information, so it performs a rigid move rather than jumping, typically covering several consecutive words, including unknown words. The positional information is updated as follows:
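The update equation itself is not reproduced in these notes; for reference, CopyNet's selective read (Gu et al., 2016), written from memory and so possibly differing in detail from the paper, is roughly:

$$
\zeta(y_{t-1}) = \sum_{\tau=1}^{T_S} \rho_{t\tau}\,\mathbf{h}_\tau, \qquad
\rho_{t\tau} =
\begin{cases}
\dfrac{1}{K}\, p(x_\tau, \mathsf{c} \mid s_{t-1}, M), & x_\tau = y_{t-1} \\
0, & \text{otherwise}
\end{cases}
$$

where $K$ normalizes over all source positions whose word equals $y_{t-1}$. The decoder state is then updated with $[e(y_{t-1}); \zeta(y_{t-1})]$ rather than the word embedding alone, which is how location information is carried forward step by step.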
How to fix it: Easier Copying with Pointer-Generator Networks. To address Problem 1 (inaccurate copying), the pointer-generator network is proposed: a hybrid network that can choose to copy words from the source via pointing, while retaining the ability to generate words from a fixed vocabulary. Concretely, the model computes an attention distribution, a vocabulary distribution, and a generation probability:
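The equations are not reproduced in these notes; the standard formulation from See et al. (2017), written from memory, is roughly:

$$
e_i^t = v^\top \tanh(W_h h_i + W_s s_t + b_{attn}), \qquad a^t = \mathrm{softmax}(e^t)
$$
$$
h_t^* = \sum_i a_i^t h_i, \qquad
P_{vocab} = \mathrm{softmax}\big(V'(V[s_t; h_t^*] + b) + b'\big)
$$
$$
p_{gen} = \sigma\big(w_{h^*}^\top h_t^* + w_s^\top s_t + w_x^\top x_t + b_{ptr}\big)
$$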
Probability of generating word w = probability of generating it from the vocabulary + probability of copying it from some position in the source. Eliminating Repetition with Coverage: to address Problem 2 (repetitive summaries), a coverage mechanism accumulates the attention distributions to track what has already been covered, and adds a penalty for attending to the same positions repeatedly.
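In the same notation, the final distribution and the coverage terms from See et al. (2017) are roughly:

$$
P(w) = p_{gen}\, P_{vocab}(w) + (1 - p_{gen}) \sum_{i:\, w_i = w} a_i^t
$$
$$
c^t = \sum_{t'=0}^{t-1} a^{t'}, \qquad \mathrm{covloss}_t = \sum_i \min(a_i^t, c_i^t)
$$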
Because of the max segment length, documents that exceed it are handled with one of two variants:
Independent: simply truncate at the max segment length.
Overlap: split the document into multiple segments with a sliding window, then merge them with a function f:
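A minimal sketch of the Overlap variant (my own illustrative code, assuming token-level windows with a stride of half the maximum length; the merge function f, which combines the overlapping representations, is only stubbed out here):

```python
def overlap_segments(tokens, max_len=512, stride=256):
    """Split a long token sequence into overlapping windows (sliding window)."""
    segments = []
    for start in range(0, max(len(tokens) - stride, 1), stride):
        segments.append(tokens[start:start + max_len])
    return segments

def merge_with_f(segment_reprs, stride=256):
    """Stub for f: combine the representations of overlapping positions,
    e.g. by averaging the two views of each overlapped token."""
    ...
```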
Context layer : GloVe + BiLSTM (c and x are jointly encoded)
Encoding layer (concentrates on local rather than global information) : concatenate 1. element-wise similarity (Ele Sim.), 2. cosine similarity (Cos Sim.), and 3. learned bi-linear similarity (Bi-Linear Sim.) -> D-dimensional feature vector (see the sketch after this list)
Segmentation layer (to capture global information) : Conv + pool + skip connection + FFN
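A minimal sketch of how the encoding layer's word-pair feature map could be built (illustrative code, not the authors' implementation; the exact channel layout and the bi-linear parameterization are assumptions):

```python
import torch
import torch.nn.functional as F

def pairwise_feature_map(ctx, cur, W):
    """Build a (Lc, Lx, D+2) feature map over (context word, current word) pairs by
    concatenating element-wise, cosine, and learned bi-linear similarities."""
    # ctx: (Lc, D) context word encodings; cur: (Lx, D) current-utterance encodings
    Lc, D = ctx.shape
    Lx = cur.shape[0]
    c = ctx.unsqueeze(1).expand(Lc, Lx, D)                    # (Lc, Lx, D)
    x = cur.unsqueeze(0).expand(Lc, Lx, D)                    # (Lc, Lx, D)
    ele = c * x                                               # element-wise similarity
    cos = F.cosine_similarity(c, x, dim=-1).unsqueeze(-1)     # (Lc, Lx, 1)
    bil = ((c @ W) * x).sum(-1, keepdim=True)                 # bi-linear similarity c_i^T W x_j
    return torch.cat([ele, cos, bil], dim=-1)

# usage: W = torch.nn.Parameter(torch.randn(D, D)); fmap = pairwise_feature_map(ctx, cur, W)
```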
The architecture above can also obtain its distributed representations from a pre-trained model (e.g. BERT); RUN + BERT shows more convincing experimental results than RUN alone. Before generating the rewritten utterance, a standardization step based on Hoshen–Kopelman ensures that every region in Y is rectangular, and connection words are added to keep the resulting sentence fluent.
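A minimal sketch of that standardization step as I understand it (illustrative only; connected regions in the predicted edit-type matrix Y are found with scipy's connected-component labeling, which plays the role of Hoshen–Kopelman, and each region is replaced by its bounding rectangle):

```python
import numpy as np
from scipy.ndimage import label

def rectangularize(Y):
    """Replace every connected region of a given edit type in the matrix Y
    with its bounding rectangle, so each edit region becomes a clean rectangle."""
    out = np.zeros_like(Y)
    for edit_type in np.unique(Y[Y > 0]):
        mask = (Y == edit_type)
        labeled, n = label(mask)          # connected-component labeling (Hoshen-Kopelman style)
        for comp in range(1, n + 1):
            rows, cols = np.where(labeled == comp)
            out[rows.min():rows.max() + 1, cols.min():cols.max() + 1] = edit_type
    return out
```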
The baseline model is the seq2seq neural network model from Zheng et al. (2018).
AAAI 2020 [18]
Official blog:
https://xiyuanzh.github.io/projects/AAAI2020.html
Code and Data: https://gitlab.com/ucdavisnlp/filling-conversation-ellipsis Ellipsis is extremely common in everyday conversation, and its presence makes downstream tasks such as Dialog Act Prediction and Semantic Role Labeling harder. Automatic completion is commonly used to fill the ellipsis, but automatically completed utterances may repeat or miss some words, and may even produce meaningless sentences. This paper proposes a hybrid model, Hybrid-ELlipsis-CoMPlete (Hybrid-EL-CMP), which jointly considers the original utterance with ellipsis and the automatically completed utterance to improve language understanding.
This is Tencent AI's third paper in this line of work. It points out that the decoders in the prior work above rely mainly on global attention, attending to all words in the dialogue context. Without an explicitly introduced prior focus, this attention mechanism may be drawn to irrelevant words. A natural idea follows: use Semantic Role Labeling (SRL) to identify the predicate-argument structure of a sentence, capturing "who did what to whom" semantics as a prior to assist the decoder. The example in the figure below illustrates the role that semantic components play in an utterance: "Cantonese" and "Mandarin" are recognized as two different arguments and can receive more attention under the guidance of SRL.
Tencent AI's annotation team annotated 3,000 dialogue sessions from the Duconv dataset (Wu et al., 2019), covering 33,673 predicates and 27,198 utterances. An additional BERT-based SRL model (Shi and Lin, 2019) is used as the SRL parser to identify these predicate-argument structures, after being pre-trained on CoNLL 2012 (117,089 examples).
Input Representation: the classic BERT trio of token embeddings, plus segment type embeddings and position embeddings used to distinguish speakers. The input is the concatenation of the PA structures, the dialogue context, and the rewritten utterance. Each PA structure, essentially a tree rooted at the predicate with semantic arguments as leaves, is linearized into a triple, and the triples are concatenated in random order. Since concatenating in random order may introduce noise for the sequence encoder, a bidirectional attention mask is applied over the PA sequence: tokens in different PA triples cannot attend to each other, and each PA triple uses its own independent position embeddings. Experiments on the REWRITE dataset (Su et al., 2019) compare the results; BERT here refers to Chinese RoBERTa. The reproduced results are not fully consistent with Su et al. (2019).
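A minimal sketch of such a PA-aware attention mask (illustrative, assuming each token carries a PA-triple id, with id 0 for ordinary dialogue-context and rewrite tokens; the exact masking convention in the paper may differ):

```python
import torch

def pa_attention_mask(pa_ids):
    """Build an (L, L) boolean attention mask: True = may attend.
    Tokens from two *different* PA triples cannot attend to each other;
    everything else (within a triple, or involving a non-PA token) is allowed."""
    ids = torch.as_tensor(pa_ids)                       # (L,) triple id per token, 0 = not in a PA triple
    is_pa = ids > 0
    same_triple = ids.unsqueeze(0) == ids.unsqueeze(1)  # (L, L)
    both_pa = is_pa.unsqueeze(0) & is_pa.unsqueeze(1)
    return ~(both_pa & ~same_triple)

# usage: mask = pa_attention_mask([1, 1, 2, 2, 0, 0, 0])
# tokens of triple 1 cannot attend to tokens of triple 2, and vice versa
```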
Gold-SRL: using gold SRL labels as input; this shows that the quality of the SRL model strongly affects the rewriter.
Results on the Restoration-200K dataset (Pan et al., 2019 [7]) are as follows; the model does not even outperform the PAC model proposed by Pan et al. (2019) [7].
We find that the SRL information mainly improves performance on dialogues that require information completion. A piece of omitted information is considered properly completed if the rewritten utterance recovers the omitted words. We find the SRL parser naturally offers important guidance for the selection of omitted words.
EACL 2021 [21]
The paper casts both ellipsis resolution and coreference resolution as QA problems: each query is converted into the following triple:
context: the entire document
question: the sentence in which the ellipsis/mention is present
answer: the antecedent/entity
For coreference resolution, if a sentence contains n mentions, it is converted into n questions, with the mention being resolved wrapped in a special marker. P.S. Ellipsis is divided into sluice ellipsis and verb phrase ellipsis, each with its own public datasets.
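A minimal sketch of how one such QA example could be assembled (illustrative only; the field names, the `[M]` markers, and the SQuAD-style dict layout are my assumptions, not the paper's exact format):

```python
def make_coref_example(document, sentence, mention, antecedent):
    """Turn one mention into a SQuAD-style QA example:
    context = whole document, question = sentence with the mention marked,
    answer = the antecedent span in the document."""
    question = sentence.replace(mention, f"[M] {mention} [M]", 1)  # mark the mention being resolved
    start = document.index(antecedent)                             # assumes the antecedent occurs in the document
    return {
        "context": document,
        "question": question,
        "answers": [{"text": antecedent, "answer_start": start}],
    }
```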
QA Architectures: three different encoder modules are chosen:
DrQA (Chen et al., 2017), LSTM
QANet (Yu et al., 2018), CNN
BERT (Devlin et al., 2018), pre-trained Transformer
[1] [EMNLP 2019] Can You Unpack That? Learning to Rewrite Questions-in-Context (Elgohary et al., 2019) https://aclanthology.org/D19-1605/
[2] [COLING 2016] Non-sentential Question Resolution using Sequence to Sequence Learning (Kumar and Joshi, 2016) https://aclanthology.org/C16-1190/
[3] [SIGIR 2017] Incomplete Follow-up Question Resolution using Retrieval based Sequence to Sequence Learning (Kumar and Joshi, 2017) https://dl.acm.org/doi/10.1145/3077136.3080801
[4] [ACL 2016] Incorporating Copying Mechanism in Sequence-to-Sequence Learning (Gu et al., 2016) https://arxiv.org/abs/1603.06393
[5] [ACL 2017] Get To The Point: Summarization with Pointer-Generator Networks (See et al., 2017) https://arxiv.org/abs/1704.04368
[6] [ACL 2019] Improving Multi-turn Dialogue Modelling with Utterance ReWriter (Su et al., 2019) https://aclanthology.org/P19-1003/
[7] [EMNLP 2019] Improving Open-Domain Dialogue Systems via Multi-Turn Incomplete Utterance Restoration (Pan et al., 2019) https://aclanthology.org/D19-1191/
[8] [AAAI 2019] FANDA: A Novel Approach to Perform Follow-Up Query Analysis (Liu et al., 2019a) https://arxiv.org/abs/1901.08259
[9] [EMNLP 2019] A Split-and-Recombine Approach for Follow-up Query Analysis (Liu et al., 2019b) https://arxiv.org/abs/1909.08905
[10] [NAACL 2018 Short] Higher-order coreference resolution with coarse-tofine inference (Lee et al., 2018) https://aclanthology.org/N18-2108/
[11] [EMNLP 2019 Short] BERT for Coreference Resolution: Baselines and Analysis (Joshi et al., 2019) https://aclanthology.org/D19-1588/
[12] [EMNLP 2020] Incomplete Utterance Rewriting as Semantic Segmentation (Liu et al., 2020) https://aclanthology.org/2020.emnlp-main.227/
[13] [EMNLP 2019] Unsupervised Context Rewriting for Open Domain Conversation (Zhou et al., 2019) https://arxiv.org/abs/1910.08282
[14] [ACL 2020] CorefQA: Coreference Resolution as Query-based Span Prediction (Wu et al., 2020) https://aclanthology.org/2020.acl-main.622/
[15] [NAACL 2019] Scaling Multi-Domain Dialogue State Tracking via Query Reformulation (Rastogi et al., 2019) https://aclanthology.org/N19-2013/
[16] [EMNLP 2019] GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialogue (Quan et al., 2019) https://arxiv.org/abs/1909.12086
[17] [ACL 2018] Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures (Lei et al., 2018) https://aclanthology.org/P18-1133/
[18] [AAAI 2020] Filling Conversation Ellipsis for Better Social Dialog Understanding (Zhang et al., 2020) https://arxiv.org/abs/1911.10776
[19] [ACL 2020] Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation (Song et al., 2020) https://aclanthology.org/2020.acl-main.516/
[20] [EMNLP 2020] Semantic Role Labeling Guided Multi-turn Dialogue ReWriter (Xu et al., 2020) https://arxiv.org/abs/2010.01417
[21] [EACL 2021] Ellipsis Resolution as Question Answering: An Evaluation (Aralikatte et al., 2021) https://aclanthology.org/2021.eacl-main.68/
[22] [WSDM 2021] Question Rewriting for Conversational Question Answering (Vakulenko et al., 2021) https://arxiv.org/abs/2004.14652
[23] [ArXiv 2020] Robust Dialogue Utterance Rewriting as Sequence Tagging (Hao et al., 2020) https://arxiv.org/abs/2012.14535
[24] [ArXiv 2020] MLR: A Two-stage Conversational Query Rewriting Model with Multi-task Learning (Song et al., 2020) https://arxiv.org/abs/2004.05812