Sentence embedding learning plays an important role in natural language processing (NLP); the success of many NLP tasks depends on training high-quality sentence representation vectors. In particular, for tasks such as semantic textual similarity (STS) and dense text retrieval, a model encodes each of the two sentences into an embedding and measures their semantic relatedness by the similarity of these embeddings in the representation space, which in turn determines the matching score.
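As a minimal sketch of this matching step (assuming some encoder has already mapped each sentence to a fixed-size vector; the 768-dimensional placeholder vectors below stand in for real encoder outputs), similarity in the representation space is typically computed as cosine similarity:

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two sentence embeddings."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Placeholder embeddings standing in for encoder outputs such as
# encode("A man is playing a guitar.") and encode("Someone is playing an instrument.").
emb_a = np.random.randn(768)
emb_b = np.random.randn(768)

# A higher score is read as the two sentences being more semantically related.
print(cosine_similarity(emb_a, emb_b))
```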
Supervised sentence representation learning methods: Early work found that the natural language inference (NLI) task is highly beneficial for semantic matching; these models trained a BiLSTM encoder on the combination of two NLI datasets, SNLI and MNLI. The Universal Sentence Encoder (USE) adopted a Transformer-based architecture and augmented its unsupervised training with SNLI. SBERT went a step further, encoding both sentences with a shared pretrained BERT encoder and fine-tuning it on NLI data.
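The SBERT-style setup described above can be sketched roughly as follows. This is an illustrative reconstruction, assuming bert-base-uncased as the shared encoder, mean pooling over token states, and the (u, v, |u - v|) three-way NLI classification head from the SBERT paper; all names and hyperparameters are chosen for readability rather than taken from the original code:

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Shared (siamese) pretrained encoder; both sentences go through the same weights.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
classifier = nn.Linear(3 * encoder.config.hidden_size, 3)  # entailment / neutral / contradiction

def encode(sentences):
    """Encode a batch of sentences into fixed-size vectors via mean pooling."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (B, L, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, L, 1)
    return (hidden * mask).sum(1) / mask.sum(1)           # average over non-padding tokens

# One fine-tuning step on an NLI pair (label 0 = entailment in this toy example).
u = encode(["A man is playing a guitar."])
v = encode(["Someone is playing an instrument."])
features = torch.cat([u, v, torch.abs(u - v)], dim=-1)    # (u, v, |u - v|) features
loss = nn.functional.cross_entropy(classifier(features), torch.tensor([0]))
loss.backward()
# At inference time only encode() is kept, and sentence pairs are scored by cosine similarity.
```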
[1] Reimers, Nils, and Iryna Gurevych. "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks." Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
[2] Li, Bohan, et al. "On the Sentence Embeddings from Pre-trained Language Models." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
[3] Gao, Jun, et al. "Representation Degeneration Problem in Training Natural Language Generation Models." International Conference on Learning Representations. 2018.
[4] Wang, Lingxiao, et al. "Improving Neural Language Generation with Spectrum Control." International Conference on Learning Representations. 2019.
[5] Conneau, Alexis, et al. "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.
[6] Cer, Daniel, et al. "Universal Sentence Encoder for English." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2018.
[7] Wang, Shuohang, et al. "Cross-Thought for Sentence Encoder Pre-training." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
[8] Yang, Ziyi, et al. "Universal Sentence Representation Learning with Conditional Masked Language Model." arXiv preprint arXiv:2012.14388 (2020).
[9] Lee, Haejun, et al. "SLM: Learning a Discourse Language Representation with Sentence Unshuffling." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
[10] Su, Jianlin, et al. "Whitening Sentence Representations for Better Semantics and Faster Retrieval." arXiv preprint arXiv:2103.15316 (2021).
[11] Gao, Tianyu, Xingcheng Yao, and Danqi Chen. "SimCSE: Simple Contrastive Learning of Sentence Embeddings." arXiv preprint arXiv:2104.08821 (2021).
[12] Wu, Xing, et al. "Conditional BERT Contextual Augmentation." International Conference on Computational Science. Springer, Cham, 2019.
[13] Zhou, Wangchunshu, et al. "BERT-based Lexical Substitution." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
[14] He, Kaiming, et al. "Momentum Contrast for Unsupervised Visual Representation Learning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
[15] Chen, Ting, et al. "A Simple Framework for Contrastive Learning of Visual Representations." International Conference on Machine Learning. PMLR, 2020.
[16] Zhang, Yan, et al. "An Unsupervised Sentence Embedding Method by Mutual Information Maximization." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
[17] Fang, Hongchao, et al. "CERT: Contrastive Self-supervised Learning for Language Understanding." arXiv preprint arXiv:2005.12766 (2020).
[18] Carlsson, Fredrik, et al. "Semantic Re-tuning with Contrastive Tension." International Conference on Learning Representations. 2021.
[19] Giorgi, John M., et al. "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations." arXiv preprint arXiv:2006.03659 (2020).
[20] Wu, Zhuofeng, et al. "CLEAR: Contrastive Learning for Sentence Representation." arXiv preprint arXiv:2012.15466 (2020).