NeurIPS 2022 | NLP-Related Papers Organized by Topic
This post selects more than 200 papers related to natural language processing from the 2000+ papers accepted at NeurIPS 2022 and organizes them by research topic for reference. The article is also published on the AI Box Zhihu column (search for the "AI Box" column on Zhihu); comments and discussion under the column article are welcome!
Introduction:
NeurIPS 2022 is a CCF class-A conference and one of the top international conferences in artificial intelligence. The 36th Conference on Neural Information Processing Systems will be held this year from November 28 to December 9. The official list of accepted papers is available at https://nips.cc/Conferences/2022/Schedule?type=Poster.
From the 2000+ accepted papers, this post selects more than 200 NLP-related papers and organizes them by research topic for reference. The paper list is also kept up to date on GitHub; you are welcome to follow and star it: github.com/RUCAIBox/Top-conference-paper-list.
Contents:
Model 【模型】
Interpretability, Analysis and Evaluation 【可解释性、分析、评测】
Robustness and Safety 【鲁棒性与安全】
Knowledge and Reasoning 【知识与推理】
Information Extraction 【信息抽取】
Information Retrieval 【信息检索】
Text Classification 【文本分类】
Text Generation 【文本生成】
Machine Translation and Multilinguality 【机器翻译与多语言】
Multimodality 【多模态】
Special Tasks 【特殊任务】
01
Model
【模型】
1. Model Design 【模型设计】
Recurrent Memory Transformer
Jump Self-attention: Capturing High-order Statistics in Transformers
Block-Recurrent Transformers
Staircase Attention for Recurrent Processing of Sequences
Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Mixture-of-Experts with Expert Choice Routing
On the Representation Collapse of Sparse Mixture of Experts
Improving Transformer with an Admixture of Attention Heads
Your Transformer May Not be as Powerful as You Expect
Confident Adaptive Language Modeling
Decoupled Context Processing for Context Augmented Language Modeling
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Revisiting Neural Scaling Laws in Language and Vision
Learning to Scaffold: Optimizing Model Explanations for Teaching
2. Model Compression 【模型压缩】
Information-Theoretic Generative Model Compression with Variational Energy-based Model
Towards Efficient Post-training Quantization of Pre-trained Language Models
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Deep Compression of Pre-trained Transformer Models
LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models
GPT3.int8(): 8-bit Matrix Multiplication for Transformers at Scale
MorphTE: Injecting Morphology in Tensorized Embeddings
Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
A Fast Post-Training Pruning Framework for Transformers
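3. Model Training 【模型训练】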
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
A Data-Augmentation Is Worth A Thousand Samples
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Training and Inference on Any-Order Autoregressive Models the Right Way
Decentralized Training of Foundation Models in Heterogeneous Environments
4. Model Usage 【模型使用】
The Unreliability of Explanations in Few-Shot In-Context Learning
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Training language models to follow instructions with human feedback
LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
How to talk to your model: Instructions, descriptions, and learning
Data Distributional Properties Drive Emergent In-Context Learning in Transformers
Sparse Structure Search for Parameter-Efficient Tuning
Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
LIFT: Language-Interfaced FineTuning for Non-language Machine Learning Tasks
Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts
02
Interpretability, Analysis and Evaluation
【可解释性、分析、评测】
CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
Rule-Based but Flexible? Evaluating and Improving Language Models as Accounts of Human Moral Judgment
Understanding the Failure of Batch Normalization for Transformers in NLP
AttCAT: Explaining Transformers via Attentive Class Activation Tokens
An empirical analysis of compute-optimal large language model training
Why GANs are overkill for NLP
Exploring Length Generalization in Large Language Models
Capturing Failures of Large Language Models via Human Cognitive Biases
Pre-Trained Model Reusability Evaluation for Small-Data Transfer Learning
First is Better Than Last for Language Data Influence
What are the best Systems? New Perspectives on NLP Benchmarking
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
FETA: Towards Specializing Foundational Models for Expert Task Applications
This is the way - lessons learned from designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
Rethinking Knowledge Graph Evaluation Under the Open-World Assumption
A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction
03
Robustness and Safety
【鲁棒性与安全】
Active Learning Helps Pretrained Models Learn the Intended Task
Improving Certified Robustness via Statistical Learning with Logical Reasoning
Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models
BadPrompt: Backdoor Attacks on Continuous Prompts
A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
AD-DROP: Attribution Driven Dropout for Robust Language Model Finetuning
Large (robust) models from computational constraints
Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks
Recovering Private Text in Federated Learning of Language Models
LAMP: Extracting Text from Gradients with Language Model Priors
SeqPATE: Differentially Private Text Generation via Knowledge Distillation
Differentially Private Model Compression
Federated Learning from Pre-Trained Models: A Contrastive Learning Approach
04
Knowledge and Reasoning
【知识与推理】
Learning to Sample and Aggregate: Few-shot Reasoning over Temporal Knowledge Graph
Retaining Knowledge for Learning with Dynamic Definition
Shadow Knowledge Distillation: Bridging Offline and Online Knowledge Transfer
What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective
Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures
Roadblocks for Temporarily Disabling Shortcuts and Learning New Knowledge
PALBERT: Teaching ALBERT to Ponder
Locating and Editing Factual Associations in GPT
OTKGE: Multi-modal Knowledge Graph Embeddings via Optimal Transport
Large Language Models are Zero-Shot Reasoners
STaR: Bootstrapping Reasoning With Reasoning
Chain of Thought Prompting Elicits Reasoning in Large Language Models
ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Inductive Logical Query Answering in Knowledge Graphs
Formalizing Coherence and Consistency Applied to Transfer Learning in Neuro-Symbolic Autoencoders
CoNSoLe: Convex Neural Symbolic Learning
Deep Bidirectional Language-Knowledge Pretraining
Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints
Instance-based Learning for Knowledge Base Completion
LogiGAN: Learning Logical Reasoning via Adversarial Pre-training
Learning robust rule representations for abstract reasoning via internal inferences
Solving Quantitative Reasoning Problems with Language Models
Towards Better Evaluation for Dynamic Link Prediction
Predictive Querying for Autoregressive Neural Sequence Models
Semantic Probabilistic Layers for Neuro-Symbolic Learning
End-to-end Symbolic Regression with Transformers
A Unified Framework for Deep Symbolic Regression
ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time
05
Information Extraction
【信息抽取】
Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model
TweetNERD - End to End Entity Linking Benchmark for Tweets
METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets
06
Information Retrieval
【信息检索】
Transformer Memory as a Differentiable Search Index
Autoregressive Search Engines: Generating Substrings as Document Identifiers
A Neural Corpus Indexer for Document Retrieval
07
Text Classification
【文本分类】
CascadeXML: End-to-end Multi-Resolution Learning for Extreme Multi-Label Text Classification
Text Classification with Born's Rule
Public Wisdom Matters! Discourse-Aware Hyperbolic Fourier Co-Attention for Social Text Classification
08
Text Generation
【文本生成】
CoNT: Contrastive Neural Text Generation
A Character-Level Length Control Algorithm for Non-Autoregressive Sentence Summarization
Towards Improving Faithfulness in Abstractive Summarization
QUARK: Controllable Text Generation with Reinforced Unlearning
Teacher Forcing Recovers Reward Functions for Text Generation
Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions
A Contrastive Framework for Neural Text Generation
Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
Diffusion-LM Improves Controllable Text Generation
Factuality Enhanced Language Models for Open-Ended Text Generation
Controllable Text Generation with Neurally-Decomposed Oracle
InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model
Relation-Constrained Decoding for Text Generation
EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
TGEA 2.0: A Large-Scale Diagnostically Annotated Dataset with Benchmark Tasks for Text Generation of Pretrained Language Models
09
Machine Translation and Multilinguality
【机器翻译与多语言】
Exploring Non-Monotonic Latent Alignments for Non-Autoregressive Machine Translation
A new dataset for multilingual keyphrase generation
Less-forgetting Multi-lingual Fine-tuning
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Translation Model
OccGen: Selection of Real-world Multilingual Parallel Data Balanced in Gender within Occupations
Multilingual Abusive Comment Detection at Scale for Indic Languages
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Addressing Resource Scarcity across Sign Languages with Multilingual Pretraining and Unified-Vocabulary Datasets
10
Multimodality
【多模态】
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
GLIPv2: Unifying Localization and Vision-Language Understanding
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval
Egocentric Video-Language Pretraining
Flamingo: a Visual Language Model for Few-Shot Learning
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning
TVLT: Textless Vision-Language Transformer
Divert More Attention to Vision-Language Tracking
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval
BMU-MoCo: Bidirectional Momentum Update For Continual Video-Language Modeling
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs
Self-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
UniCLIP: Unified Framework for Contrastive Language-Image Pre-training
Contrastive Language-Image Pre-Training with Knowledge Graphs
PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining
Enhancing and Scaling Cross-Modality Alignment for Contrastive Multimodal Pre-Training via Gradient Harmonization
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching
HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes
CyCLIP: Cyclic Contrastive Language-Image Pretraining
S-Prompts Learning with Pre-trained Transformers: An Occam’s Razor for Domain Incremental Learning
Delving into OOD Detection with Vision-Language Representations
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation
Relational Language-Image Pre-training for Human-Object Interaction Detection
Fine-Grained Semantically Aligned Vision-Language Pre-Training
Cross-Linked Unified Embedding for cross-modality representation learning
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Kernel Multimodal Continuous Attention
Paraphrasing Is All You Need for Novel Object Captioning
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders
One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations
LGDN: Language-Guided Denoising Network for Video-Language Modeling
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
LAION-5B: An open large-scale dataset for training next generation image-text models
Towards Video Text Visual Question Answering: Benchmark and Baseline
TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment
Multi-modal Robustness Analysis Against Language and Visual Perturbations
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
11
Special Tasks
【特殊任务】
1. Code 【代码】
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Fault-Aware Neural Code Rankers
NS3: Neuro-symbolic Semantic Code Search
Pyramid Attention For Source Code Summarization
2. Mathematics 【数学】
HyperTree Proof Search for Neural Theorem Proving
NaturalProver: Grounded Mathematical Proof Generation with Language Models
Autoformalization with Large Language Models
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
3. Others 【其他】
Measuring and Reducing Model Update Regression in Structured Prediction for NLP
Learning to Follow Instructions in Text-Based Games
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
LISA: Learning Interpretable Skill Abstractions from Language
Inherently Explainable Reinforcement Learning in Natural Language
Using natural language and program abstractions to instill human inductive biases in machines
Semantic Exploration from Language Abstractions and Pretrained Representations
Pre-Trained Language Models for Interactive Decision-Making
Knowledge-Aware Bayesian Deep Topic Model
Improving Intrinsic Exploration with Language Abstractions
Improving Policy Learning via Language Dynamics Distillation
Meta-Complementing the Semantics of Short Texts in Neural Topic Models
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
BigBio: A Framework for Data-Centric Biomedical Natural Language Processing