爱可可 AI Frontier Picks (1.9)
LG - Machine Learning  CV - Computer Vision  CL - Computation and Language  IR - Information Retrieval
1. [CL] Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes
2. [LG] StitchNet: Composing Neural Networks from Pre-Trained Fragments
3. [IR] InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval
4. [CV] Rethinking Mobile Block for Efficient Neural Models
5. [CV] Teaching Computer Vision for Ecology
Summary: improving science Q&A by supervising reasoning processes; composing high-performing neural networks from pre-trained fragments; large language models as efficient dataset generators for information retrieval; rethinking the mobile block for efficient neural models; and teaching computer vision for ecology.
1. [CL] Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes
J Reppert, B Rachbach, C George, L Stebbing, J Byun, M Appleton, A Stuhlmüller
[Ought]
Key points:
Reviews the process-supervision literature, highlighting gaps in workflows and tooling for real-world use cases; iterated decomposition is a human-in-the-loop workflow for developing compositional language-model programs; ICE is an open-source tool for visualizing the execution traces of language-model programs; applying the workflow improved on baseline performance across three real-world tasks.
One-sentence summary:
Iterated decomposition is a workflow for improving the performance of compositional language-model programs through human-in-the-loop refinement, and ICE is an open-source tool for visualizing the execution traces of these programs; together they improved the accuracy of LM programs on three real-world tasks.
Abstract:
Language models (LMs) can perform complex reasoning either end-to-end, with hidden latent state, or compositionally, with transparent intermediate state. Composition offers benefits for interpretability and safety, but may need workflow support and infrastructure to remain competitive. We describe iterated decomposition, a human-in-the-loop workflow for developing and refining compositional LM programs. We improve the performance of compositions by zooming in on failing components and refining them through decomposition, additional context, chain of thought, etc. To support this workflow, we develop ICE, an open-source tool for visualizing the execution traces of LM programs. We apply iterated decomposition to three real-world tasks and improve the accuracy of LM programs over less compositional baselines: describing the placebo used in a randomized controlled trial (25% to 65%), evaluating participant adherence to a medical intervention (53% to 70%), and answering NLP questions on the Qasper dataset (38% to 69%). These applications serve as case studies for a workflow that, if automated, could keep ML systems interpretable and safe even as they scale to increasingly complex tasks.
https://arxiv.org/abs/2301.01751
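To make the compositional idea concrete, below is a minimal, hypothetical Python sketch of a decomposed question-answering program with transparent intermediate state; the `llm` callable, the prompts, and the trace structure are illustrative stand-ins, not the paper's ICE implementation.

```python
# Hypothetical sketch of a compositional LM program: decompose the question,
# answer each sub-question with its own inspectable LM call, then synthesize.
# `llm` is any text-completion callable; nothing here is the ICE API itself.
from typing import Callable, List, Tuple

def answer_question(question: str, context: str,
                    llm: Callable[[str], str]) -> Tuple[str, List[dict]]:
    trace: List[dict] = []  # transparent intermediate state, one entry per LM call

    def step(component: str, prompt: str) -> str:
        answer = llm(prompt)
        trace.append({"component": component, "prompt": prompt, "answer": answer})
        return answer

    # 1) decompose the question into sub-questions
    subs_text = step("decompose",
                     f"List the sub-questions needed to answer: {question}")
    sub_questions = [s.strip("-• ").strip() for s in subs_text.splitlines() if s.strip()]

    # 2) answer each sub-question against the provided context
    sub_answers = [step("answer_sub", f"Context: {context}\nAnswer briefly: {sq}")
                   for sq in sub_questions]

    # 3) synthesize a final answer from the sub-answers
    final = step("synthesize",
                 f"Question: {question}\nFindings:\n" + "\n".join(sub_answers)
                 + "\nFinal answer:")
    return final, trace
```

Because every prompt/answer pair is recorded in `trace`, a failing component can be zoomed in on and refined (further decomposition, added context, chain of thought) without touching the rest of the program, which is the refinement loop the workflow describes.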
2. [LG] StitchNet: Composing Neural Networks from Pre-Trained Fragments
S Teerapittayanon, M Comiter, B McDanel, H.T. Kung (2023)
Key points:
The StitchNet paradigm: a method for creating high-performing neural networks by composing fragments from multiple pre-trained networks; a new use of Centered Kernel Alignment (CKA) to assess how composable fragments are; composition techniques for stitching fragments of linear and convolutional layers.
One-sentence summary:
StitchNet creates high-performing neural networks by combining fragments from multiple pre-trained networks, using Centered Kernel Alignment (CKA) to assess compatibility and guide fragment selection; StitchNets can match the accuracy of traditionally trained networks with far less compute and data, and enable on-the-fly personalized model creation and more efficient inference.
Abstract:
We propose StitchNet, a novel neural network creation paradigm that stitches together fragments (one or more consecutive network layers) from multiple pre-trained neural networks. StitchNet allows the creation of high-performing neural networks without the large compute and data requirements needed under traditional model creation processes via backpropagation training. We leverage Centered Kernel Alignment (CKA) as a compatibility measure to efficiently guide the selection of these fragments in composing a network for a given task tailored to specific accuracy needs and computing resource constraints. We then show that these fragments can be stitched together to create neural networks with comparable accuracy to traditionally trained networks at a fraction of computing resource and data requirements. Finally, we explore a novel on-the-fly personalized model creation and inference application enabled by this new paradigm.
https://arxiv.org/abs/2301.01947
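For reference, linear Centered Kernel Alignment between two activation matrices can be computed in a few lines of NumPy; the fragment-selection helper around it is a hypothetical illustration of how such a score might guide stitching, not the authors' implementation.

```python
# Linear CKA as a compatibility score between activation matrices, plus a
# hypothetical helper that picks the most CKA-compatible next fragment.
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between activation matrices of shape (n_samples, dim_x/dim_y)."""
    X = X - X.mean(axis=0, keepdims=True)  # center each feature column
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2   # ||Y^T X||_F^2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")       # ||X^T X||_F
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")       # ||Y^T Y||_F
    return float(cross / (norm_x * norm_y))

def pick_next_fragment(current_out: np.ndarray,
                       candidate_inputs: dict) -> str:
    """Hypothetical: choose the candidate fragment whose reference input
    activations align best with what the partially stitched network produces."""
    return max(candidate_inputs,
               key=lambda name: linear_cka(current_out, candidate_inputs[name]))
```

Because CKA compares the geometry of activations rather than their raw values, it can score candidate fragments even when their layer widths differ from the current fragment's output dimension.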
3. [IR] InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval
V Jeronymo, L Bonifacio, H Abonizio, M Fadaee, R Lotufo, J Zavrel, R Nogueira
[NeuralMind & Zeta Alpha]
Key points:
InPars-v2 is an improved version of InPars for using large language models in information retrieval tasks; it uses open-source LLMs and existing strong rerankers to select synthetic query-document pairs for training; combined with a simple BM25 retrieval pipeline and a monoT5 reranker, InPars-v2 achieves new state-of-the-art results on the BEIR benchmark.
One-sentence summary:
InPars-v2 is an openly available dataset generator that pairs large language models with rerankers, reaches a new state of the art on the BEIR benchmark, and releases its synthetic data and finetuned models so researchers can improve on the approach.
Abstract:
Recently, InPars introduced a method to efficiently use large language models (LLMs) in information retrieval tasks: via few-shot examples, an LLM is induced to generate relevant queries for documents. These synthetic query-document pairs can then be used to train a retriever. However, InPars and, more recently, Promptagator, rely on proprietary LLMs such as GPT-3 and FLAN to generate such datasets. In this work we introduce InPars-v2, a dataset generator that uses open-source LLMs and existing powerful rerankers to select synthetic query-document pairs for training. A simple BM25 retrieval pipeline followed by a monoT5 reranker finetuned on InPars-v2 data achieves new state-of-the-art results on the BEIR benchmark. To allow researchers to further improve our method, we open source the code, synthetic data, and finetuned models.
https://arxiv.org/abs/2301.01820
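A rough sketch of the generate-then-filter recipe is shown below; `generate` and `score_pair` are hypothetical stand-ins for the open-source LLM and the reranker, and the prompt wording and keep-count are illustrative only.

```python
# Hypothetical InPars-v2-style data generation: prompt an LLM with few-shot
# examples to write a query for each document, then keep only the pairs that
# an existing reranker scores as most relevant.
from typing import Callable, List, Tuple

FEW_SHOT = (
    "Document: The Eiffel Tower was completed in 1889.\n"
    "Relevant query: when was the eiffel tower built\n\n"
)

def build_training_pairs(docs: List[str],
                         generate: Callable[[str], str],
                         score_pair: Callable[[str, str], float],
                         keep: int = 10_000) -> List[Tuple[str, str]]:
    pairs = []
    for doc in docs:
        prompt = FEW_SHOT + f"Document: {doc}\nRelevant query:"
        query = generate(prompt).strip()
        pairs.append((query, doc))
    # reranker-based filtering: keep the highest-scoring synthetic pairs
    pairs.sort(key=lambda qd: score_pair(qd[0], qd[1]), reverse=True)
    return pairs[:keep]
```

The filtered pairs would then be used to finetune a reranker such as monoT5, which sits behind a BM25 first-stage retriever at query time.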
4. [CV] Rethinking Mobile Block for Efficient Neural Models
J Zhang, X Li, J Li, L Liu, Z Xue, B Zhang, Z Jiang, T Huang, Y Wang, C Wang
[Tencent & Peking University & Wuhan University]
Key points:
Focuses on designing efficient models with few parameters and FLOPs for dense prediction; proposes the Meta-Mobile Block, a general concept unifying MobileNetv2's efficient Inverted Residual Block and ViT's effective Transformer; derives the Inverted Residual Mobile Block (iRMB) and the Efficient MOdel (EMO) from the Meta-Mobile Block for mobile and dense applications, achieving strong performance on multiple benchmarks.
One-sentence summary:
The paper introduces the Meta-Mobile Block concept together with the iRMB block and the EMO model for efficient dense prediction, and shows they outperform state-of-the-art methods on multiple benchmarks.
Abstract:
This paper focuses on designing efficient models with low parameters and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, trading-off model accuracy and constrained resources still need further improvements. This work rethinks the essential unity of efficient Inverted Residual Block in MobileNetv2 and effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance though sharing the same framework. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Massive experiments on ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 that surpass SoTA CNN-/Transformer-based models, while trading-off the model accuracy and efficiency well.
https://arxiv.org/abs/2301.01146
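As a rough illustration of the expand, mix, project structure described above, here is a simplified, hypothetical PyTorch block that pairs a depthwise convolution (CNN-like, short-range) with multi-head self-attention over flattened tokens (Transformer-like, long-range); normalization, window attention, and other details differ from the official iRMB, so treat this as a sketch only.

```python
# Simplified, hypothetical reading of an inverted-residual block that mixes
# depthwise convolution (short-range) with self-attention (long-range).
import torch
import torch.nn as nn

class SimplifiedIRMB(nn.Module):
    def __init__(self, dim: int, expand: int = 4, heads: int = 4):
        super().__init__()
        hidden = dim * expand
        self.expand = nn.Conv2d(dim, hidden, kernel_size=1)       # 1x1 expansion
        self.dwconv = nn.Conv2d(hidden, hidden, kernel_size=3,
                                padding=1, groups=hidden)         # depthwise 3x3
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.project = nn.Conv2d(hidden, dim, kernel_size=1)      # 1x1 projection
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        y = self.act(self.expand(x))
        local = self.dwconv(y)                          # CNN-like: short-distance
        tokens = y.flatten(2).transpose(1, 2)           # (B, H*W, hidden)
        glob, _ = self.attn(tokens, tokens, tokens)     # attention: long-distance
        glob = glob.transpose(1, 2).reshape(b, -1, h, w)
        return x + self.project(self.act(local + glob))  # inverted residual
```

A quick shape check: `SimplifiedIRMB(64)(torch.randn(2, 64, 32, 32))` returns a tensor of shape `(2, 64, 32, 32)`, so the block can be stacked into a ResNet-like 4-stage layout as the abstract describes.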
5. [CV] Teaching Computer Vision for Ecology
E Cole, S Stathatos, B Lütjens, T Sharma, J Kay, J Parham, B Kellenberger, S Beery
[Caltech & MIT & Wild Me & Yale University]
Key points:
Computer vision can accelerate ecology research by automating the analysis of raw imagery from sensors such as camera traps, drones, and satellites; computer vision is an emerging discipline that ecologists rarely encounter; an intensive hands-on summer workshop taught a diverse group of ecologists to prototype and evaluate computer vision systems; the authors identify challenges and opportunities encountered when teaching computer vision to ecologists and propose best practices.
One-sentence summary:
A diverse group of ecologists was successfully taught computer vision through an intensive hands-on summer workshop, and the process surfaced common challenges and opportunities for improvement.
Abstract:
Computer vision can accelerate ecology research by automating the analysis of raw imagery from sensors like camera traps, drones, and satellites. However, computer vision is an emerging discipline that is rarely taught to ecologists. This work discusses our experience teaching a diverse group of ecologists to prototype and evaluate computer vision systems in the context of an intensive hands-on summer workshop. We explain the workshop structure, discuss common challenges, and propose best practices. This document is intended for computer scientists who teach computer vision across disciplines, but it may also be useful to ecologists or other domain experts who are learning to use computer vision themselves.
https://arxiv.org/abs/2301.02211