爱可可AI前沿推介(1.1)

爱可可爱生活 2023-01-04

LG - Machine Learning  CV - Computer Vision  CL - Computation and Language

1、[LG] Structure-based drug discovery with deep learning
2、[CL] GENIE: Large Scale Pre-training for Text Generation with Diffusion Model
3、[LG] Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
4、[CL] FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
5、[CV] Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Summary: structure-based drug discovery with deep learning; large-scale pre-training for text generation with diffusion models; pessimism and a generalized empirical Bernstein inequality; Fusion-in-Decoder optimized for stronger performance and faster inference; advancing and evaluating text-guided image inpainting.

1、[LG] Structure-based drug discovery with deep learning

R Özçelik, D v Tilborg, J Jiménez-Luna, F Grisoni
[Eindhoven University of Technology & Microsoft Research Cambridge]

Key points:

  1. Deep learning can be used for protein structure and molecular bioactivity prediction, organic synthesis planning, and de novo molecule design;
  2. Advances in deep learning methods and the availability of accurate protein tertiary structure predictions open the way for a renaissance in AI-guided structure-based drug discovery;
  3. Deep learning offers new opportunities to explore chemical space more effectively, e.g., ligand binding-site detection, drug-target interaction prediction, and structure-based de novo design.

Abstract:

Artificial intelligence (AI) in the form of deep learning bears promise for drug discovery and chemical biology, e.g., to predict protein structure and molecular bioactivity, plan organic synthesis, and design molecules de novo. While most of the deep learning efforts in drug discovery have focused on ligand-based approaches, structure-based drug discovery has the potential to tackle unsolved challenges, such as affinity prediction for unexplored protein targets, binding-mechanism elucidation, and the rationalization of related chemical kinetic properties. Advances in deep learning methodologies and the availability of accurate predictions for protein tertiary structure advocate for a renaissance in structure-based approaches for drug discovery guided by AI. This review summarizes the most prominent algorithmic concepts in structure-based deep learning for drug discovery, and forecasts opportunities, applications, and challenges ahead.

https://arxiv.org/abs/2212.13295



2、[CL] GENIE: Large Scale Pre-training for Text Generation with Diffusion Model

Z Lin, Y Gong, Y Shen, T Wu, Z Fan...
[Microsoft Research Asia & Xiamen University & Tsinghua University & Fudan University]

Key points:

  1. Proposes GENIE, a large-scale pre-trained sequence-to-sequence text generation model that combines a Transformer with a diffusion model;
  2. Proposes a continuous-paragraph-denoise pre-training method that allows GENIE to be pre-trained on large corpora;
  3. Experiments show GENIE can generate high-quality, diverse text, demonstrating the effectiveness of its large-scale pre-training.

Abstract:

In this paper, we propose GENIE, a large-scale language pre-training for text generation using a diffusion model. GENIE is a pre-trained sequence-to-sequence text generation model that combines Transformer and diffusion. The diffusion model accepts the latent information from the encoder, which is used to guide the denoising of the current time step. After multiple such denoising iterations, the diffusion model can restore the Gaussian noise to diverse output text controlled by the input text. Moreover, this architecture design also allows us to adopt large-scale pre-training of GENIE. We propose a novel pre-training method named continuous paragraph denoise based on the characteristics of the diffusion model. Extensive experiments on the XSum, CNN/DailyMail, and Gigaword benchmarks show that GENIE achieves comparable performance with various strong baselines; in particular, after pre-training, the generation quality of GENIE is greatly improved. We also conduct extensive experiments on the generation diversity and parameter impact of GENIE. The code for GENIE will be made publicly available.

https://arxiv.org/abs/2212.11685
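The encoder-conditioned denoising loop described above can be sketched as follows. This is a hand-written toy, not GENIE's model: `denoise_step` stands in for the learned Transformer denoiser, and `cond` is a made-up encoder latent.

```python
import math
import random

def denoise_step(z, cond, t, T):
    """One toy denoising step: pull the latent toward the encoder
    condition, with less injected noise as t decreases (a stand-in
    for a learned denoiser)."""
    alpha = 1.0 - t / T                   # guidance strength grows as t -> 0
    noise_scale = math.sqrt(t / T) * 0.1  # injected noise shrinks over time
    return [alpha * c + (1 - alpha) * zi + random.gauss(0, noise_scale)
            for zi, c in zip(z, cond)]

def generate(cond, T=50, seed=0):
    """Start from Gaussian noise z_T and run T denoising iterations,
    each guided by the encoder latent `cond`, mirroring the
    encoder-conditioned diffusion decoder described above."""
    random.seed(seed)
    z = [random.gauss(0, 1) for _ in cond]  # z_T ~ N(0, I)
    for t in range(T, 0, -1):
        z = denoise_step(z, cond, t, T)
    return z

cond = [0.5, -1.0, 2.0]  # hypothetical encoder latent
out = generate(cond)
# after the denoising iterations, the latent sits close to the condition
print(max(abs(o - c) for o, c in zip(out, cond)))
```

In the real model the per-step update is predicted by a network and the output latent is decoded back into tokens; the sketch only shows the iterative noise-to-signal structure.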



3、[LG] Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality

Y Jin, Z Ren, Z Yang, Z Wang
[Stanford University & University of Chicago & Yale University & Northwestern University]

Key points:

  1. The algorithm optimizes a lower confidence bound (LCB) on the policy value instead of a point estimate, without assuming any uniform overlap condition;
  2. Establishes a data-dependent suboptimality upper bound that depends only on the overlap of the optimal policy and the complexity of the policy class being optimized over;
  3. Develops a new self-normalized concentration inequality for inverse-propensity-weighting estimators, generalizing the well-known empirical Bernstein inequality to unbounded and non-i.i.d. data.

Abstract:

This paper studies offline policy learning, which aims at utilizing observations collected a priori (from either fixed or adaptively evolving behavior policies) to learn an optimal individualized decision rule that achieves the best overall outcomes for a given population. Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics are lower bounded in the offline dataset; put differently, the performance of the existing methods depends on the worst-case propensity in the offline dataset. As one has no control over the data collection process, this assumption can be unrealistic in many situations, especially when the behavior policies are allowed to evolve over time with diminishing propensities for certain actions. In this paper, we propose a new algorithm that optimizes lower confidence bounds (LCBs) -- instead of point estimates -- of the policy values. The LCBs are constructed using knowledge of the behavior policies for collecting the offline data. Without assuming any uniform overlap condition, we establish a data-dependent upper bound for the suboptimality of our algorithm, which only depends on (i) the overlap for the optimal policy, and (ii) the complexity of the policy class we optimize over. As an implication, for adaptively collected data, we ensure efficient policy learning as long as the propensities for optimal actions are lower bounded over time, while those for suboptimal ones are allowed to diminish arbitrarily fast. In our theoretical analysis, we develop a new self-normalized type concentration inequality for inverse-propensity-weighting estimators, generalizing the well-known empirical Bernstein's inequality to unbounded and non-i.i.d. data.

https://arxiv.org/abs/2212.09900
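The pessimistic selection rule, optimizing an LCB rather than a point estimate, can be sketched on toy logged-bandit data. The penalty below is a plain empirical-Bernstein-style width, a simplified stand-in for the paper's self-normalized bound; the policies and data are made up.

```python
import math

def lcb(policy, data, delta=0.05):
    """Lower confidence bound on a policy's value: the
    inverse-propensity-weighted point estimate minus a
    Bernstein-style penalty built from the sample variance.
    `data` holds (context, action, reward, propensity) tuples."""
    n = len(data)
    terms = [r / p if policy(x) == a else 0.0 for (x, a, r, p) in data]
    mean = sum(terms) / n
    var = sum((t - mean) ** 2 for t in terms) / n
    width = math.sqrt(2 * var * math.log(1 / delta) / n)
    return mean - width

# two hypothetical deterministic policies over a binary context
always_1 = lambda x: 1
match_ctx = lambda x: x

# toy logged data: the reward is 1 exactly when the action matches
# the context; the behavior policy is uniform (propensity 0.5)
data = [(x, a, 1.0 if a == x else 0.0, 0.5)
        for x in (0, 1) for a in (0, 1) for _ in range(25)]

best = max([always_1, match_ctx], key=lambda pi: lcb(pi, data))
print(best is match_ctx)  # the context-matching policy wins under the LCB
```

Choosing the maximizer of the LCB, rather than of the point estimate, is what makes the rule pessimistic: policies whose value estimates rest on few or low-propensity samples get wide penalties and are avoided.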



4、[CL] FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference

M d Jong, Y Zemlyanskiy, J Ainslie, N FitzGerald, S Sanghai...
[Google Research & University of Southern California]

Key points:

  1. Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state of the art on many knowledge-intensive NLP tasks;
  2. Most of FiD's inference time stems from memory-bandwidth constraints in the decoder; two simple changes to the FiD architecture speed up inference by 7x;
  3. Proposes FiDO, an extension of FiD that removes most cross-attention layers and adopts multi-query attention to greatly reduce decoder cost.

Abstract:

Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state-of-the-art on many knowledge-intensive NLP tasks. However, FiD suffers from very expensive inference. We show that the majority of inference time results from memory bandwidth constraints in the decoder, and propose two simple changes to the FiD architecture to speed up inference by 7x. The faster decoder inference then allows for a much larger decoder. We denote FiD with the above modifications as FiDO, and show that it strongly improves performance over existing FiD models for a wide range of inference budgets. For example, FiDO-Large-XXL performs faster inference than FiD-Base and achieves better performance than FiD-Large.

https://arxiv.org/abs/2212.08153
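The decoder-side memory argument behind multi-query attention can be made concrete with a back-of-the-envelope cache-size count. The shapes below are hypothetical, not FiDO's actual configuration.

```python
def kv_cache_size(batch, seq_len, n_layers, n_heads, head_dim,
                  multi_query=False):
    """Number of elements in the decoder's key/value cache.
    Multi-query attention shares a single K/V head across all query
    heads, shrinking the cache (and thus the memory traffic per
    decoding step) by a factor of n_heads."""
    kv_heads = 1 if multi_query else n_heads
    return 2 * batch * seq_len * n_layers * kv_heads * head_dim  # 2 = K and V

# hypothetical decoder shape
mha = kv_cache_size(batch=8, seq_len=512, n_layers=24,
                    n_heads=16, head_dim=64)
mqa = kv_cache_size(batch=8, seq_len=512, n_layers=24,
                    n_heads=16, head_dim=64, multi_query=True)
print(mha // mqa)  # cache shrinks by n_heads = 16
```

Since autoregressive decoding re-reads the whole K/V cache at every step, cutting its size directly attacks the memory-bandwidth bottleneck the paper identifies, which is also why the savings can be spent on a larger decoder.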



5、[CV] Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

S Wang, C Saharia, C Montgomery, J Pont-Tuset, S Noy, S Pellegrini, Y Onoe, S Laszlo, D J. Fleet, R Soricut...
[Google Research]

Key points:

  1. Imagen Editor is a cascaded diffusion model fine-tuned for text-guided image inpainting, using an object detector to propose inpainting masks during training;
  2. EditBench is a systematic benchmark for text-guided image inpainting that enables fine-grained evaluation of inpainting edits on both natural and generated images, spanning objects, attributes, and scenes;
  3. Human evaluation on EditBench shows that object masking during training improves text-image alignment, and that current models are better at object rendering than text rendering.

Abstract:

Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a cascaded diffusion model built, by fine-tuning Imagen on text-guided image inpainting. Imagen Editor's edits are faithful to the text prompts, which is accomplished by using object detectors to propose inpainting masks during training. In addition, Imagen Editor captures fine details in the input image by conditioning the cascaded pipeline on the original high resolution image. To improve qualitative and quantitative evaluation, we introduce EditBench, a systematic benchmark for text-guided image inpainting. EditBench evaluates inpainting edits on natural and generated images exploring objects, attributes, and scenes. Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment -- such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion -- and, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.

https://arxiv.org/abs/2212.06909
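The object-masking idea, turning a detector's bounding box into a binary inpainting mask during training, can be sketched as follows. This is a pure-Python rasterization on a hypothetical 8x8 image; the real pipeline applies detector outputs to full-resolution training images.

```python
def box_to_mask(height, width, box):
    """Rasterize a detector bounding box (x0, y0, x1, y1) into a
    binary inpainting mask: 1 = region to inpaint, 0 = keep."""
    x0, y0, x1, y1 = box
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0
             for x in range(width)] for y in range(height)]

# hypothetical 8x8 image with a detected object spanning (2, 2)-(6, 5)
mask = box_to_mask(8, 8, (2, 2, 6, 5))
masked_pixels = sum(sum(row) for row in mask)
print(masked_pixels)  # 4 wide x 3 tall = 12 pixels to inpaint
```

Masks that cover whole objects (rather than random boxes) force the model to regenerate the object from the text prompt alone, which is the mechanism the paper credits for the improved text-image alignment.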



