爱可可 AI Frontier Picks (1.15)

爱可可爱生活 · 2023-01-19

LG - Machine Learning · CV - Computer Vision · CL - Computation and Language

1、[LG] Unlocking de novo antibody design with generative artificial intelligence
2、[LG] Improved visualization of high-dimensional data using the distance-of-distance transformation
3、[CV] DensePose From WiFi
4、[CV] Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
5、[CV] Geometry-biased Transformers for Novel View Synthesis

In brief: unlocking de novo antibody design with generative AI; improved high-dimensional data visualization via the distance-of-distance transform; DensePose dense pose estimation from WiFi; BERT-style pre-training design for convolutional networks; and geometry-biased Transformers for novel view synthesis.

1、[LG] Unlocking de novo antibody design with generative artificial intelligence

A Shanehsazzadeh, S Bachas, G Kasun, J M Sutton
[Absci Corporation]

Key points:

  1. Generative deep learning models design antibodies against three distinct targets in a zero-shot fashion;
  2. Over 400,000 antibody variants were screened, with three found to bind more tightly than the therapeutic antibody trastuzumab;
  3. The generated sequences are highly diverse, show low sequence identity to known antibodies, and adopt variable structural conformations;
  4. The design process is highly controllable, allowing proteins to be created with optimized developability and immunogenicity profiles, reducing downstream developability risk.

One-line summary:
The work demonstrates that generative AI can produce novel antibody variants in a zero-shot setting, yielding binders with natural sequence features and high diversity; this could disrupt traditional antibody drug discovery, saving substantial time and cost while offering more controllable design options.

Abstract:

Generative artificial intelligence (AI) has the potential to greatly increase the speed, quality and controllability of antibody design. Traditional de novo antibody discovery requires time and resource intensive screening of large immune or synthetic libraries. These methods also offer little control over the output sequences, which can result in lead candidates with sub-optimal binding and poor developability attributes. Several groups have introduced models for generative antibody design with promising in silico evidence [1–10], however, no such method has demonstrated de novo antibody design with experimental validation. Here we use generative deep learning models to de novo design antibodies against three distinct targets, in a zero-shot fashion, where all designs are the result of a single round of model generations with no follow-up optimization. In particular, we screen over 400,000 antibody variants designed for binding to human epidermal growth factor receptor 2 (HER2) using our high-throughput wet lab capabilities. From these screens, we further characterize 421 binders using surface plasmon resonance (SPR), finding three that bind tighter than the therapeutic antibody trastuzumab. The binders are highly diverse, have low sequence identity to known antibodies, and adopt variable structural conformations. Additionally, these binders score highly on our previously introduced Naturalness metric [11], indicating they are likely to possess desirable developability profiles and low immunogenicity. We open-source the HER2 binders and report the measured binding affinities. These results unlock a path to accelerated drug creation for novel therapeutic targets using generative AI combined with high-throughput experimentation.

https://www.biorxiv.org/content/10.1101/2023.01.08.523187v1
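The diversity claim in the key points is stated in terms of sequence identity to known antibodies. As a minimal illustration of that metric (all sequences below are made-up placeholders, not data from the paper), identity between two aligned sequences is simply the fraction of matching positions:

```python
def sequence_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two aligned, equal-length
    sequences; 1.0 means identical, lower means more diverse."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Made-up CDR-like sequences, for illustration only.
reference = "WGGDGFYAMDY"
variants = ["WGGDGFYAMDV", "WARDGFYPMDY", "FQGDHFYALDY"]
for v in variants:
    print(v, round(sequence_identity(reference, v), 2))
```

In practice identity would be computed over full aligned variable regions; this sketch only shows the metric itself.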



2、[LG] Improved visualization of high-dimensional data using the distance-of-distance transformation

J Liu, M Vinck

Key points:

  1. When data contain noise points randomly scattered in high-dimensional space, low-dimensional embeddings suffer from a "scattering noise problem";
  2. A distance-of-distance (DoD) transformation of the dissimilarity matrix between data points effectively removes the influence of scattered noise;
  3. The transform improves low-dimensional embeddings of several high-dimensional datasets, such as convolutional neural network representations of natural images and neural population representations of visual stimuli.

One-line summary:
Proposes an improved visualization technique for noisy high-dimensional data: a distance-of-distance transformation suppresses noise and improves low-dimensional embeddings, effectively removing the influence of scattered noise points.

Abstract:

Dimensionality reduction tools like t-SNE and UMAP are widely used for high-dimensional data analysis. For instance, these tools are applied in biology to describe spiking patterns of neuronal populations or the genetic profiles of different cell types. Here, we show that when data include noise points that are randomly scattered within a high-dimensional space, a “scattering noise problem” occurs in the low-dimensional embedding where noise points overlap with the cluster points. We show that a simple transformation of the original distance matrix by computing a distance between neighbor distances alleviates this problem and identifies the noise points as a separate cluster. We apply this technique to high-dimensional neuronal spike sequences, as well as the representations of natural images by convolutional neural network units, and find an improvement in the constructed low-dimensional embedding. Thus, we present an improved dimensionality reduction technique for high-dimensional data containing noise points.

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010764
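A minimal sketch of one plausible reading of the distance-of-distance idea (an assumption about the exact definition; the paper's transform may differ in detail): compare two points by how similarly they sit relative to all other points, i.e. by the distance between their rows of the original distance matrix.

```python
import math

def distance_of_distance(D):
    """DoD transform (sketch, assumed definition): out[i][j] is the
    Euclidean distance between row i and row j of the input distance
    matrix D, i.e. between the two points' distance profiles."""
    n = len(D)
    return [[math.sqrt(sum((D[i][k] - D[j][k]) ** 2 for k in range(n)))
             for j in range(n)] for i in range(n)]

# Tiny example: 1D points 0, 1, 10 (the third acts like an outlier).
pts = [0.0, 1.0, 10.0]
D = [[abs(a - b) for b in pts] for a in pts]
DoD = distance_of_distance(D)
for row in DoD:
    print([round(x, 2) for x in row])
```

Under this reading, scattered noise points, which are far from everything in a similar way, acquire similar rows and hence small DoD distances to each other, which would pull them into a separate cluster in the embedding.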



3、[CV] DensePose From WiFi

J Geng, D Huang, F De la Torre
[CMU]

Key points:

  1. A deep neural network maps the phase and amplitude of WiFi signals to UV coordinates within 24 human body regions, yielding dense human pose correspondences;
  2. Results show the model can estimate the dense pose of multiple subjects using only WiFi signals as input, with performance comparable to image-based approaches;
  3. The approach is low-cost, broadly available, and privacy-preserving, potentially turning WiFi devices into human sensors that are cheaper, less affected by lighting, and more privacy-friendly than RGB cameras and LiDAR.

One-line summary:
Proposes a deep neural network that maps WiFi signals to UV coordinates for dense human pose estimation, performing comparably to image-based methods and offering a potential low-cost, more privacy-preserving alternative to RGB cameras and LiDAR.

Abstract:

Advances in computer vision and machine learning techniques have led to significant development in 2D and 3D human pose estimation from RGB cameras, LiDAR, and radars. However, human pose estimation from images is adversely affected by occlusion and lighting, which are common in many scenarios of interest. Radar and LiDAR technologies, on the other hand, need specialized hardware that is expensive and power-intensive. Furthermore, placing these sensors in non-public areas raises significant privacy concerns. To address these limitations, recent research has explored the use of WiFi antennas (1D sensors) for body segmentation and key-point body detection. This paper further expands on the use of the WiFi signal in combination with deep learning architectures, commonly used in computer vision, to estimate dense human pose correspondence. We developed a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions. The results of the study reveal that our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches, by utilizing WiFi signals as the only input. This paves the way for low-cost, broadly accessible, and privacy-preserving algorithms for human sensing.

https://arxiv.org/abs/2301.00250
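The network consumes the phase and amplitude of WiFi channel state information. Raw CSI phase is wrapped into (−π, π], so some form of unwrapping or sanitization is typically applied before feeding it to a network; whether the paper uses exactly this step is an assumption, but standard 1D phase unwrapping looks like:

```python
import math

def unwrap(phases):
    """Standard 1D phase unwrapping: remove 2*pi jumps between
    consecutive samples so the sequence varies smoothly."""
    out = [phases[0]]
    for p in phases[1:]:
        prev = out[-1]
        d = p - prev
        # Shift by a multiple of 2*pi to bring the step into (-pi, pi].
        d -= 2 * math.pi * round(d / (2 * math.pi))
        out.append(prev + d)
    return out

wrapped = [0.0, 2.5, -2.9, 3.0, -3.1]   # made-up wrapped phase samples
print([round(x, 2) for x in unwrap(wrapped)])  # [0.0, 2.5, 3.38, 3.0, 3.18]
```

After unwrapping, no consecutive pair of samples differs by more than π, so the jump from 2.5 to -2.9 is recognized as a wrap-around rather than a real discontinuity.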



4、[CV] Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling

K Tian, Y Jiang, Q Diao, C Lin, L Wang, Z Yuan
[Peking University & ByteDance & University of Oxford]

Key points:

  1. A BERT-style pre-training method that can be used directly on any convolutional network without backbone modification, overcoming convnets' inability to handle irregularly masked input;
  2. Two design insights for generative pre-training of convnets: using sparse convolution in masked image modeling, and a hierarchical design for BERT-style pre-training;
  3. Substantial gains in convnet performance on downstream tasks (up to +3.5 points), demonstrating the promise of extending the Transformer pre-train/fine-tune paradigm to convolutional networks.

One-line summary:
SparK is a BERT-style pre-training method applicable directly to any convolutional network: it uses sparse convolution to handle irregularly masked input images and a hierarchical decoder to exploit the convnet's hierarchical structure, significantly improving downstream performance.

Abstract:

We identify and overcome two key obstacles in extending the success of BERT-style pre-training, or the masked image modeling, to convolutional networks (convnets): (i) convolution operation cannot handle irregular, random-masked input images; (ii) the single-scale nature of BERT pre-training is inconsistent with convnet's hierarchical structure. For (i), we treat unmasked pixels as sparse voxels of 3D point clouds and use sparse convolution to encode. This is the first use of sparse convolution for 2D masked modeling. For (ii), we develop a hierarchical decoder to reconstruct images from multi-scale encoded features. Our method called Sparse masKed modeling (SparK) is general: it can be used directly on any convolutional model without backbone modifications. We validate it on both classical (ResNet) and modern (ConvNeXt) models: on three downstream tasks, it surpasses both state-of-the-art contrastive learning and transformer-based masked modeling by similarly large margins (around +1.0%). Improvements on object detection and instance segmentation are more substantial (up to +3.5%), verifying the strong transferability of features learned. We also find its favorable scaling behavior by observing more gains on larger models. All this evidence reveals a promising future of generative pre-training on convnets.

https://arxiv.org/abs/2301.03580
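Obstacle (i) above hinges on which patches survive masking. A toy sketch of that step (my reading of the setup, not the official SparK code): draw a random patch mask and keep only the visible patches as a sparse set of grid positions, which is what lets a sparse-convolution encoder skip the masked regions entirely.

```python
import random

def sparse_mask(grid_h, grid_w, mask_ratio=0.6, seed=0):
    """Randomly mask a fraction of patch-grid positions; return the
    visible (unmasked) positions and the masked set."""
    rng = random.Random(seed)
    positions = [(i, j) for i in range(grid_h) for j in range(grid_w)]
    n_masked = int(round(mask_ratio * len(positions)))
    masked = set(rng.sample(positions, n_masked))
    visible = [p for p in positions if p not in masked]
    return visible, masked

visible, masked = sparse_mask(4, 4, mask_ratio=0.75)
print(len(visible), len(masked))  # 4 12
```

The visible positions play the role of "sparse voxels": the encoder computes features only at these sites, and the hierarchical decoder then reconstructs the full image from the multi-scale encoded features.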



5、[CV] Geometry-biased Transformers for Novel View Synthesis

N Venkat, M Agarwal, M Singh, S Tulsiani
[CMU]

Key points:

  1. Proposes "Geometry-biased Transformers" (GBT) for synthesizing novel views of an object from a few input images and their associated camera viewpoints;
  2. Incorporates geometric inductive biases into set-latent-representation-based inference to encourage multi-view geometric consistency;
  3. Adds a ray-distance-based bias to the attention mechanism of the Transformer layers, guiding both the scene-encoding and ray-decoding stages toward relevant context for more accurate view synthesis.

One-line summary:
Proposes Geometry-biased Transformers (GBT), which synthesize novel views of an object by incorporating geometric inductive biases into set-latent-representation-based inference to encourage multi-view geometric consistency, outperforming prior methods at generalizing to unseen object categories on the CO3D dataset.

Abstract:

We tackle the task of synthesizing novel views of an object given a few input images and associated camera viewpoints. Our work is inspired by recent 'geometry-free' approaches where multi-view images are encoded as a (global) set-latent representation, which is then used to predict the color for arbitrary query rays. While this representation yields (coarsely) accurate images corresponding to novel viewpoints, the lack of geometric reasoning limits the quality of these outputs. To overcome this limitation, we propose 'Geometry-biased Transformers' (GBTs) that incorporate geometric inductive biases in the set-latent representation-based inference to encourage multi-view geometric consistency. We induce the geometric bias by augmenting the dot-product attention mechanism to also incorporate 3D distances between rays associated with tokens as a learnable bias. We find that this, along with camera-aware embeddings as input, allows our models to generate significantly more accurate outputs. We validate our approach on the real-world CO3D dataset, where we train our system over 10 categories and evaluate its view-synthesis ability for novel objects as well as unseen categories. We empirically validate the benefits of the proposed geometric biases and show that our approach significantly improves over prior works.

https://arxiv.org/abs/2301.04650
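The core modification can be sketched as a scalar term added to the attention logits (a sketch of the idea only; the exact bias form, sign convention, and parameterization here are assumptions): token pairs whose associated rays are far apart in 3D get down-weighted.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def geometry_biased_attention(Q, K, V, ray_dist, w):
    """Dot-product attention with an additive geometric bias: the logit
    for query i attending to key j is q_i.k_j / sqrt(d) - w * ray_dist[i][j],
    where w is a learnable scalar in the actual model."""
    d = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        logits = [
            sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d) - w * ray_dist[i][j]
            for j, k in enumerate(K)
        ]
        attn = softmax(logits)
        out.append([sum(a * v[c] for a, v in zip(attn, V)) for c in range(len(V[0]))])
    return out

# Toy usage: with a large bias weight, each token attends only to the
# token whose ray is nearby (here, itself).
Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
ray_dist = [[0.0, 10.0], [10.0, 0.0]]   # made-up pairwise ray distances
out = geometry_biased_attention(Q, K, V, ray_dist, w=100.0)
print([round(x, 3) for x in out[0]])  # ≈ [1.0, 0.0]
```

With w = 0 this reduces to standard scaled dot-product attention; the geometric term is what nudges the model toward multi-view-consistent context.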



