
DeePMD-kit v2.0.0: A New Starting Point for AI + Molecular Simulation

By the DeePMD-kit developers, Deep Potential (深度势能), 2022-07-07


Today, the DP team officially releases v2.0.0 of DeePMD-kit, one of the most closely watched software packages in the DeepModeling community.


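This major release is a systematic upgrade of DeePMD-kit. As a leader in AI + molecular simulation, DeePMD-kit v2.0.0 remains open source and offers community developers and users more flexible hardware support, more efficient training and inference, and richer model architectures.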


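Meanwhile, within the DeepModeling community, DeePMD-kit is joining dpgen, dpdata, dpti, dpdispatcher, dpgui, dargs, and other projects to form an open-source software suite that brings molecular dynamics with first-principles accuracy to more atomic-scale problems in physics, chemistry, materials science, biology, geology, and other fields.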


Image source: https://github.com/deepmodeling/deepmd-kit/releases


Below we briefly introduce the new features and capabilities of DeePMD-kit v2.0.0. More information is available on GitHub (https://github.com/deepmodeling/deepmd-kit) and Gitee (https://gitee.com/deepmodeling/deepmd-kit).






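For further questions about using the software, or to join the discussion, follow this official account and contact the editor, who will add you to the appropriate discussion group. We hope you will give it a try!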


More flexible hardware support

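After the work optimized on the Summit supercomputer won the Gordon Bell Prize last year, the DP developers have kept pushing forward on hardware support. The architectural changes in v2.0.0 also make it straightforward to support multiple hardware platforms.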


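DeePMD-kit v1.x supports only GPUs programmed through CUDA. Several Chinese high-performance computing platforms use GPU accelerators on the ROCm platform, so many users in China could not run DeePMD-kit on those supercomputers. To address this pain point, DeePMD-kit v2.0.0 adds support and optimization for AMD's ROCm platform [1]. The feature is transparent to users: it runs on ROCm without any change to input scripts or commands.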


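Beyond ROCm, support for other hardware platforms such as ARM CPUs will be released in later versions once testing has stabilized. On that basis, the machine learning framework that sits between the underlying hardware and the upper-level software has also become a focus of exploration for the DP community. Recently the DP team has been working with Baidu's PaddlePaddle deep learning platform to explore a framework suited to AI + molecular simulation [2].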


More efficient training and inference

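Efficient and convenient training (optimizing DP model parameters by fitting electronic-structure data such as energies and forces) and inference (running molecular dynamics simulations with a trained DP model) have always been goals of the DP developers.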


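In DeePMD-kit v1.x, inference tasks (such as molecular dynamics simulations) could already run in parallel on thousands of GPUs, but training was limited to a single GPU. A typical training run takes 1 to 7 days, which leaves considerable room for improvement. DeePMD-kit v2.0.0 introduces the long-awaited parallel training support [3]: a single training run can use multiple GPUs at once, and provided the data batch is large enough, parallel training scales almost perfectly. The design is user-friendly: the input parameters stay unchanged and only the launch command needs a small modification, for example: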

CUDA_VISIBLE_DEVICES=4,5,6,7 horovodrun -np 4 \
    dp train --mpi-log=workers input.json
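
To make concrete what synchronous data-parallel training computes, here is a toy sketch in plain Python. It is not DeePMD-kit or Horovod code: the linear model, the `allreduce_mean` helper, and the shard layout are all illustrative assumptions. Each simulated worker computes a gradient on its own shard of the batch, and with equal-sized shards the averaged gradient matches the gradient over the full batch.

```python
import random

def local_gradient(w, xs, ys):
    """Gradient of 0.5 * mean((x.w - y)^2) with respect to w on one shard."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for k, xk in enumerate(x):
            grad[k] += residual * xk / len(ys)
    return grad

def allreduce_mean(grads):
    """Stand-in for an all-reduce: element-wise average over workers."""
    return [sum(column) / len(grads) for column in zip(*grads)]

random.seed(0)
n_workers, shard_size, n_features = 4, 8, 3
n_samples = n_workers * shard_size
xs = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
ys = [random.gauss(0, 1) for _ in range(n_samples)]
w = [0.1, -0.2, 0.3]

# Each simulated worker sees one equal-sized shard of the global batch.
shards = [
    (xs[i * shard_size:(i + 1) * shard_size], ys[i * shard_size:(i + 1) * shard_size])
    for i in range(n_workers)
]
worker_grads = [local_gradient(w, sx, sy) for sx, sy in shards]
parallel_grad = allreduce_mean(worker_grads)
full_batch_grad = local_gradient(w, xs, ys)
```

Because the per-step work is split while the update stays the same, scaling is limited mainly by how large the global batch can be, which is consistent with the batch-size caveat above.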

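Alongside ongoing performance optimization, we are highlighting the "one-click, ten-fold speedup" model compression feature, which greatly accelerates large-scale DeePMD inference without loss of accuracy. DeePMD-kit v2.0.0 will also support Compressed Training [4], that is, further training on top of a compressed model. Details of the implementation and usage will follow in later posts on this official account.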
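
The release notes further down mention a "tabulate" kernel in connection with model compression. As a rough illustration of why a table lookup can stand in for a network evaluation with little loss of accuracy, the sketch below tabulates a smooth scalar function on a uniform grid and evaluates it by linear interpolation. The function `tiny_net`, the grid range, and the choice of linear interpolation are assumptions for illustration only; this article does not describe DeePMD-kit's actual scheme, which may differ.

```python
import math

def tiny_net(s):
    """Hypothetical stand-in for a one-input embedding network."""
    return math.tanh(1.5 * math.tanh(0.8 * s + 0.1) - 0.3)

def build_table(func, lower, upper, n_intervals):
    """Precompute func on a uniform grid over [lower, upper]."""
    step = (upper - lower) / n_intervals
    values = [func(lower + i * step) for i in range(n_intervals + 1)]
    return lower, step, values

def lookup(table, s):
    """Evaluate the tabulated function by linear interpolation."""
    lower, step, values = table
    position = (s - lower) / step
    # Clamp the interval index so that s == upper stays inside the table.
    index = min(max(int(position), 0), len(values) - 2)
    fraction = position - index
    return values[index] * (1.0 - fraction) + values[index + 1] * fraction

table = build_table(tiny_net, 0.0, 2.0, 2000)
# Sample at interval midpoints, where linear interpolation is least accurate.
test_points = [0.001 * k + 0.0005 for k in range(2000)]
max_error = max(abs(lookup(table, s) - tiny_net(s)) for s in test_points)
```

The lookup costs a handful of arithmetic operations regardless of how deep the original network is, which is where the speedup in such a scheme would come from.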


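Building on DeePMD-kit v2.0.0, DeepModeling community members are also integrating DeePMD-kit into a growing number of molecular dynamics packages. In addition to LAMMPS and i-pi, which DP already supports stably, support for OpenMM and Gromacs will be released in due course. Users of these packages will not have to reinvent the wheel: they can use DP models directly with their usual scripts to compute the properties they care about.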


Richer model architectures

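The most extensive updates and changes in DeePMD-kit v2.0.0 concern support for model architectures. For example, training and inference for tensorial physical quantities such as dipoles and polarizabilities have been systematically refactored, so that users can apply them directly without deep knowledge of the algorithm. This makes a number of spectroscopy calculations possible, such as IR spectra [5] and Raman spectra [6].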


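This support for model architectures also lets us try new model constructions and parameter settings effectively. For example, DeePMD-kit v2.0.0 introduces a new type of descriptor, three-body embedding [7]. The descriptors in DeePMD-kit v1.x support only two-body embedding; although the environment matrix already encodes many-body correlations of the configuration, that description is still incomplete. In the vast majority of applications this incompleteness does not limit model accuracy, but we found that in a few extreme cases Deep Potential models with two-body embedding could not meet users' needs. DeePMD-kit v2.0.0 therefore adds three-body embedding, which describes each atom's local environment more completely and substantially improves model accuracy in those extreme cases, resolving this pain point for users.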
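
As a sketch of how the descriptor choice might appear in an input file, the snippet below builds the descriptor section as a Python dictionary and prints it as JSON. The type names `se_e2_a`, `se_e3`, and `hybrid` appear in the release notes; the remaining key names (`sel`, `rcut`, `rcut_smth`, `neuron`, `list`) and every numerical value are assumptions for illustration and should be checked against the documentation [7].

```python
import json

# Two-body embedding descriptor, the kind available in v1.x.
two_body = {
    "type": "se_e2_a",
    "sel": [46, 92],          # assumed: max neighbors per atom type
    "rcut": 6.0,              # assumed: cutoff radius
    "rcut_smth": 0.5,         # assumed: start of the smooth switching region
    "neuron": [25, 50, 100],  # assumed: embedding network layer sizes
}

# Three-body embedding descriptor introduced in v2.0.0.
three_body = {
    "type": "se_e3",
    "sel": [40, 80],
    "rcut": 4.0,
    "rcut_smth": 0.5,
    "neuron": [2, 4, 8],
}

# Hybrid descriptor that combines the two.
hybrid = {"type": "hybrid", "list": [two_body, three_body]}

def descriptor_section(descriptor):
    """Wrap a descriptor dictionary in the enclosing model section as JSON text."""
    return json.dumps({"model": {"descriptor": descriptor}}, indent=2)

print(descriptor_section(hybrid))
```

Three-body terms are more expensive per neighbor, so a shorter cutoff for the three-body part, as in the values assumed here, is one plausible way to keep the cost down.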


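In addition, we have added atom type embedding [8]. In a Deep Potential model, the number of deep neural networks matches the number of atom types in the system: more precisely, at least one embedding network and one fitting network must be trained for each atom type. For systems with many element types, the number of networks to train therefore becomes large, which reduces training and inference efficiency and lowers data utilization. To address this, DeePMD-kit v2.0.0 adds atom type embedding: by embedding the atom type into a feature space, it removes the dependence of the embedding and fitting networks on atom type, so that a single embedding network and a single fitting network suffice to model a multicomponent system.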
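
The idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not DeePMD-kit's implementation: the one-layer network, the table `type_table`, and all sizes are hypothetical. The network count uses the lower bound stated above (one embedding and one fitting network per type), and the shared network receives the atom type only through a feature vector looked up from a table.

```python
import math
import random

def count_networks(n_types, use_type_embedding):
    """Networks to train, using the 'at least one per type' lower bound."""
    if use_type_embedding:
        return {"embedding": 1, "fitting": 1}
    return {"embedding": n_types, "fitting": n_types}

def make_type_embedding(n_types, dim, seed=0):
    """Hypothetical trainable table: one feature vector per atom type."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_types)]

def shared_embedding_net(s, type_vector, weights):
    """One toy layer shared by all types; the type enters only via its vector."""
    inputs = [s] + type_vector
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

n_types, type_dim, width = 5, 4, 3
type_table = make_type_embedding(n_types, type_dim)
rng = random.Random(1)
weights = [[rng.uniform(-1, 1) for _ in range(1 + type_dim)] for _ in range(width)]

# Same radial input, different neighbor types, same network parameters.
out_type0 = shared_embedding_net(0.7, type_table[0], weights)
out_type3 = shared_embedding_net(0.7, type_table[3], weights)
```

Since all types share the same weights, every training frame updates the same parameters, which is one way to read the data-utilization argument above.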


Closing remarks

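We sincerely thank every developer, user, and supporter of DeePMD-kit over the past four years; the future of molecular simulation depends on everyone's contributions. Below are our release notes and the contributors we would like to thank.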


Release note DeePMD-kit v2.0.0

Breaking changes to v1.3

  • Training parameters: Several training parameters have been updated. The original training data is now split into training data and validation data. Please read the documentation to apply the changes. Old-style inputs still work but are not recommended.
  • Model inference: Old models trained by v1 will not work in v2. Run dp convert-from to convert old models to v2.
  • Python interface: deepmd.DeepPot has been moved to deepmd.infer.DeepPot.
  • C++ interface: NNPInter has been renamed to deepmd::DeepPot and NNPInter.h has been renamed to DeepPot.h. Use -ldeepmd_cc to link instead.
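
For the Python interface change listed above, a small compatibility shim can help scripts run against either layout. The sketch below relies only on the two import paths named in the release notes (`deepmd.DeepPot` and `deepmd.infer.DeepPot`); the helper name `load_deep_pot_class` is hypothetical.

```python
import importlib

def load_deep_pot_class():
    """Return the DeepPot class from the v2 location, falling back to the v1 one."""
    for module_name in ("deepmd.infer", "deepmd"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # This layout is not available; try the next candidate.
            continue
        if hasattr(module, "DeepPot"):
            return getattr(module, "DeepPot")
    raise ImportError("DeePMD-kit is not installed or exposes no DeepPot class")
```

Note that the shim covers only the import path. Models trained with v1 still need to be converted with dp convert-from, as stated above.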
New features
  • Model compression (#350 #586 #610 #921 #948 #956 #1000 #1008 #1020 #1043)
  • Parallel training (#892 #905 #913 #1030 #1032) (Bytedance)
  • ROCm device support (#656)
  • New descriptor: three body embedding (se_e3)
  • Hybridization of descriptors (hybrid)
  • Type embedding
  • Training and inference of the dipole (vector) and polarizability (matrix) (#495 #538 #927)
  • Support derivatives of the tensor properties. (#805)
  • Split of training and validation dataset.
  • Model deviation for virial
  • Add subcommand and python interface to calculate model-deviation (#715)
  • Automatically determine the sel from the training data. (#831)
  • Building with lammps with plugin mode (#930 #945)
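
Several items in the list above concern model deviation. As a minimal sketch, assuming the definition commonly used in active-learning workflows (the spread of force predictions across an ensemble of models), the per-atom force deviation can be computed as below. The function name and the sample data are hypothetical, and DeePMD-kit's exact conventions are not stated in this article.

```python
import math

def force_model_deviation(forces):
    """forces[m][i] is the 3-vector force on atom i predicted by model m.

    Returns, for each atom, the root-mean-square distance of the models'
    force predictions from the ensemble-mean force.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    deviations = []
    for i in range(n_atoms):
        mean = [sum(forces[m][i][c] for m in range(n_models)) / n_models for c in range(3)]
        mean_sq = sum(
            sum((forces[m][i][c] - mean[c]) ** 2 for c in range(3))
            for m in range(n_models)
        ) / n_models
        deviations.append(math.sqrt(mean_sq))
    return deviations

# Two hypothetical models, two atoms: they agree on atom 0 and disagree on atom 1.
forces = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 3.0, 0.0]],
]
per_atom = force_model_deviation(forces)
max_devi = max(per_atom)
```

A large maximum deviation flags configurations on which the models disagree, which makes them candidates for new training data.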
Performance improvements
  • More efficient training: all customized OPs are implemented with GPU.
  • MPI support for atomic model deviation #628
  • speedup ROCm kernels which use atomicAdd (#809 #815) (Bytedance)
  • speedup CUDA kernels (use atomicAdd inside) by reducing the global memory write (#811)
  • speedup tabulate cuda kernel by reducing shared-memory usage (#830) (Bytedance)
  • speedup format_nlist_b (#832 #845)
  • speedup scan_nlist kernel (#1028)
Enhancements
  • Strict argument check in the input script.
  • Auto conversion of input file to v2.0 compatibility
  • Append out_file when lammps restarts #640
  • Document and examples for the C++ interface #652 #663
  • Instructions for the i-pi #660
  • Document for the network size and sel #657
  • Use fmod to wrap the coord of atoms (solve slow PBC) (#741)
  • bit operations to encode neighbor information
  • add CUDA/ROCm building documents (#739)
  • add type-embedding developer doc (#762 #967)
  • add model compression support for models with exclude_types feature (#754)
  • improve the doc and user interface of model compression (#772)
  • support converting models generated in v1.3 to 2.0 compatibility (#725)
  • give a default value to T and convert models from v1.2 to 2.0 compatibility (#789)
  • improved documents for conda (#798 #925)
  • throw a message if tf runtime is incompatible (#797)
  • capture OOM and print debug message (#801)
  • add message for DecodeError raised when using model compression (#839)
  • Passing error to TF instead of exit (#918)
  • refactor docs (#952)
  • add an example of nopbc and related docs (#994)
  • add dp --version (#995)
  • add the argument tensorboard_freq to control sampling ratio during training. (#996)
  • add sphinx plugins viewcode and intersphinx (#997)
  • generate Python API document automatically (#998)
  • give a clear message if model.get_ntypes()<data.get_ntypes() (#1016)
  • add docstring for descrpt/se_e2_a (#1017)
  • add docstring for fit/ener (#1024)
  • add InputNlist into API doc (#1009)
  • save checkpoint files with step and keep recent files (#1031)
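
The fmod item in the list above replaces repeated shifting with a single remainder operation. Here is a minimal sketch of the idea for one fractional coordinate, assuming coordinates are already expressed in units of the cell length. The function name is hypothetical and this is not DeePMD-kit's C++ code.

```python
import math

def wrap_fractional(x):
    """Map a fractional coordinate into [0, 1) in constant time."""
    wrapped = math.fmod(x, 1.0)  # result keeps the sign of x
    if wrapped < 0.0:
        wrapped += 1.0
    # Guard against a tiny negative value rounding up to exactly 1.0.
    return 0.0 if wrapped >= 1.0 else wrapped
```

A loop that adds or subtracts one cell length per iteration takes time proportional to how far the atom has drifted. The remainder-based version takes the same time for any input, which is presumably the "slow PBC" issue the item refers to.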
Improvement of the code for developers
  • Support version of the model. Easily check model compatibility
  • Clear and pythonic python interface
  • C++ lib that can be tested independently
  • C++ API that can be tested independently
  • OP supports multi-device.
  • Added deepmd namespace for the C++ API
  • UT for Cuda/ROCm code (#569)
  • UT for model compression (#586)
  • UT for prod_force/virial ops (#703 #741)
  • CI test Lammps build (#600)
  • allow c++ tests to run without internet (#785)
  • build low and high precision at the same time (#879)
  • support to specify CUDA/ROCm root in python pkg building (#834) (Bytedance)
  • use cached Session to speed up py tests (#833)
  • remove cub include for CUDA>=11 (#866)
  • add Errcheck after every kernel function runs and merge redundant code (#855)
  • adapt changes to auditwheel directory in manylinux (#889)
  • enhance the cli to generate doc json file (#891)
  • raise warning before training if sel is not enough (#914)
  • make native MD compatible with v2.0 (#950)
  • fix type hints and add doc for exclude_types (#1005)
  • use TF's built-in method to get numpy dtype (#1035)
Bug fixes
  • Remove using namespace std. Solve compiling compatibility problem.
  • cuda memory access error #566
  • Relative force model deviation is not copied back at single precision #599
  • Correct way of allocating memory in float precision #612
  • Fix TB logdir remove bug #617
  • Illegal nlist #680
  • Bug in prod_virial_grad that causes wrong results when training with virials #685
  • Uniform random seed #691
  • fix bug of adding int to a None random seed (#705)
  • reuse the zero layer rather than building a new one (#714)
  • fix bug in CI (#739)
  • fix bug 824 and synchronize updates to CUDA code (#828)
  • Fix the empty neighbor distance array in neighbor_stat.py (#882)
  • fix InvalidArgumentError caused by zero sel and optimize zero matrix (#900)
  • fix 'NoneType' has no len() in auto_sel (#911)
  • set input DeepmdData.type_map to input type_map (#924)
  • Fix member declaration of deepmd and deepmd.entrypoints. (#922)
  • add aliases to Arguments (#933)
  • fix bug of gelu activation function (#939)
  • convert decay_rate to stop_lr from old inputs (#949)
  • only enable link what you use on GNU compilers (#962)
  • Do not find protobuf for python (#963)
  • fix an error in stress by ase interface (#964)
  • remove bare except and limit the try clause (#977)
  • fix python cmake error (#976)
  • Instantiate RunOptions first when training. (#1019)
  • Fix compiler type in cmake: CMAKE_COMPILER_IS_GNUCXX (#1038)
  • other cleanups of the code (#968 #970 #975 #999 #1004 #1002 #1001 #1010 #1014 #1012 #1011 #1021 #1036 #1037)

Contributors

  • Han Bao

  • Roberto Car

  • Junhan Chang

  • Yixiao Chen

  • Ye Ding

  • Weinan E

  • Jiequn Han

  • Li’ang Huang

  • Weile Jia

  • Zeyu Li

  • Ziyao Li

  • Yinnian Lin

  • Yihao Liu

  • Xinzijian Liu

  • Denghui Lu

  • Marián Rynik

  • Shaochen Shi

  • Ping Tuo

  • Bo Wang

  • Haidi Wang

  • Han Wang

  • Yingze Wang

  • Yu Xia

  • Fengbo Yuan

  • Jiabin Yang

  • Haotian Ye

  • Jinzhe Zeng

  • Duo Zhang

  • Linfeng Zhang

  • Yuzhi Zhang





References

[1] https://github.com/deepmodeling/deepmd-kit/pull/656

[2] https://github.com/deepmodeling/deepmd-kit/tree/paddle

[3] https://github.com/deepmodeling/deepmd-kit/pull/892

[4] https://github.com/deepmodeling/deepmd-kit/pull/1000

[5] Zhang, L., Chen, M., Wu, X., Wang, H., E, W., & Car, R. (2020). Deep neural network for the dielectric response of insulators. Physical Review B, 102(4), 041121.

[6] Sommers, G. M., Andrade, M. F. C., Zhang, L., Wang, H., & Car, R. (2020). Raman spectrum and polarizability of liquid water from deep neural networks. Physical Chemistry Chemical Physics, 22(19), 10592-10602.

[7] https://docs.deepmodeling.org/projects/deepmd/en/master/model/train-se-e3.html?highlight=se_e3#descriptor-se-e3

[8] https://docs.deepmodeling.org/projects/deepmd/en/master/development/type-embedding.html?highlight=Embedding

