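This article presents the latest research on image animation from Alimama's Creative & Video Platform team. The paper has been published at CVPR 2022. Animated videos generated from still images with this method can be used to produce video ad creatives; a demo has already been built and was published in last year's ACM MM Demo Track.

Paper: Structure-Aware Motion Transfer with Deformable Anchor Model
Download: https://arxiv.org/abs/2204.05018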
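Image animation fits naturally into entertainment scenarios that involve generating animated videos. One example is the once-viral "吗咿呀嘿" ("Ma Yi Ya Hei") app: upload your own portrait photo and it joins a catchy, meme-style "Ma Yi Ya Hei" chorus. Another is the demo by Dr. Wen Liu of ShanghaiTech University [1], in which "Trump" gets to play a lively game of basketball.

Video: https://www.zhihu.com/zvideo/1319066582795075584

Image animation also shows promising applications in Alimama's advertising system. As the two sets of example images illustrate, applying image animation to Taobao products turns an originally static product image into an animated one, and creatives with built-in motion attract more user attention. The technical details of using image animation to generate motion effects for Taobao products are outside the scope of this article; see our Demo Paper at last year's ACM MM [2].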
[1] Liu, Wen. "impersonator-你的舞蹈我来跳." Zhihu. https://zhuanlan.zhihu.com/p/332821774
[2] Xu, Borun, et al. "Move As You Like: Image Animation in E-Commerce Scenario." ACM Multimedia 2021.
[3] Siarohin, Aliaksandr, et al. "First order motion model for image animation." NeurIPS 2019.
[4] Felzenszwalb, Pedro F., et al. "Object detection with discriminatively trained part-based models." TPAMI 2010.
[5] Siarohin, Aliaksandr, et al. "Animating arbitrary objects via deep motion transfer." CVPR 2019.
[6] Siarohin, Aliaksandr, et al. "Motion representations for articulated animation." CVPR 2021.
[7] Zablotskaia, Polina, et al. "Dwnet: Dense warp-based network for pose-guided human video generation." BMVC 2019.
[8] Nagrani, Arsha, Joon Son Chung, and Andrew Zisserman. "Voxceleb: a large-scale speaker identification dataset." arXiv 2017.