Frontier Review: Collective Intelligence for Deep Learning
Overview
Over the past decade, we have witnessed the rise of deep learning and its growing dominance in artificial intelligence. Advances in artificial neural networks and in hardware accelerators with large memory, together with the availability of large datasets, have enabled practitioners to train and deploy complex neural network models that achieve state-of-the-art performance on tasks spanning computer vision, natural language processing, and reinforcement learning. Yet as these networks become larger, more complex, and more widely deployed, their fundamental problems have become more apparent. State-of-the-art deep learning models are known to suffer from a range of issues, including poor robustness, an inability to adapt to new task settings, and a reliance on rigid, inflexible configuration assumptions. By contrast, the collective behavior commonly observed in nature tends to produce systems that are robust, adaptable, and far less dependent on assumptions about how the environment is configured. Collective intelligence, as a field, studies the group intelligence that emerges from the interactions of many individuals. Ideas developed in this field, such as self-organization, emergent behavior, swarm optimization, and cellular automata, have been used to model and explain complex systems. It is therefore natural for researchers to incorporate these ideas into newer deep learning methods. In this survey, the authors provide the historical background of neural network research involving complex systems, and highlight several active areas of modern deep learning research that incorporate principles of collective intelligence to improve their capabilities. We hope this review can serve as a bridge between the complex systems and deep learning communities.
Keywords: collective intelligence, deep learning, multi-agent models, reinforcement learning
David Ha, Yujin Tang | Authors
刘志航 | Translator
刘培源 | Proofreader
邓一雪 | Editor
Paper title: Collective intelligence for deep learning: A survey of recent developments
Paper link: https://dl.acm.org/doi/10.1177/26339137221114874
1. Introduction
2. Background: Collective Intelligence
3. Historical Background: Cellular Neural Networks
4. Collective Intelligence for Deep Learning
5. Discussion
1. Introduction
Figure 1. The neural network architecture of AlexNet (Krizhevsky et al. 2012), winner of the 2012 ImageNet competition.
Figure 2. Left: Trajan's Bridge at Alcántara, built by the Romans in 106 AD (Wikipedia, 2022). Right: a bridge formed by army ants (Jenal, 2011).
Figure 3. Recent advances in GPU hardware enable realistic 3D simulation of thousands of robot models (Heiden et al. 2021), as shown in this figure from (Rudin et al. 2021). Such progress opens the door to large-scale populations of simulated 3D agents that can communicate with one another and collectively develop intelligent behavior.
2. Background: Collective Intelligence
3. Historical Background: Cellular Neural Networks
Figure 4. Left: a typical configuration of a two-dimensional cellular neural network (Liu et al. 2020). Right: Google Trends over time for the terms deep learning and cellular neural networks.
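The cellular neural networks of Chua and Yang (1988) mentioned here are continuous-time dynamical systems: each cell integrates dx/dt = -x + A∗y + B∗u + z, where y = f(x) is a piecewise-linear saturation, A and B are small local "templates" shared by every cell, and ∗ denotes local (3×3) correlation. The sketch below simulates this with Euler steps; the edge-detection template values are standard illustrative choices from the CNN literature, not taken from this survey.

```python
import numpy as np

def f(x):
    # Standard CNN output nonlinearity: piecewise-linear saturation to [-1, 1].
    return 0.5 * (np.abs(x + 1) - np.abs(x - 1))

def conv3x3(img, k):
    # Local 3x3 correlation with zero padding at the boundary.
    p = np.pad(img, 1)
    out = np.zeros_like(img)
    h, w = img.shape
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + h, j:j + w]
    return out

def cnn_step(x, u, A, B, z, dt=0.1):
    """One Euler step of dx/dt = -x + A*y + B*u + z."""
    y = f(x)
    return x + dt * (-x + conv3x3(y, A) + conv3x3(u, B) + z)

# Illustrative edge-detection templates (placeholder values).
A = np.array([[0, 0, 0], [0, 2.0, 0], [0, 0, 0]])
B = np.array([[-1, -1, -1], [-1, 8.0, -1], [-1, -1, -1]])
u = np.zeros((8, 8))
u[2:6, 2:6] = 1.0            # input image: a white square on black
x = np.zeros_like(u)
for _ in range(200):
    x = cnn_step(x, u, A, B, z=-1.0)
print(f(x)[2, 2], f(x)[3, 3])  # 1.0 -1.0: edge cell saturates high, interior low
```

Every cell runs the same local rule, yet the grid collectively computes a global image operation, which is the link to the self-organizing systems discussed in this review.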
4. Collective Intelligence for Deep Learning
Figure 5. A trained neural cellular automaton created by (Randazzo et al. 2020) that recognizes MNIST digits, also available as an interactive web demo. Each cell is allowed to see only a single pixel and to communicate with its neighbors. Over time, a consensus forms about which digit is most likely, though interestingly, disagreements can arise depending on where the predicting pixels are located.
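The communication mechanism behind such self-classifying automata can be illustrated with a toy consensus rule: each cell sees only its own private value, yet repeated local averaging with its neighbors drives every cell toward a single shared estimate. This is a hand-written averaging dynamic, not the trained update network of Randazzo et al. (2020).

```python
import numpy as np

def consensus_step(state):
    """Each cell replaces its state with the mean of itself and its
    4 neighbors (boundary handled by edge replication)."""
    p = np.pad(state, 1, mode="edge")
    h, w = state.shape
    return (p[1:h + 1, 1:w + 1]      # self
            + p[:h, 1:w + 1]          # up
            + p[2:, 1:w + 1]          # down
            + p[1:h + 1, :w]          # left
            + p[1:h + 1, 2:]) / 5.0   # right

rng = np.random.default_rng(0)
pixels = rng.random((16, 16))    # each cell's private observation
state = pixels.copy()
for _ in range(2000):
    state = consensus_step(state)

# All cells now hold (approximately) one agreed-upon value.
print(state.std())  # close to 0: consensus reached
```

No cell ever sees the whole grid, yet the grid as a whole "agrees"; the trained automaton in the figure replaces this fixed averaging rule with a learned neural update over richer per-cell states.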
Figure 6. Neural cellular automata have also been applied to regenerating Minecraft entities. In this work, the authors show that not only can Minecraft buildings and trees regenerate, but so can simple functional machines within the game; worm-like creatures, when cut in half, can even regenerate into two distinct creatures.
Figure 7. Examples of simulated soft-bodied robots in 2D and 3D. Each cell is a separate neural network with local perception that produces local actions, including communication with neighboring cells. Training these systems to perform various locomotion tasks involves not only training the neural networks but also designing and placing the soft cells that form the agent's morphology. Figure from (Horibe et al. 2021).
Figure 8. Traditional reinforcement learning methods train one specific policy for one specific robot with a fixed morphology. Recent work such as (Huang et al. 2020), shown here, instead trains a single modular neural network responsible for controlling a single part of a robot. Each robot's global policy thus emerges from the coordination of identical copies of this modular network. The authors show that such a system generalizes across a variety of skeletal structures, from hoppers to quadrupeds, and even to some unseen morphologies.
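The idea of one shared module controlling every limb can be sketched as follows. A single parameter set is reused by each limb, with a message passed along the body, so the very same policy runs on morphologies with any number of limbs. The weights, sizes, and chain-structured message passing below are illustrative placeholders, not the architecture of Huang et al. (2020).

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, MSG, ACT = 4, 3, 1                          # per-limb sizes (assumed)
W_msg = rng.normal(size=(OBS + MSG, MSG)) * 0.1  # shared message head
W_act = rng.normal(size=(OBS + MSG, ACT)) * 0.1  # shared action head

def policy(limb_obs):
    """limb_obs: list of per-limb observation vectors (root first).
    The same weights are applied at every limb; a message flows
    root -> leaf along a simple chain."""
    msg = np.zeros(MSG)
    actions = []
    for obs in limb_obs:
        inp = np.concatenate([obs, msg])
        msg = np.tanh(inp @ W_msg)      # message to the next limb
        actions.append(np.tanh(inp @ W_act))
    return np.concatenate(actions)

# The very same parameters control a 3-limb and a 6-limb morphology.
print(policy([rng.normal(size=OBS) for _ in range(3)]).shape)  # (3,)
print(policy([rng.normal(size=OBS) for _ in range(6)]).shape)  # (6,)
```

Because nothing in `policy` depends on the number of limbs, a trained version of such a controller can in principle be dropped onto unseen skeletal structures, which is the generalization result the figure describes.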
Figure 9. Self-organization also enables systems in reinforcement learning environments to configure their own design for a given task. In (Pathak et al. 2019), the authors explore such dynamic, modular agents and show that they generalize not only to unseen environments but also to unseen morphologies assembled from additional modules.
Figure 10. Exploiting the properties of self-organization and attention, the authors of (Tang and Ha, 2021) study reinforcement learning agents that treat their observations as an arbitrarily ordered, variable-length list of sensory inputs. For visual tasks such as CarRacing and Atari Pong (Brockman et al. 2016; Tang et al. 2020), they divide the input into a 2D grid of small patches and shuffle their order (left). For continuous control tasks (Freeman et al. 2019), they add many extra redundant noise input channels in shuffled order (right), so the agent must learn to identify which inputs are useful. Each sensory neuron in the system receives one specific input stream, and together they coordinate to perform the task at hand.
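The permutation invariance at the heart of this design can be shown in a few lines: each "sensory neuron" maps its private input to a key and a value, and a fixed set of queries attends over the whole set, so the read-out is identical no matter how the inputs are shuffled. All weights below are random placeholders, not the trained AttentionNeuron parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_K, D_V, N_Q = 1, 8, 8, 4          # illustrative sizes
W_k = rng.normal(size=(D_IN, D_K))        # shared per-neuron key map
W_v = rng.normal(size=(D_IN, D_V))        # shared per-neuron value map
Q = rng.normal(size=(N_Q, D_K))           # fixed queries (order-giving side)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def read_out(inputs):
    """inputs: (n, D_IN) array, one row per sensory neuron, in any order."""
    K, V = inputs @ W_k, inputs @ W_v
    return softmax(Q @ K.T) @ V           # (N_Q, D_V) fixed-size summary

obs = rng.normal(size=(10, D_IN))
shuffled = obs[rng.permutation(10)]
print(np.allclose(read_out(obs), read_out(shuffled)))  # True
```

Shuffling the rows permutes keys and values identically, so the attention-weighted sum is unchanged; this is why the agents in the figure tolerate scrambled patches and variable numbers of input channels.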
Figure 11. MAgent (Zheng et al. 2018) is a suite of environments in which large numbers of pixel agents interact in battles or other competitive scenarios in a grid world. Unlike most platforms, which focus on reinforcement learning research with a single agent or only a few agents, their goal is to support research that scales to millions of agents. The platform's environments are now maintained as part of the open-source PettingZoo library (Terry et al. 2020) for multi-agent reinforcement learning research.
Figure 12. Neural MMO (Suarez et al. 2021) is a platform that simulates populations of agents in procedurally generated virtual worlds to support multi-agent research while keeping the computational requirements accessible. Users select from a set of provided game systems to create environments for their specific research questions, with support for up to one thousand agents on one-square-kilometer maps over several thousand time steps. The project is under active development, with extensive documentation plus logging and visualization tools for researchers. At the time of writing, the platform was to be demonstrated at NeurIPS 2021.
Figure 13. Recent work by Sandler et al. (2021) and Kirsch and Schmidhuber (2020) attempts to generalize the accepted notion of an artificial neural network: each neuron can hold multiple states rather than a single scalar value, and each synapse works in both directions to support both learning and inference. In this figure, (Kirsch and Schmidhuber, 2020) model every synapse with an identical recurrent neural network (RNN, each with its own internal hidden state) and show that the network can be trained simply by running the RNNs forward, without using backpropagation.
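The "every synapse is the same RNN" idea can be sketched in miniature: neurons carry state vectors rather than scalars, and one shared parameter set, applied on every synapse together with that synapse's own hidden state, transports signals between them. The shapes and the tanh update rule below are illustrative assumptions, and no learning rule is implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PRE, N_POST, D = 3, 2, 4          # neuron counts and per-neuron state size
W_shared = rng.normal(size=(2 * D, D)) * 0.3  # one rule for ALL synapses
h = np.zeros((N_PRE, N_POST, D))    # per-synapse hidden state

def synapse_step(pre_states, h):
    """Apply the shared rule on every (pre, post) synapse, update that
    synapse's hidden state, and average incoming messages at each
    post-synaptic neuron."""
    new_h = np.empty_like(h)
    post = np.zeros((N_POST, D))
    for i in range(N_PRE):
        for j in range(N_POST):
            inp = np.concatenate([pre_states[i], h[i, j]])
            new_h[i, j] = np.tanh(inp @ W_shared)  # same W everywhere
            post[j] += new_h[i, j]
    return post / N_PRE, new_h

pre = rng.normal(size=(N_PRE, D))
post, h = synapse_step(pre, h)
print(post.shape)  # (2, 4): vector-valued neuron states, no scalar weights
```

All adaptivity lives in the per-synapse hidden states, so running the network forward also updates it; that is the sense in which such systems can learn without backpropagation.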
5. Discussion
References
Alam M, Samad MD, Vidyaratne L, et al. (2020) Survey on deep neural networks in speech and vision systems. Neurocomputing 417: 302–321.
Baker B, Kanitscheider I, Markov T, et al. (2019) Emergent Tool Use from Multi-Agent Autocurricula. arXiv preprint arXiv:1909.07528.
Bansal T, Pachocki J, Sidor S, et al. (2017) Emergent Complexity via Multi-Agent Competition. arXiv preprint arXiv:1710.03748.
Bhatia J, Jackson H, Tian Y, et al. (2021) Evolution Gym: A large-scale benchmark for evolving soft robots. In: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Brockman G, Cheung V, Pettersson L, et al. (2016) OpenAI Gym. arXiv preprint arXiv:1606.01540.
Brown TB, Mann B, Ryder N, et al. (2020) Language Models Are Few-Shot Learners. arXiv preprint arXiv:2005.14165.
Cheney N, MacCurdy R, Clune J, et al. (2014) Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding. ACM SIGEVOlution 7(1): 11–23.
Chollet F, et al. (2015) Keras.
Chua LO, Roska T (2002) Cellular Neural Networks and Visual Computing: Foundations and Applications. Cambridge University Press.
Chua LO, Yang L (1988a) Cellular neural networks: Applications. IEEE Transactions on Circuits and Systems 35(10): 1273–1290.
Chua LO, Yang L (1988b) Cellular neural networks: Theory. IEEE Transactions on Circuits and Systems 35(10): 1257–1272.
Conway J, et al. (1970) The game of life. Scientific American 223(4): 4.
Daigavane A, Ravindran B, Aggarwal G (2021) Understanding Convolutions on Graphs. Distill. https://distill.pub/2021/understanding-gnns
Deneubourg J-L, Goss S (1989) Collective patterns and decision-making. Ethology Ecology & Evolution 1(4): 295–311.
Deng J, Dong W, Socher R, et al. (2009) ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255.
Dorigo M, Bonabeau E, Theraulaz G (2000) Ant algorithms and stigmergy. Future Generation Computer Systems 16(8): 851–871.
Foerster JN, Assael YM, De Freitas N, et al. (2016) Learning to Communicate with Deep Multi-Agent Reinforcement Learning. arXiv preprint arXiv:1605.06676.
Freeman CD, Metz L, Ha D (2019) Learning to Predict without Looking Ahead: World Models without Forward Prediction.
Gilpin W (2019) Cellular automata as convolutional neural networks. Physical Review E 100(3): 032402.
Goraş L, Chua LO, Leenaerts D (1995) Turing patterns in CNNs. I. Once over lightly. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 42(10): 602–611.
Grattarola D, Livi L, Alippi C (2021) Learning Graph Cellular Automata.
Ha D (2018) Reinforcement Learning for Improving Agent Design.
Ha D (2020) Slime Volleyball Gym Environment. https://github.com/hardmaru/slimevolleygym
Ha D, Schmidhuber J (2018) Recurrent world models facilitate policy evolution. Advances in Neural Information Processing Systems 31: 2451–2463. https://worldmodels.github.io
Hamann H (2018) Swarm Robotics: A Formal Approach. Springer.
He K, Zhang X, Ren S, et al. (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Heiden E, Millard D, Coumans E, et al. (2021) NeuralSim: Augmenting differentiable simulators with neural networks. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).
Hill A, Raffin A, Ernestus M, et al. (2018) Stable Baselines.
Hooker S (2020) The Hardware Lottery. arXiv preprint arXiv:2009.06489.
Horibe K, Walker K, Risi S (2021) Regenerating soft robots through neural cellular automata. In: EuroGP, pp. 36–50.
Huang W, Mordatch I, Pathak D (2020) One policy to control them all: Shared modular policies for agent-agnostic control. In: International Conference on Machine Learning, pp. 4455–4464.
Jabbar A, Li X, Omar B (2021) A survey on generative adversarial networks: Variants, applications, and training. ACM Computing Surveys (CSUR) 54(8): 1–49.
Jaderberg M, Czarnecki WM, Dunning I, et al. (2019) Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science 364(6443): 859–865.
Jenal M (2011) What Ants Can Teach Us about the Market.
Joachimczak M, Suzuki R, Arita T (2016) Artificial metamorphosis: Evolutionary design of transforming, soft-bodied robots. Artificial Life 22(3): 271–298.
Kirsch L, Schmidhuber J (2020) Meta Learning Backpropagation and Improving It. arXiv preprint arXiv:2012.14905.
Kozek T, Roska T, Chua LO (1993) Genetic algorithm for CNN template learning. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 40(6): 392–402.
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25: 1097–1105.
Lajad R, Moreno E, Arenas A (2021) Young honeybees show learned preferences after experiencing adulterated pollen. Scientific Reports 11(1): 1–11.
Leimeister JM (2010) Collective intelligence. Business & Information Systems Engineering 2(4): 245–248.
Lévy P (1997) Collective Intelligence.
Liu J-B, Raza Z, Javaid M (2020) Zagreb connection numbers for cellular neural networks. Discrete Dynamics in Nature and Society 2020: 1–8.
Liu S, Lever G, Merel J, et al. (2019) Emergent Coordination through Competition. arXiv preprint arXiv:1902.07151.
Mataric MJ (1993) Designing emergent behaviors: From local interactions to collective intelligence. In: Proceedings of the Second International Conference on Simulation of Adaptive Behavior, pp. 432–441.
Mnih V, Kavukcuoglu K, Silver D, et al. (2015) Human-level control through deep reinforcement learning. Nature 518(7540): 529–533.
Mordvintsev A, Randazzo E, Niklasson E, et al. (2020) Growing Neural Cellular Automata. Distill.
Ohsawa S, Akuzawa K, Matsushima T, et al. (2018) Neuron as an Agent.
OroojlooyJadid A, Hajinezhad D (2019) A Review of Cooperative Multi-Agent Deep Reinforcement Learning. arXiv preprint arXiv:1908.03963.
Ott J (2020) Giving up Control: Neurons as Reinforcement Learning Agents. arXiv preprint arXiv:2003.11642.
Palm RB, Duque MG, Sudhakaran S, et al. (2022) Variational neural cellular automata. In: International Conference on Learning Representations.
Pathak D, Lu C, Darrell T, et al. (2019) Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity. arXiv preprint arXiv:1902.05546.
Peng Z, Hui KM, Liu C, et al. (2021) Learning to simulate self-driven particles system with coordinated policy optimization. Advances in Neural Information Processing Systems 34.
Pickering A (2010) The Cybernetic Brain. University of Chicago Press.
Qin Y, Feng M, Lu H, et al. (2018) Hierarchical cellular automata for visual saliency. International Journal of Computer Vision 126(7): 751–770.
Qu X, Sun Z, Ong YS, et al. (2020) Minimalistic attacks: How little it takes to fool deep reinforcement learning policies. IEEE Transactions on Cognitive and Developmental Systems 13: 806–817.
Radford A, Kim JW, Hallacy C, et al. (2021) Learning Transferable Visual Models from Natural Language Supervision. arXiv preprint arXiv:2103.00020.
Radford A, Narasimhan K, Salimans T, et al. (2018) Improving Language Understanding by Generative Pre-Training.
Radford A, Wu J, Child R, et al. (2019) Language models are unsupervised multitask learners. OpenAI Blog 1(8): 9.
Randazzo E, Mordvintsev A, Niklasson E, et al. (2020) Self-classifying MNIST Digits. Distill.
Resnick C, Eldridge W, Ha D, et al. (2018) Pommerman: A Multi-Agent Playground. arXiv preprint arXiv:1809.07124.
Rubenstein M, Cornejo A, Nagpal R (2014) Programmable self-assembly in a thousand-robot swarm. Science 345(6198): 795–799.
Rudin N, Hoeller D, Reist P, et al. (2021) Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning. arXiv preprint arXiv:2109.11978.
Sanchez-Lengeling B, Reif E, Pearce A, et al. (2021) A gentle introduction to graph neural networks. Distill 6(9): e33.
Sandler M, Vladymyrov M, Zhmoginov A, et al. (2021) Meta-learning bidirectional update rules. In: International Conference on Machine Learning, pp. 9288–9300.
Sandler M, Zhmoginov A, Luo L, et al. (2020) Image Segmentation via Cellular Automata. arXiv preprint arXiv:2008.04965.
Schilling MA (2000) Toward a general modular systems theory and its application to interfirm product modularity. Academy of Management Review 25(2): 312–334.
Schilling MA, Steensma HK (2001) The use of modular organizational forms: An industry-level analysis. Academy of Management Journal 44(6): 1149–1168.
Schmidhuber J (2014) Who Invented Backpropagation? More[DL2].
Schmidhuber J (2020) Metalearning Machines Learn to Learn (1987). https://people.idsia.ch/juergen/metalearning.html
Schoenholz S, Cubuk ED (2020) JAX MD: A framework for differentiable physics. Advances in Neural Information Processing Systems 33.
Schweitzer F, Farmer JD (2003) Brownian Agents and Active Particles: Collective Dynamics in the Natural and Social Sciences. Springer. Volume 1.
Seeley TD (2010) Honeybee Democracy. Princeton University Press.
Silver D, Huang A, Maddison CJ, et al. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587): 484–489.
Simonyan K, Zisserman A (2014) Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv preprint arXiv:1409.1556.
Stahlberg F (2020) Neural machine translation: A review. Journal of Artificial Intelligence Research 69: 343–418.
Stoy K, Brandt D, Christensen DJ, et al. (2010) Self-Reconfigurable Robots: An Introduction.
Suarez J, Du Y, Isola P, et al. (2019) Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents. arXiv preprint arXiv:1903.00784.
Suarez J, Du Y, Zhu C, et al. (2021) The Neural MMO platform for massively multiagent research. In: Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
Sudhakaran S, Grbic D, Li S, et al. (2021) Growing 3D Artefacts and Functional Machines with Neural Cellular Automata. arXiv preprint arXiv:2103.08737.
Sumpter DJ (2010) Collective Animal Behavior. Princeton University Press.
Surowiecki J (2005) The Wisdom of Crowds. Anchor.
Tan X, Qin T, Soong F, et al. (2021) A Survey on Neural Speech Synthesis. arXiv preprint arXiv:2106.15561.
Tang Y, Ha D (2021) The sensory neuron as a transformer: Permutation-invariant neural networks for reinforcement learning. In: Thirty-Fifth Conference on Neural Information Processing Systems. https://attentionneuron.github.io
Tang Y, Nguyen D, Ha D (2020) Neuroevolution of self-interpretable agents. In: Proceedings of the Genetic and Evolutionary Computation Conference.
Tang Y, Tian Y, Ha D (2022) EvoJAX: Hardware-Accelerated Neuroevolution. arXiv preprint arXiv:2202.05008.
Tapscott D, Williams AD (2008) Wikinomics: How Mass Collaboration Changes Everything. Penguin.
Terry JK, Black B, Jayakumar M, et al. (2020) PettingZoo: Gym for Multi-Agent Reinforcement Learning. arXiv preprint arXiv:2009.14471.
Toner J, Tu Y, Ramaswamy S (2005) Hydrodynamics and phases of flocks. Annals of Physics 318(1): 170–244.
Vinyals O, Babuschkin I, Czarnecki WM, et al. (2019) Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782): 350–354.
Wang T, Liao R, Ba J, et al. (2018) NerveNet: Learning structured policy with graph neural networks. In: International Conference on Learning Representations.
Wang Z, She Q, Ward TE (2021) Generative adversarial networks in computer vision: A survey and taxonomy. ACM Computing Surveys (CSUR) 54(2): 1–38.
Wikipedia (2022) Trajan's Bridge at Alcántara. Wikipedia.
Wolfram S (2002) A New Kind of Science. Champaign, IL: Wolfram Media. Volume 5.
Wu Z, Pan S, Chen F, et al. (2020) A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems 32(1): 4–24.
Zhang D, Choi C, Kim J, et al. (2021) Learning to generate 3D shapes with generative cellular automata. In: International Conference on Learning Representations.
Zheng L, Yang J, Cai H, et al. (2018) MAgent: A many-agent reinforcement learning platform for artificial collective intelligence. Proceedings of the AAAI Conference on Artificial Intelligence 32.