【源头活水】An In-Depth Look at the Successor Representation for Transfer Reinforcement Learning
"Ask how the channel stays so clear: living water keeps flowing in from the source." Studying work at the research frontier and drawing inspiration from other fields sharpens one's understanding of the essence of a research problem, and is an inexhaustible source of self-improvement. To that end, we curate selected paper-reading notes in the "源头活水" column to help you read the research literature broadly and deeply. Stay tuned.
Author's page: https://www.zhihu.com/people/liu-lan-25-44