Can AI predict animal movements? Filling gaps in animal trajectories using inverse reinforcement learning, Ecosphere,
Modeling sensory-motor decisions in natural behavior, PLoS Comp. Biol.
4. Modeling sensory-motor decisions in natural behavior
R. Zhang, S. Zhang, M. H. Tong, Y. Cui, C. A. Rothkopf, D. H. Ballard, M. M. Hayhoe
PLoS Computational Biology, 2018
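8. What is the discount factor?
• If the rewards are bounded, the discounted cumulative reward is also bounded, which makes it easy to handle:
  $|r_t| \le R_{\max} \;\Rightarrow\; \left| \sum_{t=0}^{\infty} \gamma^{t} r_t \right| \le \frac{R_{\max}}{1 - \gamma}$, for $0 \le \gamma < 1$
• Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops [Tanaka et al., 2004]
[Figure: small γ: the robot does not move towards the battery; large γ: the robot tries to catch the battery]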
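The effect of the discount factor can be checked numerically. The sketch below is only an illustration, not taken from the cited papers: the reward sequences `go` and `stay`, the step cost of -1, the battery reward of +10, and the helper names are all hypothetical. It compares the discounted return of walking to the battery against staying put, and checks the bound R_max / (1 − γ).

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * r_t for a finite reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def return_bound(r_max, gamma):
    """Upper bound R_max / (1 - gamma) on the absolute discounted return."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    return r_max / (1.0 - gamma)


# Hypothetical task (an assumption for illustration): three steps that each
# cost -1, then a reward of +10 on reaching the battery. Staying put gives 0.
go = [-1.0, -1.0, -1.0, 10.0]
stay = [0.0, 0.0, 0.0, 0.0]


def prefers_go(gamma):
    """True if moving to the battery has the larger discounted return."""
    return discounted_return(go, gamma) > discounted_return(stay, gamma)


if __name__ == "__main__":
    for gamma in (0.3, 0.9):
        g = discounted_return(go, gamma)
        print(f"gamma={gamma}: return(go)={g:.3f}, "
              f"bound={return_bound(10.0, gamma):.3f}, "
              f"prefers_go={prefers_go(gamma)}")
```

With these assumed numbers, a small γ makes the delayed +10 too heavily discounted to outweigh the immediate step costs, so the agent stays; a large γ makes the trip worthwhile. In both cases the absolute return stays within R_max / (1 − γ).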
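11. Comparison of subjects' behavior and inverse reinforcement learning estimates
• Black lines: subjects' behavior; green lines: generated from the estimated policy; three subjects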
Task1: Follow the path only
Task2: Follow the path and avoid obstacles
Task3: Follow the path and collect the targets
Task4: Follow, avoid, and collect together
16. Can AI predict animal movements? Filling gaps in animal trajectories using inverse reinforcement learning
T. Hirakawa, T. Yamashita, T. Tamaki, H. Fujiyoshi, Y. Umezu, I. Takeuchi, S. Matsumoto, and K. Yoda
Ecosphere, 2018
25. References
• Doya, K. (2008). Modulators of decision making. Nature Neuroscience, 11(4): 410–416.
• Hirakawa, T., Yamashita, T., Tamaki, T., Fujiyoshi, H., Umezu, Y., Takeuchi, I., Matsumoto, S., and Yoda, K. (2018). Can AI predict animal movements? Filling gaps in animal trajectories using inverse reinforcement learning. Ecosphere.
• Tanaka, S. C., Doya, K., Okada, G., Ueda, K., Okamoto, Y., and Yamawaki, S. (2004). Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neuroscience, 7(8): 887–893.
• Zhang, R., Zhang, S., Tong, M. H., Cui, Y., Rothkopf, C. A., Ballard, D. H., and Hayhoe, M. M. (2018). Modeling sensory-motor decisions in natural behavior. PLoS Computational Biology.
• Ziebart, B., et al. (2008). Maximum entropy inverse reinforcement learning. In Proc. of AAAI.