DEEP LEARNING JP
[DL Papers]
http://deeplearning.jp/
"Learning deep mean field games for modeling large population behavior"
or the intersection of machine learning and modeling collective processes
• Title: Learning deep mean field games for modeling large population behavior
• Authors: Jiachen Yang, Xiaojing Ye, Rakshit Trivedi, Huan Xu, Hongyuan Zha
• Georgia Institute of Technology and Georgia State University
• Venue: ICLR 2018 (Oral)
• Scores: 10, 8, 8
• :
• Collective Behavior
• :
• Collective Behavior( )
• Mean Field Games(MFG)
• Pros:
• Cons: (= )
• : Inference of MFG via Markov Decision Process(MDP) Optimization
• MFG(discrete-time graph-state MFG) MDP
• MFG
•
• Twitter VAR, RNN
• :
• Arab Spring, Black Lives Matter movement, fake news, etc.
• 1:
• "Nothing takes place in the world whose meaning is not that of some maximum or minimum." by Euler
• or
• ( )
• https://openreview.net/forum?id=HktK4BeCZ
• cf. ,
• 2: ⇄
•
[Figure: diagram with topic1 and topic2]
•
• MFG(discrete-time graph-state)
• e.g., , etc.
[Figure: diagram with topic1 and topic2]
•
1. ( ⇄ )
2.
3.
• Mean Field Game
1. 2. 3.
Time-Series-Analysis
(e.g., VAR)
Network-Analysis , ?,
Mean Field Game
Mean Field Game (MFG)
•
Ø N-player game, $N \to \infty$
Ø
• e.g.,
•
•
•
• opinion network
• etc. (Guéant+ 2011)
Mean Field Game (MFG)
• MFG (Guéant 2009):
•
• $N \to \infty$
•
• Social Interactions of the mean field type
•
•
Mean Field Game (MFG)
• Social Interactions of the mean field type
•
• N
……
I
( ) Multi Agent Reinforcement Learning (MARL)
• Mean Field Multi-Agent Reinforcement Learning (Yang+ 2018)
• MARL
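Ø MARL, Q-function of agent $j$: $Q_\pi^j(s, \mathbf{a}) = r^j(s, \mathbf{a}) + \gamma \, \mathbb{E}_{s' \sim p(s' \mid s, \mathbf{a})}\left[ v_\pi^j(s') \right]$
Ø $r^j(s, \mathbf{a})$ and $p(s' \mid s, \mathbf{a})$ both depend on the joint action $\mathbf{a}$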
Ø
•
Mean Field Game (MFG)
• MFG : (=- )
ØMFG agnostic
Ø
Ø MFG Toy-Problem
• Contribution:
• MFG Toy-Problem
Discrete-time graph-state MFG
• : Discrete-time graph-state MFG
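• $d$: number of topics
• $\pi_i^t$: fraction of the population at topic $i$ at time $t$
• $P_{ij}^t$: probability of moving from topic $i$ to topic $j$ between time $t$ and $t+1$
• (Mean)
[Figure: two-topic example with topic1 and topic2]
Example: $\pi_1^t = 2/3$, $\pi_2^t = 1/3$, $P_{1,2}^t = 1/6$, $P_{2,1}^t = 2/3$ give $\pi_1^{t+1} = 7/9$ and $\pi_2^{t+1} = 2/9$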
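The two-topic numbers on this slide (2/3 and 1/3 of the population, off-diagonal transition probabilities 1/6 and 2/3) can be checked with a few lines of NumPy. This is a minimal sketch, not code from the paper: the name `step_distribution` is made up here, and the diagonal entries of the transition matrix are filled in on the assumption that every row sums to 1.

```python
import numpy as np

def step_distribution(pi, P):
    """One forward step: pi_next[i] = sum_j P[j, i] * pi[j]."""
    return P.T @ pi

# Off-diagonal entries come from the slide; the diagonal entries are
# inferred from the assumption that each row of P sums to 1.
P = np.array([[5/6, 1/6],
              [2/3, 1/3]])
pi_t = np.array([2/3, 1/3])

pi_next = step_distribution(pi_t, P)
print(pi_next)  # approximately [0.7778, 0.2222], i.e. [7/9, 2/9]
```

The transpose matters: row $i$ of `P` describes where the population at topic $i$ goes, so the inflow into topic $i$ is read from column $i$.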
Discrete-time graph-state MFG
• : Discrete-time graph-state MFG
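• $r_i(\pi^t, P_i^t)$: reward at topic $i$
• $\pi^t = (\pi_i^t)_{i=1}^{d}$, and $P_i^t = (P_{i,1}^t, \dots, P_{i,d}^t)$ is row $i$ of the transition matrix
• $r_i(\pi^t, P^t) = r_i(\pi^t, P_i^t)$ (where $P^t = (P_1^t, \dots, P_d^t)$)
[Figure: topic1 and topic2 diagram annotated with the reward, $\pi^t$, and the transition row]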
Discrete-time graph-state MFG
• MFG
• $V_i^t = \max_{P_i^t} \left[ r_i(\pi^t, P_i^t) + \sum_j P_{ij}^t V_j^{t+1} \right]$ (backward Hamilton-Jacobi-Bellman equation, HJB)
• $\pi_i^{t+1} = \sum_j P_{ji}^t \pi_j^t$ (forward Fokker-Planck equation)
• $V_i^t$: value of topic $i$ at time $t$
• Given $\pi^0$, $V^T$, and $r_i(\pi^t, P_i^t)$, dynamic programming yields the trajectory $(\pi^t, V^t)_{t=0}^{T}$
• $r_i(\pi^t, P_i^t)$
•
Ø HJB: Nash-Maximizer $P_i^t$
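To make the backward HJB equation and the forward Fokker-Planck equation concrete, here is a minimal sketch under a strong simplifying assumption that does not come from the slides or the paper: the reward is taken to be $r_i = \sum_j P_{ij}^t (c_{ij} - \log P_{ij}^t)$ with a fixed, hypothetical matrix `c` that ignores $\pi^t$. For this particular form the maximizer over each row is a softmax and the value is a log-sum-exp, so the backward and forward passes decouple. The function names are illustrative only.

```python
import numpy as np

def backward_hjb(c, V_T, T):
    """Backward pass: returns values V[0..T] and policies P[0..T-1].

    Assumes the hypothetical reward r_i = sum_j P_ij * (c_ij - log P_ij),
    for which the max over row P_i equals logsumexp_j(c_ij + V_j^{t+1}).
    """
    d = len(V_T)
    V = np.zeros((T + 1, d))
    P = np.zeros((T, d, d))
    V[T] = V_T
    for t in range(T - 1, -1, -1):
        a = c + V[t + 1][None, :]             # a[i, j] = c_ij + V_j^{t+1}
        m = a.max(axis=1, keepdims=True)      # subtract row max for stability
        e = np.exp(a - m)
        z = e.sum(axis=1, keepdims=True)
        P[t] = e / z                          # softmax rows: the maximizer
        V[t] = (m + np.log(z)).ravel()        # log-sum-exp: the maximum
    return V, P

def forward_fokker_planck(pi_0, P):
    """Forward pass: pi^{t+1}_i = sum_j P^t_{ji} * pi^t_j."""
    traj = [pi_0]
    for P_t in P:
        traj.append(P_t.T @ traj[-1])
    return np.array(traj)

rng = np.random.default_rng(0)
d, T = 3, 4
c = rng.normal(size=(d, d))                   # hypothetical, not from the paper
V, P = backward_hjb(c, V_T=np.zeros(d), T=T)
pi = forward_fokker_planck(np.full(d, 1 / d), P)
```

When the reward does depend on $\pi^t$, as in the general MFG, the two passes are coupled and can no longer be run one after the other in this simple way.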
Inference on MFG via MDP optimization
•
…
MDP
MFG Trajectory
Inference on MFG via MDP optimization
• MFG MDP
•
• MFG MDP
Ø
Ø MFG Forward-Path
• Settings
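• States: $\pi^n$, the topic distribution at time step $n$ (the time index written $t$ on the earlier slides)
• Actions: $P^n$, the transition matrix at time step $n$
• Dynamics: $\pi_i^{n+1} = \sum_j P_{ji}^n \pi_j^n$
• Reward: $R(\pi^n, P^n) = \sum_{i=1}^{d} \pi_i^n \sum_{j=1}^{d} P_{ij}^n \, r_{ij}(\pi^n, P_i^n)$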
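The States, Actions, Dynamics, and Reward settings above can be sketched as a single transition function. This is an illustration under stated assumptions, not the authors' implementation: `mdp_step` and `reward_fn` are hypothetical names, and the toy reward in the usage lines is invented purely to exercise the code.

```python
import numpy as np

def mdp_step(pi, P, reward_fn):
    """One step of the population-level MDP.

    pi: state, shape (d,), a distribution over topics.
    P: action, shape (d, d), a row-stochastic transition matrix.
    reward_fn(pi, P): returns an array of shape (d, d) holding r_ij.
    Returns (next_state, reward) with
      next_state[i] = sum_j P[j, i] * pi[j]
      reward        = sum_i pi[i] * sum_j P[i, j] * r[i, j]
    """
    r = reward_fn(pi, P)
    reward = float(pi @ (P * r).sum(axis=1))
    return P.T @ pi, reward

# Toy reward (an assumption for illustration): moving to topic j pays pi[j].
def toy_reward(pi, P):
    return np.tile(pi, (len(pi), 1))

pi = np.array([2/3, 1/3])
P = np.array([[5/6, 1/6],
              [2/3, 1/3]])
next_pi, R = mdp_step(pi, P, toy_reward)
```

Because the state is the whole distribution and the action is the whole transition matrix, a single agent acting in this MDP stands in for the entire population.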
Inference on MFG via MDP optimization
• : MDP
MFG HJB, Fokker-Planck
HJB
Fokker-Planck
!
Nash-Maximizer!"
Inference on MFG via MDP optimization
•
1. ( ⇄ )
2.
3.
• MFG MDP
Ø single-agent RL: $V^*(\pi^n) = \max_{P^n} \left[ R(\pi^n, P^n) + V^*(\pi^{n+1}) \right]$
Ø ⇄
ØMDP
Experiments
• : Twitter
• d = 15 topics
• n_timesteps = 16 (16 time steps per episode)
• n_episodes = 27
• Guided Cost Learning (Finn+ 2016)
• Forward-Path
•
• Deep
• : Vector Autoregression(VAR), RNN
Experiments
• state-action
•
S0:
A0:
S2:
Experiments
• Jensen-Shannon-Divergence
VAR, RNN
• MFG
( ⇄ , )
•
MFG RNN
• RNN
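For reference, the Jensen-Shannon divergence used as the evaluation metric can be computed as below. This is the standard textbook definition with natural logarithms; the exact evaluation protocol of the paper (which distributions are compared, and how results are averaged) is not reproduced here, and the function name is an assumption.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()                       # normalize defensively
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                      # 0 * log(0 / b) is taken as 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

predicted = [0.7, 0.2, 0.1]               # made-up topic distributions
observed = [0.6, 0.3, 0.1]
score = js_divergence(predicted, observed)
```

The divergence is symmetric, equals 0 for identical distributions, and is bounded above by $\log 2$, which makes scores comparable across time steps.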
Experiments
•
• ( )
Conclusion
•
• MFG MDP
• MFG Toy-Problem
•
• $r_i(\pi^t, P^t) = r_i(\pi^t, P_i^t)$
• ( )
• Network-Based Social Dynamics Model
•
• MFG VAR
or
•
References
• Guéant, O. (2009). A reference case for mean field games models. Journal de Mathématiques Pures et Appliquées, 92, 276-294. doi:10.1016/j.matpur.2009.04.008
• Guéant, O., Lasry, J.-M., Lions, P.-L. (2011). Mean Field Games and Applications. In: Paris-Princeton Lectures on Mathematical Finance 2010. Lecture Notes in Mathematics, vol. 2003. Springer, Berlin, Heidelberg.
• Finn, C., Levine, S., Abbeel, P. (2016). Guided cost learning: Deep inverse optimal control via policy optimization. In International Conference on Machine Learning, pp. 49–58.
• Yang, Y., Luo, R., Li, M., Zhou, M., Zhang, W., Wang, J. (2018). Mean Field Multi-Agent Reinforcement Learning. arXiv.
• MFG :
• https://link.springer.com/content/pdf/10.1007%2Fs11537-007-0657-8.pdf
• MFG :
• https://www.sciencedirect.com/science/article/pii/S002178240900138X
• :
• https://terrytao.wordpress.com/2010/01/07/mean-field-equations/
• The causal mechanism for such waves is somewhat strange, though, due to the presence of the
backward propagating equation – in some sense, the wave continues to propagate because the
audience members expect it to continue to propagate, and act accordingly. (One wonders if these
sorts of equations could provide a model for things like asset price bubbles, which seem to be
governed by a similar mechanism.)

Weitere ähnliche Inhalte

Ähnlich wie [DL輪読会]Learning Deep Mean Field Games for Modeling Large Population Behavior

機械学習モデルの判断根拠の説明
機械学習モデルの判断根拠の説明機械学習モデルの判断根拠の説明
機械学習モデルの判断根拠の説明Satoshi Hara
 
K02-salen: Systems Thinking in Action 2011
K02-salen: Systems Thinking in Action 2011K02-salen: Systems Thinking in Action 2011
K02-salen: Systems Thinking in Action 2011pegasuscomm
 
ブラックボックス最適化とその応用
ブラックボックス最適化とその応用ブラックボックス最適化とその応用
ブラックボックス最適化とその応用gree_tech
 
Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics Jiang Zhu
 
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningYuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningAI Frontiers
 
[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving
[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving
[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous DrivingKiho Suh
 
Keynote at the 2018 SIGGRAPH Conference on Motion, Interaction and Games
Keynote at the 2018 SIGGRAPH Conference on Motion, Interaction and GamesKeynote at the 2018 SIGGRAPH Conference on Motion, Interaction and Games
Keynote at the 2018 SIGGRAPH Conference on Motion, Interaction and GamesRogelio E. Cardona-Rivera
 
Machine learning on Go Code
Machine learning on Go CodeMachine learning on Go Code
Machine learning on Go Codesource{d}
 
Learning from games : Dr Joanne O'Mara
Learning from games : Dr Joanne O'MaraLearning from games : Dr Joanne O'Mara
Learning from games : Dr Joanne O'MaraPublicLibraryServices
 
論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion
論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion
論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph CompletionNaomi Shiraishi
 
從行動廣告大數據觀點談 Big data 20150916
從行動廣告大數據觀點談 Big data   20150916從行動廣告大數據觀點談 Big data   20150916
從行動廣告大數據觀點談 Big data 20150916Craig Chao
 
Notes on Reinforcement Learning - v0.1
Notes on Reinforcement Learning - v0.1Notes on Reinforcement Learning - v0.1
Notes on Reinforcement Learning - v0.1Joo-Haeng Lee
 
Learning and Modern Programming Languages
Learning and Modern Programming LanguagesLearning and Modern Programming Languages
Learning and Modern Programming LanguagesRay Toal
 
PyDX Presentation about Python, GeoData and Maps
PyDX Presentation about Python, GeoData and MapsPyDX Presentation about Python, GeoData and Maps
PyDX Presentation about Python, GeoData and MapsHannes Hapke
 
Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)Ed Chi
 
Mathematics and technology(2)
Mathematics and technology(2)Mathematics and technology(2)
Mathematics and technology(2)Jonathan Martin
 
Understanding and improving games through machine learning - Natasha Latysheva
Understanding and improving games through machine learning - Natasha LatyshevaUnderstanding and improving games through machine learning - Natasha Latysheva
Understanding and improving games through machine learning - Natasha LatyshevaLauren Cormack
 
Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)Alexey Grigorev
 
Haskell in the Real World
Haskell in the Real WorldHaskell in the Real World
Haskell in the Real Worldosfameron
 
Introduction to machine_learning
Introduction to machine_learningIntroduction to machine_learning
Introduction to machine_learningKiran Lonikar
 

Ähnlich wie [DL輪読会]Learning Deep Mean Field Games for Modeling Large Population Behavior (20)

機械学習モデルの判断根拠の説明
機械学習モデルの判断根拠の説明機械学習モデルの判断根拠の説明
機械学習モデルの判断根拠の説明
 
K02-salen: Systems Thinking in Action 2011
K02-salen: Systems Thinking in Action 2011K02-salen: Systems Thinking in Action 2011
K02-salen: Systems Thinking in Action 2011
 
ブラックボックス最適化とその応用
ブラックボックス最適化とその応用ブラックボックス最適化とその応用
ブラックボックス最適化とその応用
 
Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics
 
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningYuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
 
[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving
[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving
[한국어] Safe Multi-Agent Reinforcement Learning for Autonomous Driving
 
Keynote at the 2018 SIGGRAPH Conference on Motion, Interaction and Games
Keynote at the 2018 SIGGRAPH Conference on Motion, Interaction and GamesKeynote at the 2018 SIGGRAPH Conference on Motion, Interaction and Games
Keynote at the 2018 SIGGRAPH Conference on Motion, Interaction and Games
 
Machine learning on Go Code
Machine learning on Go CodeMachine learning on Go Code
Machine learning on Go Code
 
Learning from games : Dr Joanne O'Mara
Learning from games : Dr Joanne O'MaraLearning from games : Dr Joanne O'Mara
Learning from games : Dr Joanne O'Mara
 
論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion
論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion
論文紹介:Graph Pattern Entity Ranking Model for Knowledge Graph Completion
 
從行動廣告大數據觀點談 Big data 20150916
從行動廣告大數據觀點談 Big data   20150916從行動廣告大數據觀點談 Big data   20150916
從行動廣告大數據觀點談 Big data 20150916
 
Notes on Reinforcement Learning - v0.1
Notes on Reinforcement Learning - v0.1Notes on Reinforcement Learning - v0.1
Notes on Reinforcement Learning - v0.1
 
Learning and Modern Programming Languages
Learning and Modern Programming LanguagesLearning and Modern Programming Languages
Learning and Modern Programming Languages
 
PyDX Presentation about Python, GeoData and Maps
PyDX Presentation about Python, GeoData and MapsPyDX Presentation about Python, GeoData and Maps
PyDX Presentation about Python, GeoData and Maps
 
Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)Location and Language in Social Media (Stanford Mobi Social Invited Talk)
Location and Language in Social Media (Stanford Mobi Social Invited Talk)
 
Mathematics and technology(2)
Mathematics and technology(2)Mathematics and technology(2)
Mathematics and technology(2)
 
Understanding and improving games through machine learning - Natasha Latysheva
Understanding and improving games through machine learning - Natasha LatyshevaUnderstanding and improving games through machine learning - Natasha Latysheva
Understanding and improving games through machine learning - Natasha Latysheva
 
Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)
 
Haskell in the Real World
Haskell in the Real WorldHaskell in the Real World
Haskell in the Real World
 
Introduction to machine_learning
Introduction to machine_learningIntroduction to machine_learning
Introduction to machine_learning
 

Mehr von Deep Learning JP

【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving PlannersDeep Learning JP
 
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについてDeep Learning JP
 
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...Deep Learning JP
 
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-ResolutionDeep Learning JP
 
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxivDeep Learning JP
 
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLMDeep Learning JP
 
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo... 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...Deep Learning JP
 
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place RecognitionDeep Learning JP
 
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?Deep Learning JP
 
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究についてDeep Learning JP
 
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )Deep Learning JP
 
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...Deep Learning JP
 
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"Deep Learning JP
 
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "Deep Learning JP
 
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat ModelsDeep Learning JP
 
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"Deep Learning JP
 
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...Deep Learning JP
 
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...Deep Learning JP
 
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...Deep Learning JP
 
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...Deep Learning JP
 

Mehr von Deep Learning JP (20)

【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
 
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
 
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
 
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
 
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
 
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM
 
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo... 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
 
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
 
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
 
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
 
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
 
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
 
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
 
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
 
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
 
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
 
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
 
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
 
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
 
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
 

Kürzlich hochgeladen

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 

Kürzlich hochgeladen (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs

[DL Reading Group] Learning Deep Mean Field Games for Modeling Large Population Behavior

  • 1. DEEP LEARNING JP [DL Papers] http://deeplearning.jp/ "Learning deep mean field games for modeling large population behavior" or the intersection of machine learning and modeling collective processes
  • 2. • Title: Learning deep mean field games for modeling large population behavior • Authors: Jiachen Yang, Xiaojing Ye, Rakshit Trivedi, Huan Xu, Hongyuan Zha • Georgia Institute of Technology and Georgia State University • Venue: ICLR 2018 (Oral) • Scores: 10, 8, 8 • : • Collective Behavior
  • 3. • : • Collective Behavior( ) • Mean Field Games(MFG) • Pros: • Cons: (= ) • : Inference of MFG via Markov Decision Process(MDP) Optimization • MFG(discrete-time graph-state MFG) MDP • MFG • • Twitter VAR, RNN
  • 4. • : • Arab Spring, Black Lives Matter movement, fake news, etc. • 1: • "Nothing takes place in the world whose meaning is not that of some maximum or minimum." by Euler • or • ( ) • https://openreview.net/forum?id=HktK4BeCZ • cf. ,
  • 5. • 2: ⇄ • topic1 topic2 topic1 topic2 : topic1
  • 6. • • MFG(discrete-time graph-state) • e.g., , etc. topic1 topic2 topic1 topic2 : topic1
  • 7. • 1. ( ⇄ ) 2. 3. • Mean Field Game 1. 2. 3. Time-Series-Analysis (e.g., VAR) Network-Analysis , ?, Mean Field Game
  • 8. Mean Field Game (MFG) • ➢ N-player, $N \to \infty$ ➢ • e.g., • • • • opinion network • etc. ( : Guéant+ 2011)
  • 9. Mean Field Game (MFG) • MFG (Guéant 2009): • • $N \to \infty$ • • Social Interactions of the mean field type • •
  • 10. Mean Field Game (MFG) • Social Interactions of the mean field type [Figure: illustration of mean-field-type social interactions]
  • 11. ( ) Multi Agent Reinforcement Learning (MARL) • Mean Field Multi-Agent Reinforcement Learning (Yang+ 2018) • MARL ➢ MARL (agent j): $Q^j_\pi(s, a) = r^j(s, a) + \gamma \, \mathbb{E}_{s' \sim p(s' \mid a, s)}[v^j_\pi(s')]$ ➢ $r^j(s, a)$, $p(s' \mid a, s)$ $a$ ➢ •
  • 12. Mean Field Game (MFG) • MFG : (=- ) ➢ MFG agnostic ➢ ➢ MFG Toy-Problem • Contribution: • MFG Toy-Problem
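  • 13. Discrete-time graph-state MFG • : Discrete-time graph-state MFG • d • $\pi_i^n$: fraction of the population in state (topic) i at time step n • $P_{ij}^n$: probability of moving from state i to state j between time steps n and n+1 • (Mean) topic1 topic2 topic1 topic2 : $\pi_1^n = \tfrac{2}{3}$, $\pi_2^n = \tfrac{1}{3}$, $\pi_1^{n+1} = \tfrac{7}{9}$, $\pi_2^{n+1} = \tfrac{2}{9}$, $P_{1,2}^n = \tfrac{1}{6}$, $P_{2,1}^n = \tfrac{2}{3}$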
  • 14. Discrete-time graph-state MFG • : Discrete-time graph-state MFG • $r_i(\pi^n, P_i^n)$: • $\pi^n = (\pi_i^n)_{i=1}^{d}$, $P_i^n = (P_{i,1}^n, \dots, P_{i,d}^n)$ i • $r_i(\pi^n, P^n) = r_i(\pi^n, P_i^n)$ (where $P^n = (P_1^n, \dots, P_d^n)$) topic1 topic2 topic1 topic2 : $r_1(\pi^n, P_1^n)$ $\pi^n$ 2 $P_1^n$ 2 ( ⇄ )
  • 15. Discrete-time graph-state MFG • MFG • $V_i^n = \max_{P_i^n} \left[ r_i(\pi^n, P_i^n) + \sum_j P_{ij}^n V_j^{n+1} \right]$ (backward Hamilton-Jacobi-Bellman equation, HJB) • $\pi_i^{n+1} = \sum_j P_{ji}^n \pi_j^n$ (forward Fokker-Planck equation) • $V_i^n$: value of state i at time step n ( ) • $\pi^0$, $V^N$, $r_i(\pi^n, P_i^n)$ Dynamic Programming Trajectory $(\pi^n, V^n)_{n=0}^{N}$ • $r_i(\pi^n, P_i^n)$ • ➢ HJB: Nash-Maximizer $P_i^n$
  • 16. Inference on MFG via MDP optimization • … MDP MFG Trajectory
  • 17. Inference on MFG via MDP optimization • MFG MDP • • MFG MDP ➢ ➢ MFG Forward-Path • Settings • States: $\pi^n$, n • Actions: $P^n$, n • Dynamics: $\pi_i^{n+1} = \sum_j P_{ji}^n \pi_j^n$ • Reward: $R(\pi^n, P^n) = \sum_{i=1}^{d} \pi_i^n \sum_{j=1}^{d} P_{ij}^n \, r_{ij}(\pi^n, P_i^n)$
  • 18. Inference on MFG via MDP optimization • : MDP MFG HJB, Fokker-Planck HJB Fokker-Planck ! Nash-Maximizer!"
  • 19. Inference on MFG via MDP optimization • 1. ( ⇄ ) 2. 3. • MFG MDP ➢ single-agent RL: $V^*(\pi^n) = \max_{P^n} \left[ R(\pi^n, P^n) + V^*(\pi^{n+1}) \right]$ ➢ ⇄ ➢ MDP
  • 20. Experiments • : Twitter • d = 15 topics 15 ( ( , etc.) ) • n_timesteps=16, 16 1episode • n_episodes = 27, 27 • Guided Cost Learning (Finn+ 2016) • Forward-Path • • Deep • : Vector Autoregression(VAR), RNN
  • 22. Experiments • Jensen-Shannon-Divergence VAR, RNN • MFG ( ⇄ , ) • MFG RNN • RNN
  • 24. Conclusion • • MFG MDP • MFG Toy-Problem • • $r_i(\pi^n, P^n) = r_i(\pi^n, P_i^n)$ • ( ) • Network-Based Social Dynamics Model •
  • 26. References • Guéant, O. (2009). A reference case for mean field games models. Journal de Mathématiques Pures et Appliquées, 92, 276-294. doi:10.1016/j.matpur.2009.04.008. • Guéant, O., Lasry, J.-M., and Lions, P.-L. (2011). Mean Field Games and Applications. In: Paris-Princeton Lectures on Mathematical Finance 2010. Lecture Notes in Mathematics, vol 2003. Springer, Berlin, Heidelberg. • Finn, C., Levine, S., and Abbeel, P. (2016). Guided cost learning: Deep inverse optimal control via policy optimization. In International Conference on Machine Learning, pp. 49–58. • Yang, Y., Luo, R., Li, M., Zhou, M., Zhang, W., and Wang, J. (2018). Mean Field Multi-Agent Reinforcement Learning. arXiv.
  • 27. • MFG : • https://link.springer.com/content/pdf/10.1007%2Fs11537-007-0657-8.pdf • MFG : • https://www.sciencedirect.com/science/article/pii/S002178240900138X • : • https://terrytao.wordpress.com/2010/01/07/mean-field-equations/ • The causal mechanism for such waves is somewhat strange, though, due to the presence of the backward propagating equation – in some sense, the wave continues to propagate because the audience members expect it to continue to propagate, and act accordingly. (One wonders if these sorts of equations could provide a model for things like asset price bubbles, which seem to be governed by a similar mechanism.)
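
As a small numerical check of the forward (Fokker-Planck) update stated on slides 15 and 17, the sketch below applies one step to the two-topic example from slide 13. The function name `forward_step` is hypothetical, and the diagonal entries of the transition matrix (5/6 and 1/3) are not given on the slide; they are assumed here from the requirement that each row of the matrix sums to 1.

```python
from fractions import Fraction as F


def forward_step(pi, P):
    """One forward step: pi_next[i] = sum_j P[j][i] * pi[j].

    pi is the population distribution over topics at step n.
    P[j][i] is the probability of moving from topic j to topic i.
    """
    d = len(pi)
    return [sum(P[j][i] * pi[j] for j in range(d)) for i in range(d)]


# Values from slide 13: pi_1 = 2/3, pi_2 = 1/3, P_12 = 1/6, P_21 = 2/3.
pi_n = [F(2, 3), F(1, 3)]

# Assumption: diagonal entries are filled in so that each row sums to 1.
P_n = [
    [F(5, 6), F(1, 6)],  # from topic 1: stay, move to topic 2
    [F(2, 3), F(1, 3)],  # from topic 2: move to topic 1, stay
]

pi_next = forward_step(pi_n, P_n)
print(pi_next)  # [Fraction(7, 9), Fraction(2, 9)]
```

Exact fractions are used instead of floats so the result can be compared directly with the 7/9 and 2/9 shown on the slide.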