Optimal Policies for Convex Symmetric Stochastic Dynamic Teams and their Mean-Field Limit.
Sina Sanjari, Serdar Yüksel
Published in: SIAM J. Control. Optim. (2021)
Keyphrases
- optimal policy
- stochastic dynamic
- Markov decision processes
- decision problems
- dynamic programming
- reinforcement learning
- state space
- finite state
- finite horizon
- average reward
- state dependent
- infinite horizon
- Markov random field
- multistage
- dynamic programming algorithms
- long run
- sufficient conditions
- average reward reinforcement learning
- multi agent
- average cost
- free energy
- Bayesian reinforcement learning
- Markov decision process
- policy iteration
- control policies
- expected reward
- total reward
- initial state
- serial inventory systems
- Markov decision problems
- lost sales
- cost function