Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs.
Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Basar
Published in: ICML (2021)
Keyphrases
- non stationary
- model free reinforcement learning
- reinforcement learning
- markov decision processes
- finite horizon
- policy gradient
- state space
- optimal policy
- function approximation
- reinforcement learning algorithms
- random fields
- adaptive algorithms
- policy iteration
- machine learning
- partially observable
- partially observable markov decision processes
- average reward
- dynamic programming
- action space
- learning algorithm
- markov decision process
- initial state
- multi agent
- average cost
- reward function
- temporal difference
- model free
- infinite horizon
- empirical mode decomposition
- stochastic games
- finite state
- optimal control