Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning.
Erwan Lecarpentier, Emmanuel Rachelson
Published in: CoRR (2019)
Keyphrases
- model based reinforcement learning
- non stationary
- markov decision processes
- state space
- finite state
- optimal policy
- reinforcement learning
- dynamic programming
- lower bound
- reinforcement learning algorithms
- policy iteration
- infinite horizon
- average cost
- planning under uncertainty
- finite horizon
- partially observable
- decision processes
- markov decision process
- np hard
- least squares
- reward function
- action space
- planning problems
- machine learning