An Adaptive State Aggregation Algorithm for Markov Decision Processes.
Guanting Chen, Johann Demetrio Gaebler, Matt Peng, Chunlin Sun, Yinyu Ye
Published in: CoRR (2021)
Keyphrases
- Markov decision processes
- dynamic programming
- state space
- policy iteration
- computational complexity
- average reward
- model-based reinforcement learning
- state abstraction
- linear programming
- state variables
- Monte Carlo
- real-time dynamic programming
- model-free
- finite state
- learning algorithm
- least squares
- NP-hard
- search space
- objective function