MICo: Learning improved representations via sampling-based state similarity for Markov decision processes.
Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland
Published in: CoRR (2021)
Keyphrases
- markov decision processes
- reinforcement learning
- state space
- real time dynamic programming
- stochastic games
- model based reinforcement learning
- partially observable
- state abstraction
- learning tasks
- optimal policy
- finite state
- state variables
- infinite horizon
- markov decision process
- factored mdps
- supervised learning
- reinforcement learning algorithms
- continuous state spaces
- planning under uncertainty
- average reward
- markov chain
- dynamic programming