Predictable MDP Abstraction for Unsupervised Model-Based RL.
Seohong Park, Sergey Levine
Published in: ICML (2023)
Keyphrases
- markov chain
- state space
- markov decision processes
- finite state
- reinforcement learning
- markov decision process
- optimal policy
- decision theoretic planning
- state abstraction
- model free
- fully unsupervised
- average reward
- action space
- data driven
- reinforcement learning algorithms
- policy iteration
- high level abstractions
- reward function
- supervised learning
- state and action spaces
- utility function
- heuristic search
- admissible heuristics
- high level
- markov decision problems
- partially observable
- function approximation
- unsupervised learning
- semi supervised
- search algorithm
- exploration strategy
- discounted reward
- unsupervised manner
- learning agent
- real valued
- average cost
- total reward
- bayesian reinforcement learning