Harnessing Structures for Value-Based Planning and Reinforcement Learning.
Yuzhe Yang, Guo Zhang, Zhi Xu, Dina Katabi
Published in: ICLR (2020)
Keyphrases
- reinforcement learning
- action selection
- function approximation
- planning problems
- stochastic domains
- goal oriented
- learning algorithm
- Markov decision processes
- temporal difference
- decision theoretic
- domain independent
- optimal policy
- state space
- model free
- multi agent reinforcement learning
- macro actions
- reinforcement learning methods
- temporal difference learning
- deterministic domains
- partially observable Markov decision processes
- reinforcement learning algorithms
- AI planning
- transfer learning
- heuristic search
- multi agent
- decision making