Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL.
Charles Packer, Pieter Abbeel, Joseph E. Gonzalez
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- compressed sensing
- policy gradient
- markov decision processes
- sparse data
- action selection
- state space
- meta level
- high dimensional
- sparse representation
- average reward
- dynamic programming
- learning agents
- discounted reward
- state action
- control policy
- learning agent
- neural network
- reinforcement learning algorithms
- model free
- long run
- learning classifier systems
- transfer learning
- sufficient conditions
- machine learning