Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL.
Charles Packer, Pieter Abbeel, Joseph E. Gonzalez
Published in: NeurIPS (2021)
Keyphrases
- reinforcement learning
- sparse data
- function approximation
- partially observable domains
- average reward
- Markov decision processes
- model free
- inverse reinforcement learning
- policy gradient
- learning agent
- compressive sensing
- action selection
- long run
- optimal policy
- reward function
- sparse coding
- reinforcement learning algorithms
- user experience
- sufficient conditions
- state space
- active learning
- eligibility traces
- high dimensional