Long-Term Visitation Value for Deep Exploration in Sparse Reward Reinforcement Learning.
Simone Parisi, Davide Tateo, Maximilian Hensel, Carlo D'Eramo, Jan Peters, Joni Pajarinen
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- long term
- exploration strategy
- short term
- exploration exploitation
- action selection
- reinforcement learning algorithms
- function approximation
- Markov decision processes
- active exploration
- state space
- sparse data
- eligibility traces
- average reward
- model based reinforcement learning
- balancing exploration and exploitation
- high dimensional
- dynamic programming
- temporal difference
- multi agent
- partially observable environments
- optimal control
- model free
- learning algorithm
- compressive sensing
- inverse reinforcement learning
- autonomous learning
- reward function
- transfer learning
- neural network
- reward shaping
- policy search
- exploration exploitation tradeoff
- state action
- learning agent
- sparse coding
- supervised learning
- learning process
- function approximators
- compressed sensing
- Markov decision process
- dynamical systems
- sufficient conditions
- robotic control
- active learning
- machine learning