Reward-Free Exploration for Reinforcement Learning.
Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- exploration strategy
- exploration exploitation
- action selection
- function approximation
- bandit problems
- active exploration
- reward function
- model free
- eligibility traces
- state space
- reinforcement learning algorithms
- learning algorithm
- multi agent
- balancing exploration and exploitation
- Markov decision processes
- model based reinforcement learning
- optimal policy
- exploration exploitation tradeoff
- temporal difference
- transfer learning
- machine learning
- state action
- partially observable environments
- supervised learning
- learning agent
- optimal control
- policy gradient
- learning classifier systems
- mobile robot
- average reward
- Markov decision process
- partially observable
- learning problems