Provably Good Batch Reinforcement Learning Without Great Exploration.
Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- active exploration
- exploration strategy
- action selection
- model-based reinforcement learning
- batch mode
- exploration exploitation
- autonomous learning
- function approximation
- state space
- reinforcement learning algorithms
- Markov decision processes
- machine learning
- optimal policy
- balancing exploration and exploitation
- batch learning
- exploration exploitation tradeoff
- worst case
- temporal difference
- batch size
- learning algorithm
- batch processing
- temporal difference learning
- model-free
- incremental learning
- dynamic programming
- objective function