Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation.
Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu
Published in: COLT (2022)
Keyphrases
- reinforcement learning
- temporal difference
- state space
- temporal difference learning
- function approximation
- approximate dynamic programming
- function approximators
- real time
- model free
- state action
- reinforcement learning algorithms
- action selection
- basis functions
- markov games
- markov decision processes
- optimal policy
- optimal control
- linear combination
- supervised learning
- partially observable
- control problems
- robot control
- dynamic programming
- active learning
- machine learning
- reinforcement learning methods
- stochastic approximation
- neural network
- database