Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation.
Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- temporal difference
- state space
- temporal difference learning
- approximate dynamic programming
- function approximation
- state action
- reinforcement learning algorithms
- function approximators
- evaluation function
- Markov games
- basis functions
- optimal policy
- real time
- model free
- learning algorithm
- case study
- database
- multi-agent reinforcement learning
- decision making
- dynamic programming
- Monte Carlo
- Markov decision processes
- fixed point
- multi-agent
- policy iteration
- linear program
- transfer learning
- continuous state
- policy gradient
- machine learning
- learning problems
- transition model