When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Andrea Zanette
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- function approximation
- model free
- multi agent
- state space
- machine learning
- learning algorithm
- reinforcement learning algorithms
- markov decision processes
- optimal policy
- direct policy search
- learning agents
- learning capabilities
- optimal control
- supervised learning
- real world
- database
- dynamical systems
- markov chain
- learning problems
- dynamic programming
- learning process
- search algorithm
- robot control
- neural network
- policy search
- perceptual aliasing
- robotic control
- real time