Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning.
Edoardo Cetin, Oya Çeliktutan
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- learning process
- learning algorithm
- learning problems
- learning systems
- temporal difference learning
- learning agents
- supervised learning
- computationally efficient
- online learning
- function approximation
- highly efficient
- unsupervised learning
- mobile learning
- learning capabilities
- prior knowledge
- training set