Off-Policy Deep Reinforcement Learning without Exploration.
Scott Fujimoto, David Meger, Doina Precup
Published in: ICML (2019)
Keyphrases
- reinforcement learning
- active exploration
- exploration strategy
- action selection
- model based reinforcement learning
- exploration exploitation
- autonomous learning
- exploration exploitation tradeoff
- state space
- Markov decision processes
- function approximation
- learning algorithm
- multi agent
- active learning
- model free
- deep learning
- data sets
- learning process
- genetic algorithm
- machine learning
- transfer learning
- temporal difference
- reinforcement learning algorithms
- partially observable
- robot control
- optimal policy
- control policy
- unknown environments
- sufficient conditions
- multi agent reinforcement learning
- decision trees
- information systems