Off-Policy Deep Reinforcement Learning without Exploration.
Scott Fujimoto, David Meger, Doina Precup
Published in: CoRR (2018)
Keyphrases
- reinforcement learning
- active exploration
- action selection
- exploration strategy
- model based reinforcement learning
- exploration exploitation
- function approximation
- reinforcement learning algorithms
- autonomous learning
- state space
- markov decision processes
- model free
- exploration exploitation tradeoff
- multi agent reinforcement learning
- data sets
- dynamic programming
- multi agent
- case study
- machine learning
- learning algorithm
- temporal difference learning
- neural network
- robotic control
- information retrieval
- policy search
- transition model
- information systems
- supervised learning
- function approximators
- learning capabilities
- decision trees
- probabilistic model
- transfer learning
- optimal policy