DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction.
Aviral Kumar, Abhishek Gupta, Sergey Levine
Published in: NeurIPS (2020)
Keyphrases
- reinforcement learning
- data distribution
- Markov decision process
- model-free
- function approximation
- Markov decision processes
- state space
- optimal policy
- probability distribution
- transfer learning
- learning process
- multi-agent
- spatial distribution
- learning classifier systems
- decision trees
- reinforcement learning algorithms
- information systems