Conservative Offline Distributional Reinforcement Learning.
Yecheng Jason Ma, Dinesh Jayaraman, Osbert Bastani
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- optimal policy
- temporal difference
- markov decision processes
- real time
- multi-agent
- learning process
- machine learning
- multi-agent reinforcement learning
- reinforcement learning methods
- transfer learning
- learning problems
- action selection
- co-occurrence
- learning capabilities
- markov decision process
- learning agent
- bayesian networks
- decision making
- direct policy search