Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage.
Masatoshi Uehara, Wen Sun
Published in: ICLR (2022)
Keyphrases
- reinforcement learning
- model free
- real time
- function approximation
- reinforcement learning algorithms
- machine learning
- multi agent reinforcement learning
- markov decision processes
- supervised learning
- data driven
- dynamic programming
- temporal difference
- action selection
- multi agent
- optimal policy
- partial information
- partially observable
- markov decision process
- robotic control
- transfer learning
- learning process
- case study
- website
- learning algorithm
- data mining