RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning.
Marc Rigter, Bruno Lacerda, Nick Hawes
Published in: NeurIPS (2022)
Keyphrases
- reinforcement learning
- model free
- multi agent
- function approximation
- reinforcement learning algorithms
- markov decision processes
- state space
- rl algorithms
- temporal difference
- continuous state
- autonomous learning
- learning agents
- computationally efficient
- dynamic programming
- direct policy search
- multi agent reinforcement learning
- temporal difference learning
- real time
- control problems
- partially observable
- optimal policy
- control policy
- action selection
- real valued
- partially observable domains
- machine learning