MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations.
Anqi Li, Byron Boots, Ching-An Cheng
Published in: CoRR (2023)
Keyphrases
- imitation learning
- reinforcement learning
- reinforcement learning methods
- function approximation
- state space
- reinforcement learning algorithms
- control problems
- model free
- learning algorithm
- dynamic programming
- multi agent
- optimal policy
- markov decision processes
- supervised learning
- transfer learning
- real time
- maximum margin
- hidden state
- learning problems
- maximum likelihood
- optimal control
- learning process
- dynamical systems
- temporal difference
- markov decision process