MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations.

Anqi Li, Byron Boots, Ching-An Cheng

Published in: CoRR (2023)

Keyphrases

imitation learning
reinforcement learning
reinforcement learning methods
function approximation
state space
reinforcement learning algorithms
control problems
model free
learning algorithm
dynamic programming
multi agent
optimal policy
Markov decision processes
supervised learning
transfer learning
real time
maximum margin
hidden state
learning problems
maximum likelihood
optimal control
learning process
dynamical systems
temporal difference
Markov decision process