Improved Policy Optimization for Online Imitation Learning.

Jonathan Wilder Lavington, Sharan Vaswani, Mark Schmidt
Published in: CoRR (2022)
Keyphrases
  • imitation learning
  • reinforcement learning
  • optimal policy
  • robotic systems
  • action selection
  • humanoid robot
  • maximum margin