GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation.
Abhinav Jain
Vaibhav V. Unhelkar
Published in:
CoRR (2023)
Keyphrases
stationary distribution
imitation learning
Markov chain
random walk
queueing networks
initial state
transition probabilities
queue length
parameter estimation
service times
reinforcement learning
maximum entropy