Offline RL Policies Should be Trained to be Adaptive.
Dibya Ghosh
Anurag Ajay
Pulkit Agrawal
Sergey Levine
Published in:
CoRR (2022)
Keyphrases
optimal policy
reinforcement learning
Markov decision processes
state space
control policies
control policy
Markov decision process
adaptive control
dynamic programming
training set
artificial neural networks
temporal difference
reinforcement learning algorithms
multi-agent
neural network
learning agents
real time