Offline Reinforcement Learning with Realizability and Single-policy Concentrability.
Wenhao Zhan
Baihe Huang
Audrey Huang
Nan Jiang
Jason D. Lee
Published in: CoRR (2022)
Keyphrases
reinforcement learning
optimal policy
policy search
action selection
neural network
machine learning
real time
Markov decision processes
partially observable environments
dynamic programming
temporal difference
partially observable
control policy
policy gradient