Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Han Zhong
Wei Xiong
Jiyuan Tan
Liwei Wang
Tong Zhang
Zhaoran Wang
Zhuoran Yang
Published in:
CoRR (2022)
Keyphrases
learning process
learning systems
database
benchmark datasets
learning algorithm
reinforcement learning
search algorithm
supervised learning
online learning
active learning
learning tasks
decision theoretic