Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Han Zhong
Wei Xiong
Jiyuan Tan
Liwei Wang
Tong Zhang
Zhaoran Wang
Zhuoran Yang
Published in:
ICML (2022)
Keyphrases
learning process
worst case
database
learning algorithm
reinforcement learning
active learning
learning systems
learning problems
real time
neural network
training data
knowledge acquisition
heuristic search