A reinforcement learning framework based on regret minimization for approximating best response in fictitious self-play.
Yanran Xu
Kangxin He
Shu Hu
Hui Li
Published in:
HPCC/DSS/SmartCity/DependSys (2022)
Keyphrases
reinforcement learning
main contribution
state space
theoretical framework
regret minimization
decision making
lower bound
learning process
probabilistic model
optimal policy