From External to Swap Regret 2.0: An Efficient Reduction and Oblivious Adversary for Large Action Spaces.
Yuval Dagan
Constantinos Daskalakis
Maxwell Fishelson
Noah Golowich
Published in:
CoRR (2023)
Keyphrases
action space
state space
Markov decision processes
lower bound
online learning
skill learning
reinforcement learning
dynamic programming
least squares
graph cuts
decision making
non stationary
reward function
Markov decision process
state and action spaces