Dynamic Regret of Adversarial Linear Mixture MDPs.

Long-Fei Li, Peng Zhao, Zhi-Hua Zhou

Published in: NeurIPS (2023)

Keyphrases

Markov decision processes
dynamic environments
reinforcement learning
mixture model
closed form
multi agent
loss function
game theory
learning algorithm
support vector
multi agent systems
exponential family
regret bounds