Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning.
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
Published in:
CoRR (2020)
Keyphrases
multi-objective
reinforcement learning
computational complexity
evolutionary algorithm
action selection
state space
worst case
online learning
Markov decision processes
genetic algorithm
linear predictors
regret bounds