Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning.
Jingfeng Wu
Vladimir Braverman
Lin Yang
Published in:
NeurIPS (2021)
Keyphrases
multi objective
reinforcement learning
evolutionary algorithm
genetic algorithm
worst case
online learning
optimal policy
regret bounds
state space