Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret.
Jiawei Huang, Li Zhao, Tao Qin, Wei Chen, Nan Jiang, Tie-Yan Liu
Published in: NeurIPS (2022)
Keyphrases
- reinforcement learning
- minimax regret
- human faces
- function approximation
- Markov decision processes
- online learning
- partial observability
- lower bound
- face images
- reinforcement learning algorithms
- total reward
- decision problems
- loss function
- machine learning
- learning process
- facial expressions
- transfer learning
- state space
- optimal policy
- learning problems
- uncertain data
- optimal control
- expert advice
- belief functions
- learning algorithm
- neural network
- action space
- robust optimization
- partially observable
- multi-agent
- binary classification
- facial images
- active learning