Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures.
Hao Liang, Zhiquan Luo
Published in: AISTATS (2024)
Keyphrases
- risk sensitive
- risk measures
- reinforcement learning
- optimal control
- model free
- Markov decision processes
- utility function
- Markov decision problems
- function approximation
- risk averse
- control policies
- decision theoretic
- robust optimization
- optimal policy
- learning algorithm
- partially observable
- machine learning
- Markov decision process
- policy iteration
- action space
- online learning
- supervised learning
- multi objective