A Tighter Problem-Dependent Regret Bound for Risk-Sensitive Reinforcement Learning.
Xiaoyan Hu, Ho-fung Leung
Published in: AISTATS (2023)
Keyphrases
- risk sensitive
- reinforcement learning
- model free
- optimal control
- markov decision processes
- regret bounds
- upper bound
- lower bound
- control policies
- markov decision problems
- reinforcement learning algorithms
- utility function
- function approximation
- state space
- dynamic programming
- finite state
- online learning
- temporal difference
- optimal policy
- policy iteration
- partially observable
- average cost
- markov decision process
- linear regression
- infinite horizon
- transfer learning
- decision makers
- decision theoretic
- action space
- worst case
- supervised learning