Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret.
Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie
Published in: NeurIPS (2020)
Keyphrases
- risk sensitive
- reinforcement learning
- model free
- optimal control
- markov decision processes
- control policies
- reinforcement learning algorithms
- markov decision problems
- risk neutral
- reward function
- state space
- utility function
- function approximation
- dynamic programming
- optimal policy
- policy iteration
- control strategy
- lower bound
- temporal difference
- markov decision process
- partially observable
- multi agent
- infinite horizon
- computational complexity
- control system
- expected utility
- finite state
- function approximators
- machine learning
- probability distribution
- multi objective