Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion.
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- function approximation
- temporal difference
- decision making
- model free
- state space
- Markov decision processes
- risk assessment
- risk management
- machine learning
- high risk
- reinforcement learning algorithms
- temporal difference learning
- transfer learning
- Markov random field
- co-occurrence
- learning process
- multi agent
- optimization criterion
- risk analysis
- risk measures
- early warning
- minimum risk
- expected utility
- optimal policy
- learning environment
- feature selection
- information systems
- learning algorithm