Risk-Variant Policy Switching to Exceed Reward Thresholds.
Breelyn Melissa Kane, Reid G. Simmons
Published in: ICAPS (2012)
Keyphrases
- expected reward
- reward function
- partially observable environments
- average reward
- optimal policy
- inverse reinforcement learning
- policy gradient
- reinforcement learning
- control policy
- risk management
- risk factors
- decision making
- opportunity cost
- risk assessment
- Markov decision processes
- long run
- high risk
- risk analysis
- finite horizon
- upper bound
- asymptotically optimal
- state action
- total reward
- agent receives
- risk measures
- partially observable
- minimum risk
- policy evaluation
- policy making
- risk averse
- learning agent