Risk-sensitive Actor-free Policy via Convex Optimisation.
Ruoqi Zhang, Jens Sjölund
Published in: AISafety/SafeRL@IJCAI (2023)
Keyphrases
- risk sensitive
- convex optimisation
- control policies
- markov decision problems
- markov decision processes
- optimal control
- optimal policy
- average cost
- utility function
- policy iteration
- state space
- optimality criterion
- infinite horizon
- reinforcement learning
- control policy
- average reward
- partially observable
- linear programming
- markov decision process
- finite horizon
- model free
- dynamic programming
- control system
- action space
- decision processes
- machine learning
- decision theoretic
- learning tasks
- decision problems
- semi definite programming
- learning algorithm