Safe reinforcement learning in high-risk tasks through policy improvement.
Francisco Javier García-Polo, Fernando Fernández Rebollo
Published in: ADPRL (2011)
Keyphrases
- high risk
- reinforcement learning
- optimal policy
- policy search
- Markov decision process
- risk factors
- function approximation
- prostate cancer
- learning algorithm
- function approximators
- Markov decision processes
- machine learning
- action selection
- state space
- neural network
- transfer learning
- reinforcement learning algorithms
- partially observable
- temporal difference
- policy iteration
- model free
- databases
- pattern recognition
- multi class