Almost surely safe exploration and exploitation for deep reinforcement learning with state safety estimation.
Ke Lin, Yanjie Li, Qi Liu, Duantengchuan Li, Xiongtao Shi, Shiyu Chen
Published in: Inf. Sci. (2024)
Keyphrases
- reinforcement learning
- state space
- state estimation
- autonomous learning
- active exploration
- function approximation
- neural network
- action selection
- semi-parametric
- estimation algorithm
- exploration exploitation tradeoff
- exploration strategy
- transition model
- state action
- dynamic programming
- learning process
- learning algorithm
- genetic algorithm
- machine learning