Verified Probabilistic Policies for Deep Reinforcement Learning.
Edoardo Bacci, David Parker
Published in: NFM (2022)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- control policies
- Bayesian networks
- Markov decision process
- reinforcement learning agents
- function approximation
- data driven
- hierarchical reinforcement learning
- posterior probability
- continuous state
- reward function
- temporal difference
- multiagent reinforcement learning
- uncertain data
- information theoretic
- generative model
- total reward
- state space
- action selection
- decentralized control
- fitted Q iteration
- decision processes
- control policy
- partially observable Markov decision processes
- neural network
- long run
- finite state
- Markov decision processes
- graphical models
- machine learning