PAS: Probably Approximate Safety Verification of Reinforcement Learning Policy Using Scenario Optimization.
Arambam James Singh, Arvind Easwaran
Published in: AAMAS (2024)
Keyphrases
- reinforcement learning
- policy evaluation
- optimal policy
- policy search
- action selection
- Markov decision process
- Markov decision processes
- approximate policy iteration
- state and action spaces
- temporal difference
- function approximation
- optimization process
- Monte Carlo
- reinforcement learning algorithms
- global optimization
- model-free
- optimization algorithm
- policy iteration
- real world
- partially observable environments
- learning algorithm
- reinforcement learning problems
- dynamic programming
- state space
- control policy
- function approximators
- action space
- partially observable
- decision problems
- control policies
- optimization problems
- constrained optimization
- approximate dynamic programming
- actor-critic
- optimization model
- model checking