Deterministic Policies for Constrained Reinforcement Learning in Polynomial-Time.
Jeremy McMahan
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- deterministic domains
- optimal policy
- policy search
- markov decision process
- planning problems
- control policies
- reward function
- stochastic domains
- state space
- special case
- markov decision processes
- fitted q iteration
- function approximation
- symbolic model checking
- reinforcement learning algorithms
- partially observable markov decision processes
- computational complexity
- hierarchical reinforcement learning
- reinforcement learning agents
- decision problems
- model free
- worst case
- decomposable negation normal form
- stationary policies
- partially observable domains
- markov decision problems
- dynamic programming
- partially observable
- policy gradient methods
- function approximators
- reinforcement learning methods
- randomized algorithm
- supervised learning
- lower bound
- continuous state
- multi agent reinforcement learning
- linear programming
- control policy
- policy iteration
- approximation algorithms
- long run
- infinite horizon