Implicit and Explicit Policy Constraints for Offline Reinforcement Learning.
Yang Liu, Marius Hofert
Published in: CLeaR (2024)
Keyphrases
- state dependent
- optimal policy
- reinforcement learning
- state space
- decision problems
- markov decision processes
- dynamic programming
- asymptotically optimal
- infinite horizon
- policy iteration
- constrained optimization
- reinforcement learning algorithms
- markov decision problems
- global constraints
- function approximation
- linear constraints
- constraint satisfaction
- policy search
- model free
- action selection
- partially observable
- supervised learning
- explicit or implicit