Concave Utility Reinforcement Learning with Zero-Constraint Violations.
Mridul Agarwal, Qinbo Bai, Vaneet Aggarwal
Published in: CoRR (2021)
Keyphrases
- constraint violations
- reinforcement learning
- hard constraints
- soft constraints
- temporal constraints
- state space
- multiple criteria
- utility function
- objective function
- optimal policy
- learning algorithm
- machine learning
- decision problems
- combinatorial optimization
- spatial information
- graphical models
- lower bound
- video sequences