): near-optimal safety-constrained reinforcement learning in polynomial time.
David M. Bossens, Nicholas Bishop
Published in: Mach. Learn. (2023)
Keyphrases
- reinforcement learning
- special case
- computational complexity
- function approximation
- reinforcement learning algorithms
- deterministic domains
- robotic control
- approximation algorithms
- learning algorithm
- model free
- markov decision processes
- temporal difference
- multi agent
- dynamic programming
- state space
- multi agent reinforcement learning
- np hardness
- partial observability
- transition model