): near-optimal safety-constrained reinforcement learning in polynomial time.

David M. Bossens, Nicholas Bishop

Published in: CoRR (2021)

Keyphrases

reinforcement learning
special case
deterministic domains
learning algorithm
function approximation
state space
worst case
approximation algorithms
temporal difference learning
optimal policy
supervised learning
model free
computational complexity
reinforcement learning methods
machine learning
real time
optimal control
action selection
reinforcement learning algorithms
stochastic approximation
multi agent reinforcement learning