Towards Instance-Optimality in Online PAC Reinforcement Learning.
Aymen Al Marjani, Andrea Tirinzoni, Emilie Kaufmann
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- function approximation
- online learning
- learning algorithm
- markov decision processes
- real time
- dynamic programming
- pac learning
- upper bound
- learning process
- optimal control
- supervised learning
- theoretical analysis
- optimal policy
- learning problems
- model free
- action selection
- temporal difference
- statistical queries
- noise tolerant