UBEV - A More Practical Algorithm for Episodic RL with Near-Optimal PAC and Regret Guarantees.
Christoph Dann, Tor Lattimore, Emma Brunskill
Published in: CoRR (2017)
Keyphrases
- learning algorithm
- dynamic programming
- worst case
- objective function
- detection algorithm
- preprocessing
- segmentation algorithm
- theoretical guarantees
- model-free
- particle swarm optimization
- probabilistic model
- active learning
- search space
- machine learning
- least squares
- reinforcement learning
- simulated annealing
- NP-hard
- theoretical analysis
- cost function
- k-means