Fast Reinforcement Learning with Large Action Sets Using Error-Correcting Output Codes for MDP Factorization.
Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
Published in: ECML/PKDD (2) (2012)
Keyphrases
- average cost
- action sets
- error correcting output codes
- markov decision processes
- reinforcement learning
- multi class
- optimal policy
- finite state
- learning problems
- markov decision process
- stationary policies
- state space
- binary classifiers
- pairwise
- function approximation
- markov decision problems
- policy iteration
- dynamic programming
- multiclass problems
- base classifiers
- partially observable
- supervised learning
- multi class problems
- binary classification
- machine learning
- cost sensitive
- support vector machine
- learning process
- feature selection
- learning algorithm