Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari
Published in: CoRR (2012)
Keyphrases
- action sets
- reinforcement learning
- error correcting output codes
- multi class
- learning problems
- markov decision processes
- state space
- function approximation
- pairwise
- binary classifiers
- model free
- optimal policy
- machine learning
- learning algorithm
- multiclass problems
- temporal difference
- base classifiers
- finite state
- markov decision process
- transfer learning
- optimal control
- multi agent
- training data
- learning tasks
- binary classification
- markov chain
- partially observable
- average cost
- support vector machine
- learning process