Optimal policy for labeling training samples.
Lester Lipsky, Daniel P. Lopresti, George Nagy
Published in: DRR (2013)
Keyphrases
- training samples
- optimal policy
- Markov decision processes
- finite horizon
- reinforcement learning
- dynamic programming
- feature space
- test sample
- state space
- training data
- state dependent
- training set
- supervised learning
- infinite horizon
- learning algorithm
- active learning
- multistage
- sufficient conditions
- classification error
- long run
- Bayesian reinforcement learning
- number of training samples
- hyperplane
- high dimensional
- average reward
- Markov decision process
- training examples
- serial inventory systems
- representative samples
- machine learning
- kernel matrix
- convex hull
- inventory control
- face images
- support vector
- data mining
- data sets