Extending Sliding-Step Importance Weighting from Supervised Learning to Reinforcement Learning.
Tian Tian, Richard S. Sutton
Published in: IJCAI (2019)
Keyphrases
- reinforcement learning
- supervised learning
- learning algorithm
- learning problems
- function approximation
- temporal difference
- state space
- action selection
- unsupervised learning
- training data
- kernel based learning
- relative importance
- labeled data
- model free
- temporal difference learning
- multiple instance learning
- neural network
- markov decision processes
- sliding window
- training samples
- post processing
- semi supervised learning
- learning tasks
- optimal policy
- optimal control
- tf idf
- semi supervised
- multi class
- dynamic programming
- reinforcement learning algorithms
- training set
- data sets