Jointly Pre-training with Supervised, Autoencoder, and Value Losses for Deep Reinforcement Learning.
Gabriel Victor de la Cruz, Yunshu Du, Matthew E. Taylor
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- supervised learning
- restricted boltzmann machine
- learning algorithm
- deep learning
- machine learning
- training set
- deep belief networks
- feedforward neural networks
- semi-supervised
- training stage
- multiple instance learning
- optimal policy
- supervised training
- model-free
- temporal difference
- training algorithm
- function approximation
- unsupervised learning
- dynamic programming
- learning process
- reinforcement learning algorithms
- deep architectures
- training phase
- optimal control
- learning problems
- active learning
- training samples
- online learning
- least squares
- supervised methods
- training data
- data sets