Efficient exploration through active learning for value function approximation in reinforcement learning.
Takayuki Akiyama, Hirotaka Hachiya, Masashi Sugiyama
Published in: Neural Networks (2010)
Keyphrases
- active learning
- reinforcement learning
- temporal difference
- temporal difference learning
- state space
- approximate dynamic programming
- learning algorithm
- transfer learning
- state action
- supervised learning
- exploration-exploitation
- machine learning
- function approximation
- learning process
- reinforcement learning algorithms
- model-free
- semi-supervised
- learning strategies
- training examples
- Markov games
- optimal policy
- basis functions
- active exploration
- random sampling
- selective sampling
- labeled data
- function approximators
- learning problems
- Markov decision processes
- multi-agent
- batch mode
- fixed point
- imbalanced data classification
- training set
- robotic control
- action selection
- semi-supervised learning
- learning classifier systems
- Monte Carlo
- policy gradient
- active learning strategies
- evaluation function
- action space
- Markov decision process
- game playing
- cost-sensitive
- data sets
- mobile robot
- dynamic programming
- neural network