True Online TD(λ)-Replay: An Efficient Model-free Planning with Full Replay.
Abdulrahman Altahhan
Published in: IJCNN (2020)
Keyphrases
- model free
- temporal difference
- reinforcement learning algorithms
- reinforcement learning
- function approximation
- policy evaluation
- policy iteration
- action selection
- temporal difference learning
- impedance control
- training data
- planning problems
- data mining
- evaluation function
- fixed point
- monte carlo
- active learning
- feature selection
- genetic algorithm