TBQ(σ): Improving Efficiency of Trace Utilization for Off-Policy Reinforcement Learning.
Longxiang Shi
Shijian Li
Longbing Cao
Long Yang
Gang Pan
Published in:
CoRR (2019)
Keyphrases
reinforcement learning
function approximation
high efficiency
real time
machine learning
computational complexity
neural network
learning algorithm
data sets
information systems
case study
similarity measure
data structure
dynamic programming
computational efficiency
learning tasks