Leveraging the Variance of Return Sequences for Exploration Policy.
Zerong Xi
Gita Sukthankar
Published in:
CoRR (2020)
Keyphrases
action selection
hidden Markov models
optimal policy
covariance matrix
sequential patterns
machine learning
genetic algorithm
information systems
information technology
correlation coefficient
asymptotically optimal