An Adaptive Policy Evaluation Network Based on Recursive Least Squares Temporal Difference With Gradient Correction.

Dazi Li, Yuting Wang, Tianheng Song, Qibing Jin
Published in: IEEE Access (2018)
Keyphrases