Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage.

Somayeh Moazeni, Warren R. Scott, Warren B. Powell
Published in: INFOR Inf. Syst. Oper. Res. (2020)
Keyphrases
  • direct policy search
  • dynamic programming
  • reinforcement learning
  • neural network
  • evolutionary algorithm
  • temporal difference
  • optimal solution