On-Policy Trust Region Policy Optimisation with Replay Buffers.
Dmitry Kangin
Nicolas Pugeault
Published in:
CoRR (2019)
Keyphrases
trust region
genetic algorithm
np hard
combinatorial optimization
constraint programming