Adaptive Batch Size for Safe Policy Gradients.

Matteo Papini, Matteo Pirotta, Marcello Restelli

Published in: NIPS (2017)

Keyphrases

batch size
batch mode
finite horizon
order quantity
infinite horizon
single item
poisson process
e-learning
pairwise
standard deviation
expected cost