Login / Signup
Recommending the optimal policy by learning to act from temporal data.
Stefano Branchi
Andrei Buliga
Chiara Di Francescomarino
Chiara Ghidini
Francesca Meneghello
Massimiliano Ronzani
Published in:
CoRR (2023)
Keyphrases
</>
temporal data
optimal policy
reinforcement learning
learning algorithm
average reward reinforcement learning
markov decision processes
finite horizon
state dependent
data sets
data mining
dynamic programming
infinite horizon