)-optimal policy for the online selection of a monotone subsequence from a random sample.

Alessandro Arlotto, Yehua Wei, Xinchang Xie

Published in: Random Struct. Algorithms (2018)

Keyphrases

optimal policy
random sample
Markov decision processes
finite horizon
state space
reinforcement learning
infinite horizon
random sampling
dynamic programming
state dependent
long run
sample size
sufficient conditions
Markov decision process
average reward
multistage
lost sales
Boolean functions
upper bound
version space
machine learning