Optimal Sensing via Multi-armed Bandit Relaxations in Mixed Observability Domains.
Mikko Lauri
Risto Ritala
Published in:
CoRR (2016)
Keyphrases
multi-armed bandit
multi-armed bandits
lower bound
dynamic programming
NP-hard
optimal solution
decision trees
loss function