Human-AI Learning Performance in Multi-Armed Bandits.

Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan

Published in: CoRR (2018)

Keyphrases

multi-armed bandits
artificial intelligence
learning process
learning algorithm
expert systems
upper bound
decision trees
special case
dynamic programming
online learning
information theoretic
intelligent behavior