DORB: Dynamically Optimizing Multiple Rewards with Bandits.
Ramakanth Pasunuru
Han Guo
Mohit Bansal
Published in:
EMNLP (1) (2020)
Keyphrases
data structure
multi-armed bandits
data mining
search engine
website
reinforcement learning
dynamic programming