Contextual bandits with concave rewards, and an application to fair ranking.
Virginie Do
Elvis Dohmatob
Matteo Pirotta
Alessandro Lazaric
Nicolas Usunier
Published in:
CoRR (2022)
Keyphrases
ranked list
multi-armed bandits
contextual information
rank aggregation
reinforcement learning
evaluation measures
objective function
rank order
bandit problems
Markov decision processes
piecewise linear
ranking algorithm
ranking functions
user feedback