Mitigating Disparity while Maximizing Reward: Tight Anytime Guarantee for Improving Bandits.
Vishakha Patil
Vineet Nair
Ganesh Ghalme
Arindam Khan
Published in:
CoRR (2022)
Keyphrases
lower bound
reinforcement learning
upper bound
multi-armed bandit
stereo images
motion estimation
disparity estimation
e-learning
stereo correspondence