No-Regret Shannon Entropy Regularized Neural Contextual Bandit Online Learning for Robotic Grasping.
Kyungjae Lee, Jaegu Choy, Yunho Choi, Hogun Kee, Songhwai Oh
Published in: IROS (2020)
Keyphrases
- online learning
- shannon entropy
- contextual bandit
- upper confidence bound
- object manipulation
- manipulation tasks
- information theory
- mutual information
- news recommendation
- kl divergence
- e-learning
- least squares
- information theoretic
- online convex optimization
- machine learning
- mahalanobis distance
- model selection
- feature selection