Publication: Hedging using reinforcement learning: Contextual k-Armed Bandit versus Q-learning.