Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits.

Hongju Park, Mohamad Kazem Shirani Faradonbeh
Published in: IEEE Control. Syst. Lett. (2022)
Keyphrases
  • partially observable
  • reinforcement learning
  • state space
  • markov chain
  • decision problems
  • online learning
  • markov decision processes