Contextual Bandit Learning With Reward Oracles and Sampling Guidance in Multi-Agent Environments.

Mike Li, Quang Dang Nguyen
Published in: IEEE Access (2021)
Keyphrases
  • reinforcement learning
  • learning algorithm
  • multi-agent environments
  • learning process
  • active learning
  • machine learning
  • supervised learning
  • autonomous agents
  • action selection