Where's the Reward?

Shayan Doroudi, Vincent Aleven, Emma Brunskill
Published in: Int. J. Artif. Intell. Educ. (2019)
Keyphrases
  • reinforcement learning
  • long run
  • data sets
  • three dimensional
  • partially observable environments
  • neural network
  • machine learning
  • natural language
  • reinforcement learning algorithms
  • average reward
  • bandit problems