Where's the Reward?
Shayan Doroudi
Vincent Aleven
Emma Brunskill
Published in:
Int. J. Artif. Intell. Educ. (2019)
Keyphrases
reinforcement learning
long run
data sets
three dimensional
partially observable environments
neural network
machine learning
natural language
reinforcement learning algorithms
average reward
bandit problems