The Misbehavior of Reinforcement Learning.
Gianluigi Mongillo, Hanan Shteingart, Yonatan Loewenstein
Published in: Proc. IEEE (2014)
Keyphrases
- reinforcement learning
- function approximation
- machine learning
- state space
- reinforcement learning algorithms
- dynamic programming
- Markov decision processes
- temporal difference
- policy search
- multi agent
- learning algorithm
- optimal policy
- learning problems
- multi agent reinforcement learning
- direct policy search
- temporal difference learning
- function approximators
- Markov decision process
- evolutionary learning
- database
- transition model
- objective function
- perceptual aliasing
- multiscale