Batch Reinforcement Learning from Crowds.
Guoxi Zhang, Hisashi Kashima
Published in: ECML/PKDD (4) (2022)
Keyphrases
- reinforcement learning
- batch mode
- state space
- function approximation
- policy search
- dynamic programming
- temporal difference
- database
- learning process
- Markov decision processes
- machine learning
- crowdsourcing
- optimal control
- optimal policy
- model-free
- supervised learning
- robot control
- Markov decision process
- active learning
- temporal difference learning
- autonomous learning
- multi-agent reinforcement learning
- artificial neural networks
- lower bound
- batch size
- batch learning
- direct policy search