Sample-Efficient Reinforcement Learning of Undercomplete POMDPs.
Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu
Published in: NeurIPS (2020)
Keyphrases
- reinforcement learning
- policy search
- partially observable Markov decision processes
- Markov decision processes
- multi agent
- data sets
- function approximation
- state space
- cost effective
- partially observable
- temporal difference
- model free
- computationally expensive
- optimal policy
- active learning
- learning algorithm
- machine learning