Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble.
Zhengyu Yang, Kan Ren, Xufang Luo, Minghuan Liu, Weiqing Liu, Jiang Bian, Weinan Zhang, Dongsheng Li
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- optimal policy
- learning algorithm
- policy search
- Markov decision process
- state space
- function approximation
- training data
- neural network
- Markov decision processes
- partially observable environments
- state and action spaces
- partially observable domains
- control policies
- action space
- infinite horizon
- ensemble methods
- reward function
- state dependent
- function approximators
- RL algorithms
- action selection
- model free
- training set