Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble.
Zhengyu Yang, Kan Ren, Xufang Luo, Minghuan Liu, Weiqing Liu, Jiang Bian, Weinan Zhang, Dongsheng Li
Published in: IJCAI (2022)
Keyphrases
- feature selection
- reinforcement learning
- optimal policy
- machine learning
- policy search
- Markov decision process
- function approximation
- ensemble learning
- action selection
- multi class
- Markov decision processes
- policy gradient
- function approximators
- neural network
- reinforcement learning algorithms
- learning process
- model free
- decision problems
- learning algorithm
- partially observable domains
- temporal difference
- partially observable Markov decision processes
- average reward
- policy evaluation
- actor critic
- reinforcement learning problems
- partially observable environments