ODE-based Recurrent Model-free Reinforcement Learning for POMDPs.
Xuanle Zhao, Duzhen Zhang, Liyuan Han, Tielin Zhang, Bo Xu
Published in: NeurIPS (2023)
Keyphrases
- model-free reinforcement learning
- policy gradient
- reinforcement learning
- partially observable Markov decision processes
- ordinary differential equations
- function approximation
- optimal control
- reinforcement learning algorithms
- gradient method
- decision problems
- belief state
- partially observable
- optimal policy
- state space
- heuristic search
- Markov decision processes
- dynamic systems
- neural network
- finite state
- multi-agent
- learning algorithm