ODE-based Recurrent Model-free Reinforcement Learning for POMDPs.

Xuanle Zhao, Duzhen Zhang, Liyuan Han, Tielin Zhang, Bo Xu

Published in: NeurIPS (2023)

Keyphrases

model-free reinforcement learning
policy gradient
reinforcement learning
partially observable Markov decision processes
ordinary differential equations
function approximation
optimal control
reinforcement learning algorithms
gradient method
decision problems
belief state
partially observable
optimal policy
state space
heuristic search
Markov decision processes
dynamic systems
neural network
finite state
multi-agent
learning algorithm