Offline RL with Discrete Proxy Representations for Generalizability in POMDPs.
Pengjie Gu, Xinyu Cai, Dong Xing, Xinrun Wang, Mengchen Zhao, Bo An
Published in: NeurIPS (2023)
Keyphrases
- reinforcement learning
- continuous state
- Markov decision processes
- continuous state spaces
- partially observable Markov decision processes
- state space
- continuous action
- continuous domains
- function approximation
- partially observable
- belief state
- real time
- model free
- policy search
- control policies
- action space
- optimal policy
- dynamic programming
- reinforcement learning algorithms
- policy gradient
- multi agent
- machine learning
- neural network
- learning classifier systems
- finite number
- dynamical systems
- function approximators
- learning process