A Surprisingly Simple Continuous-Action POMDP Solver: Lazy Cross-Entropy Search Over Policy Trees.
Marcus Hörger, Hanna Kurniawati, Dirk P. Kroese, Nan Ye
Published in: CoRR (2023)
Keyphrases
- cross entropy
- partially observable Markov decision processes
- policy search
- continuous action
- optimal policy
- search space
- continuous state
- maximum likelihood
- finite state
- log likelihood
- partially observable
- probabilistic model
- reinforcement learning
- information retrieval
- decision problems
- Markov decision process
- dynamical systems
- web search