Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments.
Jonathon Schwartz, Hanna Kurniawati, Marcus Hutter
Published in: CoRR (2023)
Keyphrases
- monte carlo
- partially observable environments
- partially observable
- inverse reinforcement learning
- partially observable markov decision processes
- markov chain
- monte carlo simulation
- temporal difference
- reinforcement learning
- reinforcement learning algorithms
- monte carlo methods
- particle filter
- state space
- monte carlo tree search
- dynamical systems
- policy evaluation
- variance reduction
- importance sampling
- decision problems
- belief state
- markov decision processes
- optimal policy
- heuristic search
- domain independent
- partially observable markov decision process
- planning problems
- policy search