Planning in Reward-Rich Domains via PAC Bandits.
Sergiu Goschin, Ari Weinstein, Michael L. Littman, Erick Chastain
Published in: EWRL (2012)
Keyphrases
- blocks world
- real world
- complex domains
- reinforcement learning
- deterministic domains
- planning systems
- heuristic search
- domain-independent planning
- planning problems
- causal graph
- upper bound
- multi-armed bandit
- theoretical analysis
- stochastic domains
- continuous domains
- agent receives
- plan generation
- multi-armed bandits
- ai planning
- cross domain
- application domains
- domain-independent
- transfer learning
- sample size
- decision support
- multi class
- search space
- multi-agent
- high-level