Pruning for Monte Carlo Distributed Reinforcement Learning in Decentralized POMDPs.
Bikramjit Banerjee
Published in: AAAI (2013)
Keyphrases
- monte carlo
- reinforcement learning
- multi agent
- distributed constraint optimization
- distributed systems
- temporal difference
- stochastic approximation
- state space
- peer to peer
- markov chain
- policy evaluation
- importance sampling
- dec pomdps
- continuous state
- reinforcement learning algorithms
- monte carlo methods
- markov decision processes
- monte carlo simulation
- optimal policy
- monte carlo tree search
- simulation study
- machine learning
- partially observable markov decision processes
- model free
- partially observable
- search space
- temporal difference learning
- particle filter
- adaptive sampling
- policy search
- policy iteration
- function approximation
- finite state
- matrix inversion
- monte carlo method
- variance reduction
- single agent
- markovian decision
- control problems
- global illumination
- partially observable markov decision process
- markov decision problems
- game tree
- least squares
- learning algorithm