Nested Rollout Policy Adaptation with Selective Policies.
Tristan Cazenave
Published in: CGW@IJCAI (2016)
Keyphrases
- optimal policy
- approximate policy iteration
- reinforcement learning
- policy search
- policy iteration
- markov decision process
- markov decision problems
- control policies
- access control policies
- management policies
- revenue management
- transport systems
- temporal difference
- markov decision processes
- state space
- control policy
- allocation policy
- allocation policies
- access control
- policy gradient methods
- decision problems
- reward function
- partially observable markov decision processes
- decision processes
- finite state
- asymptotically optimal
- privacy policies
- average reward
- model free
- infinite horizon
- average cost
- dynamic programming
- long run
- total reward
- multi agent
- natural actor critic
- evaluation function
- optimal control
- security policies
- reinforcement learning algorithms
- scheduling policies
- partially observable
- function approximators
- finite horizon