Low latency policy iteration via parallel processing and randomization.
Neal Master, Nicholas Bambos
Published in: CDC (2015)
Keyphrases
- parallel processing
- low latency
- policy iteration
- Markov decision processes
- least squares
- model free
- reinforcement learning
- fixed point
- optimal policy
- high speed
- high throughput
- highly efficient
- temporal difference
- Markov decision process
- real time
- stream processing
- infinite horizon
- finite state
- optimal control
- convergence rate
- linear programming
- state space
- virtual machine
- dynamic programming
- parallel computers
- function approximation
- computational complexity
- graphical models
- optimal solution