New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system.
Katsuhisa Ohno, Toshitaka Boh, Koichi Nakade, Takayoshi Tamura
Published in: Eur. J. Oper. Res. (2016)
Keyphrases
- markov decision processes
- dynamic programming algorithms
- optimal policy
- dynamic programming
- policy iteration
- state space
- finite state
- reinforcement learning
- finite horizon
- average reward
- infinite horizon
- planning under uncertainty
- stochastic games
- decision processes
- reward function
- partially observable
- action space
- decision diagrams
- reinforcement learning algorithms
- markov decision process
- decision problems
- search space
- markov decision problems
- machine learning