Quantum Acceleration of Infinite Horizon Average-Reward Reinforcement Learning.
Bhargav Ganguly, Vaneet Aggarwal
Published in: CoRR (2023)
Keyphrases
- average reward reinforcement learning
- infinite horizon
- optimal policy
- finite horizon
- Markov decision processes
- long run
- decision problems
- state space
- dynamic programming
- Markov decision process
- stochastic demand
- production planning
- reinforcement learning
- finite state
- partially observable
- single item
- state dependent
- multistage
- optimal control
- asymptotically optimal
- Dec-POMDPs
- inventory models
- average cost
- holding cost
- fixed cost
- sufficient conditions
- steady state
- inventory level
- periodic review
- inventory policy
- machine learning