Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning.
Jingqi Li, Donggun Lee, Somayeh Sojoudi, Claire J. Tomlin
Published in: CoRR (2022)
Keyphrases
- infinite horizon
- reinforcement learning
- optimal policy
- markov decision processes
- optimal control
- partially observable
- markov decision process
- state space
- finite horizon
- dynamic programming
- policy iteration
- production planning
- long run
- decision problems
- stochastic demand
- optimal strategy
- single item
- function approximation
- reinforcement learning algorithms
- dec pomdps
- average cost
- state dependent
- multistage
- multi agent
- finite state
- markov decision problems
- sufficient conditions
- control system
- lead time
- learning algorithm
- model free
- action space
- probabilistic model
- inventory level
- partially observable markov decision processes
- lost sales
- fixed cost
- temporal difference
- continuous state
- inventory policy
- total reward