The Continuous-Time Joint Replenishment Problem: ε-Optimal Policies via Pairwise Alignment.
Danny Segev
Published in: CoRR (2023)
Keyphrases
- optimal policy
- pairwise
- state space
- global alignment
- sequence alignment
- Markov decision processes
- dynamic programming
- infinite horizon
- reinforcement learning
- decision problems
- finite horizon
- state dependent
- Markov chain
- long run
- multistage
- Markov decision process
- dynamic programming algorithms
- optimal control
- average reward
- stationary policies
- finite state
- serial inventory systems
- dynamical systems
- policy iteration
- control policies
- initial state
- sufficient conditions
- average reward reinforcement learning
- average cost
- similarity measure
- Markov random field
- Markov processes
- belief propagation
- lost sales
- loss function
- random variables
- partially observable Markov decision processes
- search algorithm
- stochastic processes
- total reward
- model free
- data mining