Optimal Policies for Security Patch Management.
Debabrata Dey, Atanu Lahiri, Guoying Zhang
Published in: INFORMS J. Comput. (2015)
Keyphrases
- optimal policy
- Markov decision processes
- decision problems
- reinforcement learning
- finite horizon
- long run
- state space
- multistage
- infinite horizon
- finite state
- dynamic programming
- average cost
- state dependent
- dynamic programming algorithms
- sufficient conditions
- access control
- Bayesian reinforcement learning
- average reward reinforcement learning
- average reward
- reward function
- initial state
- serial inventory systems
- Markov decision problems
- control policies
- RFID technology
- total reward
- Markov decision process
- policy iteration
- search algorithm
- multi agent