On Softwarization of Intelligence in 6G Networks for Ultra-Fast Optimal Policy Selection: Challenges and Opportunities.
Sherief Hashima, Zubair Md. Fadlullah, Mostafa M. Fouda, Ehab Mahmoud Mohamed, Kohei Hatano, Basem M. ElHalawany, Mohsen Guizani
Published in: IEEE Netw. (2023)
Keyphrases
- optimal policy
- decision problems
- markov decision processes
- reinforcement learning
- state dependent
- state space
- finite horizon
- infinite horizon
- dynamic programming
- long run
- average reward
- sufficient conditions
- multistage
- finite state
- bayesian reinforcement learning
- markov decision process
- average cost
- optimal pricing
- control policies
- serial inventory systems
- learning algorithm
- lost sales
- partially observable markov decision processes
- develop a mathematical model
- reward function