Partially observable discrete-time discounted Markov games with general utility.
Arnab Bhabak, Subhamay Saha
Published in: Oper. Res. Lett. (2024)
Keyphrases
- partially observable
- markov decision processes
- infinite horizon
- reinforcement learning
- special case
- state space
- decision problems
- markov decision problems
- dynamical systems
- dynamic programming
- reinforcement learning algorithms
- markov chain
- finite state
- reward function
- computational complexity
- long run
- belief state
- control system
- markov decision process