Online Reinforcement Learning in Markov Decision Process Using Linear Programming.
Vincent Léon, S. Rasoul Etesami
Published in: CoRR (2023)
Keyphrases
- markov decision process
- reinforcement learning
- linear programming
- state space
- optimal policy
- markov decision processes
- policy iteration
- dynamic programming
- infinite horizon
- temporal difference learning
- finite horizon
- average cost
- action space
- partial observability
- initial state
- linear program
- markov games
- transition probabilities
- reinforcement learning algorithms
- reward function
- data mining
- function approximation
- multistage
- np hard
- objective function
- machine learning
- partially observable
- optimal control
- state action
- markov decision problems
- bayesian networks