Online Reinforcement Learning in Markov Decision Process Using Linear Programming.
Vincent Léon, S. Rasoul Etesami
Published in: CDC (2023)
Keyphrases
- markov decision process
- reinforcement learning
- linear programming
- optimal policy
- state space
- markov decision processes
- policy iteration
- temporal difference learning
- dynamic programming
- infinite horizon
- finite horizon
- linear program
- initial state
- partial observability
- state action
- markov games
- temporal difference
- optimal control
- action space
- optimal solution
- graphical models
- learning algorithm
- transition probabilities
- control problems
- reinforcement learning algorithms
- reward function
- function approximation
- model free
- multiagent systems
- np hard
- special case
- control system
- multi agent
- markov decision problems
- objective function
- machine learning