Feasible Q-Learning for Average Reward Reinforcement Learning.
Ying Jin, Ramki Gummadi, Zhengyuan Zhou, Jose H. Blanchet
Published in: AISTATS (2024)
Keyphrases
- average reward reinforcement learning
- optimal policy
- reinforcement learning
- state space
- function approximation
- cooperative
- reinforcement learning algorithms
- dynamic programming
- Markov decision processes
- learning algorithm
- multi-agent
- feasible solution
- long run
- reward function
- sufficient conditions
- model free
- Markov decision process
- temporal difference learning
- multi-agent reinforcement learning
- tabu search
- action selection
- real time
- artificial neural networks
- neural network
- databases