Q-Learning: Solutions for Grid World Problem with Forward and Backward Reward Propagations.

Snobin Antony, Raghi Roy, Yaxin Bi

Published in: SGAI Conf. (2023)

Keyphrases

forward and backward
reinforcement learning
dynamic programming
multi agent
function approximation
learning algorithm
cooperative
optimal solution
solution space
action selection
discounted reward
lower bound
artificial neural networks
state space
metaheuristic
reinforcement learning algorithms