Scaling Up Q-Learning via Exploiting State-Action Equivalence.

Yunlian Lyu, Aymeric Côme, Yijie Zhang, Mohammad Sadegh Talebi

Published in: Entropy (2023)

Keyphrases

state action
reinforcement learning
evaluation function
stochastic games
action space
state transitions
average reward
Markov decision process
continuous state
state space
belief state
kernel matrix
reward function
function approximators
real valued