Episodic Reinforcement Learning with Expanded State-reward Space.

Dayang Liang, Yaru Zhang, Yunlong Liu

Published in: CoRR (2024)

Keyphrases

reinforcement learning
state space
action space
search space
total reward
markov decision processes
multi agent
learning algorithm
temporal difference
learning process
function approximation
eligibility traces
state action
state abstraction
agent receives
transition model
reinforcement learning algorithms
state variables
optimal policy
low dimensional