Balancing exploration and exploitation in episodic reinforcement learning.

Qihang Chen, Qiwei Zhang, Yunlong Liu

Published in: Expert Syst. Appl. (2023)

Keyphrases

balancing exploration and exploitation
reinforcement learning
learning to rank
function approximation
state space
multi-agent
Markov decision processes
reinforcement learning algorithms
optimal control
temporal difference
model free
learning algorithm
real robot
learning capabilities
action selection
learning problems
optimal policy
query expansion
learning process