Balancing exploration and exploitation in episodic reinforcement learning.
Qihang Chen, Qiwei Zhang, Yunlong Liu
Published in: Expert Syst. Appl. (2023)
Keyphrases
- balancing exploration and exploitation
- reinforcement learning
- learning to rank
- function approximation
- state space
- multi-agent
- Markov decision processes
- reinforcement learning algorithms
- optimal control
- temporal difference
- model free
- learning algorithm
- real robot
- learning capabilities
- action selection
- learning problems
- optimal policy
- query expansion
- learning process