Policy Learning for Balancing Short-Term and Long-Term Rewards.
Peng Wu
Ziyu Shen
Feng Xie
Zhongyao Wang
Chunchen Liu
Yan Zeng
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
learning process
learning algorithm
genetic algorithm
online learning
knowledge acquisition
optimal policy
learning systems
prior knowledge
active learning
learning tasks
learning problems
incremental learning
learning scheme
short-term and long-term
multi-armed bandits