Policy Learning for Balancing Short-Term and Long-Term Rewards.

Peng Wu, Ziyu Shen, Feng Xie, Zhongyao Wang, Chunchen Liu, Yan Zeng

Published in: CoRR (2024)

Keyphrases

reinforcement learning
learning process
learning algorithm
genetic algorithm
online learning
knowledge acquisition
optimal policy
learning systems
prior knowledge
active learning
learning tasks
learning problems
incremental learning
learning scheme
short-term and long-term
multi-armed bandits