Hyperbolically-Discounted Reinforcement Learning on Reward-Punishment Framework.

Taisuke Kobayashi

Published in: CoRR (2021)

Keyphrases

reinforcement learning
optimal policy
dynamic programming
state space
Markov decision processes
function approximation
multi-agent
learning algorithm
learning process
main contribution
theoretical framework
transition model