Rewarded Region Replay (R3) for Policy Learning with Discrete Action Space.
Bangzheng Li
Ningshan Ma
Zifan Wang
Published in:
CoRR (2024)
Keyphrases
action space
state space
continuous state spaces
learning algorithm
action selection
supervised learning
reinforcement learning
prior knowledge
Markov random field
sufficient conditions
real valued
continuous state