Split Q Learning: Reinforcement Learning with Two-Stream Rewards.

Baihan Lin, Djallel Bouneffouf, Guillermo A. Cecchi

Published in: CoRR (2019)

Keyphrases

reinforcement learning
function approximation
reinforcement learning algorithms
data streams
model free
state space
Markov decision processes
state action space
control problems
learning algorithm
machine learning
dynamic programming
temporal difference
sliding window
multi agent
optimal policy
temporal difference learning
reinforcement learning methods
learning problems
reward shaping
eligibility traces
optimal control
continuous state and action spaces
action selection
continuous state
function approximators
policy iteration
learning classifier systems