Reinforcement Learning with Immediate Rewards and Linear Hypotheses.

Naoki Abe, Alan W. Biermann, Philip M. Long

Published in: Algorithmica (2003)

Keyphrases

reinforcement learning
Markov decision processes
state space
function approximation
reinforcement learning algorithms
learning algorithm
function approximators
machine learning
model free
learning problems
learning process
optimal policy
dynamic programming
linear systems
temporal difference
partially observable
reinforcement learning methods
reward shaping
action selection
optimal control
dynamical systems
search space
multi agent
Bayesian networks
genetic algorithm