Convergence of the Q-ae learning under deterministic MDPs and its efficiency under the stochastic environment.

Gang Zhao, Ruoying Sun, Shoji Tatsumi

Published in: SMC (2000)

Keyphrases

reinforcement learning
learning process
online learning
neural network
prior knowledge
active learning
dynamic environments
mobile learning
machine learning
state space
linear program
partially observable
simulated robot
stochastic domains