Enhancing the episodic natural actor-critic algorithm by a regularisation term to stabilize learning of control structures.

Published in: ADPRL (2011)

Keyphrases