Model-free two-step design for improving transient learning performance in nonlinear optimal regulator problems.
Yuka Masumoto, Yoshihiro Okawa, Tomotake Sasaki, Yutaka Hori
Published in: CoRR (2021)
Keyphrases
- model free
- reinforcement learning
- reinforcement learning methods
- learning process
- rl algorithms
- learning algorithm
- steady state
- optimal solution
- active learning
- learning tasks
- learning problems
- text classification
- temporal difference learning
- temporal difference
- machine learning algorithms
- neural network
- training set
- genetic algorithm
- data mining