On the continuity and smoothness of the value function in reinforcement learning and optimal control.
Hans Harder, Sebastian Peitz
Published in: CoRR (2024)
Keyphrases
- optimal control
- reinforcement learning
- control problems
- dynamic programming
- risk sensitive
- policy gradient
- class of nonlinear systems
- feedback control
- infinite horizon
- control strategy
- optimal control problems
- Brownian motion
- actor critic
- function approximators
- control law
- average cost
- real time
- state space
- cost function
- machine learning
- data mining
- learning problems
- linear quadratic
- learning algorithm
- Lyapunov function
- RL algorithms
- control policy
- model free
- optimal policy
- function approximation
- Markov decision processes