Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning.
Tianchen Zhou, Hairi, Haibo Yang, Jia Liu, Tian Tong, Fan Yang, Michinari Momma, Yan Gao
Published in: CoRR (2024)
Keyphrases
- actor critic
- sample complexity
- multi objective
- reinforcement learning
- learning problems
- learning algorithm
- supervised learning
- temporal difference
- policy gradient
- rl algorithms
- theoretical analysis
- reinforcement learning algorithms
- evolutionary algorithm
- approximate dynamic programming
- optimal control
- upper bound
- gradient method
- neuro fuzzy
- special case
- generalization error
- function approximation
- lower bound
- policy iteration
- objective function
- convergence rate
- model free
- genetic algorithm
- active learning
- markov decision processes
- learning tasks
- multi agent
- training examples
- state space
- sample size
- average reward
- function approximators
- convergence speed
- machine learning
- optimal policy
- data sets
- semi supervised learning
- action selection
- transfer learning
- reinforcement learning methods
- training set
- feature selection
- differential evolution