A Continuous Actor-Critic Reinforcement Learning Approach to Flocking with Fixed-Wing UAVs.
Chang Wang, Chao Yan, Xiaojia Xiang, Han Zhou
Published in: ACML (2019)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- policy gradient
- approximate dynamic programming
- reinforcement learning algorithms
- optimal control
- neuro fuzzy
- gradient method
- function approximation
- policy iteration
- action space
- state space
- Markov decision processes
- policy gradient methods
- control algorithm
- path planning
- dynamic programming
- natural actor critic
- machine learning
- rl algorithms
- temporal difference learning
- average reward
- supervised learning
- control problems
- dynamic environments
- optimal policy
- model free
- transfer learning
- linear program
- learning algorithm
- least squares
- optimal solution
- control system
- reinforcement learning methods
- learning problems
- evaluation function