Supervised actor-critic reinforcement learning with action feedback for algorithmic trading.
Qizhou Sun, Yain-Whar Si
Published in: Appl. Intell. (2023)
Keyphrases
- actor critic
- reinforcement learning
- temporal difference
- reinforcement learning algorithms
- action selection
- policy gradient
- learning algorithm
- optimal control
- approximate dynamic programming
- supervised learning
- action space
- neuro fuzzy
- gradient method
- function approximation
- machine learning
- policy iteration
- state action
- markov decision processes
- semi supervised
- reinforcement learning methods
- state space
- dynamic programming
- control problems
- model free
- rl algorithms
- multi agent
- linear programming
- average reward
- linear program
- temporal difference learning
- optimal policy
- finite state
- real valued
- convergence rate
- monte carlo
- labeled data