Embedded draw-down constraint reward function for deep reinforcement learning.
Jimmy Ming-Tai Wu, Sheng-Hao Lin, Jia-Hao Syu, Mu-En Wu
Published in: Appl. Soft Comput. (2022)
Keyphrases
- reward function
- reinforcement learning
- reinforcement learning algorithms
- Markov decision processes
- state space
- partially observable
- optimal policy
- policy search
- Markov decision process
- inverse reinforcement learning
- transition probabilities
- multiple agents
- model free
- transition model
- function approximation
- multi agent
- initially unknown
- temporal difference
- hierarchical reinforcement learning
- learning agent
- action space
- state action
- Markov decision problems
- state abstraction
- machine learning
- state variables
- learning algorithm
- Markov models
- dynamic programming