Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets.
Denis Steckelmacher, Diederik M. Roijers, Anna Harutyunyan, Peter Vrancx, Ann Nowé
Published in: CoRR (2017)
Keyphrases
- reinforcement learning
- option pricing
- continuous state
- function approximation
- partially observable Markov decision processes
- partially observable
- state space
- Markov decision processes
- Black-Scholes model
- multi-agent
- optimal policy
- payoff functions
- policy search
- dynamic programming
- model free
- reinforcement learning algorithms
- policy gradient
- learning algorithm
- learning process
- decision analysis
- action space
- temporal difference
- Markov decision process
- control policy
- average reward
- gradient method
- search algorithm
- action selection
- optimal control
- transfer learning