Interplanetary Transfers via Deep Representations of the Optimal Policy and/or of the Value Function.
Dario Izzo, Ekin Öztürk, Marcus Märtens
Published in: CoRR (2019)
Keyphrases
- optimal policy
- decision problems
- Markov decision processes
- infinite horizon
- state space
- finite horizon
- reinforcement learning
- dynamic programming
- state dependent
- long run
- multistage
- finite state
- control policies
- average reward
- Markov decision process
- serial inventory systems
- sufficient conditions
- average cost
- data mining
- lost sales
- partially observable Markov decision processes
- Markov decision problems
- stochastic demand
- sample path
- objective function