Phase-Parametric Policies for Reinforcement Learning in Cyclic Environments.
Arjun Sharma, Kris M. Kitani
Published in: AAAI (2018)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- markov decision processes
- control policies
- partially observable markov decision processes
- real world
- markov decision process
- learning algorithm
- control policy
- fitted q iteration
- function approximation
- hierarchical reinforcement learning
- continuous state
- reinforcement learning agents
- reward function
- reinforcement learning algorithms
- model free
- total reward
- finite state
- dynamic programming
- multi agent
- policy gradient methods
- machine learning
- policy iteration
- learning phase
- training phase
- dynamic environments
- state space
- markov decision problems
- decentralized control
- multiagent reinforcement learning
- partially observable