Self-Organisation of Generic Policies in Reinforcement Learning.
Simón C. Smith, J. Michael Herrmann
Published in: ECAL (2013)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- markov decision process
- control policies
- reward function
- function approximation
- markov decision processes
- fitted q iteration
- state space
- dynamic programming
- reinforcement learning agents
- markov decision problems
- reinforcement learning algorithms
- total reward
- cooperative multi agent systems
- partially observable markov decision processes
- learning algorithm
- policy gradient methods
- infinite horizon
- hierarchical reinforcement learning
- partially observable
- high level
- domain specific
- long run
- management policies
- continuous state
- decentralized control
- multiagent reinforcement learning
- monte carlo
- model free
- robotic control
- average reward
- control policy
- policy iteration