Use of Successful Policies to Relearn for Induced States of Failure in Reinforcement Learning.
Tadahiko Murata, Hiroshi Matsumoto
Published in: KES (2004)
Keyphrases
- reinforcement learning
- optimal policy
- Markov decision problems
- perceptual aliasing
- policy search
- initial state
- success or failure
- Markov decision process
- control policies
- reinforcement learning agents
- state abstraction
- function approximation
- reward function
- Markov decision processes
- state space
- reinforcement learning algorithms
- continuous state
- hierarchical reinforcement learning
- partially observable
- dynamic programming
- state action
- transition model
- behavioural cloning
- fitted Q iteration
- temporal difference
- learning process
- multi agent
- root cause
- control policy
- function approximators
- reward shaping
- learning algorithm
- state information
- reinforcement learning methods
- partial knowledge
- state transitions
- policy iteration
- transition probabilities
- decision problems