Constructing optimal policies for agents with constrained architectures.
Dmitri A. Dolgov, Edmund H. Durfee
Published in: AAMAS (2003)
Keyphrases
- optimal policy
- Markov decision processes
- decision problems
- multi-agent systems
- multi-agent
- reinforcement learning
- dynamic programming
- state space
- single agent
- finite horizon
- multistage
- state dependent
- expected reward
- multiple agents
- infinite horizon
- average reward
- software agents
- long run
- finite state
- dynamic programming algorithms
- initial state
- control policies
- average cost
- serial inventory systems
- Markov decision process
- resource allocation
- sufficient conditions
- decision making
- Markov decision problems
- Bayesian reinforcement learning
- learning algorithm
- total reward
- average reward reinforcement learning
- data mining