Principal-Agent Reward Shaping in MDPs.
Omer Ben-Porat, Yishay Mansour, Michal Moshkovitz, Boaz Taitler
Published in: CoRR (2024)
Keyphrases
- reward shaping
- principal agent
- markov decision problems
- reinforcement learning
- markov decision processes
- reinforcement learning algorithms
- state space
- reward function
- linear programming
- optimal policy
- markov decision process
- partially observable
- assembly systems
- moral hazard
- decision theoretic
- function approximation
- dynamic programming
- policy search
- decision processes
- transition probabilities
- utility function
- policy iteration
- complex domains
- average cost
- temporal difference
- queueing networks
- expected utility
- transition model
- markov chain
- machine learning
- action space
- lot sizing
- optimal control
- game theory
- sufficient conditions
- learning algorithm
- finite state
- finite horizon
- linear program