Shaping Proto-Value Functions Using Rewards.
Raj Kumar Maity, Chandrashekar Lakshminarayanan, Sindhu Padakandla, Shalabh Bhatnagar
Published in: ECAI (2016)
Keyphrases
- reward shaping
- reinforcement learning
- Markov decision processes
- complex domains
- multi-armed bandit
- bandit problems
- reinforcement learning algorithms
- state space
- long term and short term
- credit assignment
- management system
- free riding
- reward function
- Markov decision problems
- databases
- search algorithm
- multiscale
- decision trees
- image processing