Markov Rewards Processes with Impulse Rewards and Absorbing States.
Louis Tan, Kaveh Mahdaviani, Ashish Khisti
Published in: CoRR (2021)
Keyphrases
- markov chain
- multiarmed bandit
- reinforcement learning
- markov decision processes
- bandit problems
- fully observable
- long term and short term
- credit assignment
- transition probabilities
- real world
- free riding
- markov process
- decision making
- markov model
- decision problems
- state space
- artificial intelligence
- learning algorithm
- machine learning
- neural network