Inferring Probabilistic Reward Machines from Non-Markovian Reward Signals for Reinforcement Learning.
Taylor Dohmen, Noah Topper, George K. Atia, Andre Beckus, Ashutosh Trivedi, Alvaro Velasquez
Published in: ICAPS (2022)
Keyphrases
- reinforcement learning
- reward function
- eligibility traces
- state space
- function approximation
- signal processing
- reinforcement learning algorithms
- learning problems
- Markov decision processes
- optimal policy
- average reward
- multi-agent
- Bayesian networks
- model-free
- dynamic programming
- transfer learning
- reinforcement learning methods
- learning algorithm
- uncertain data
- temporal difference
- optimal control
- neural network
- generative model
- spectral analysis
- learning agent
- sufficient conditions
- reinforcement learning agents
- probability distribution
- total reward
- partially observable environments