Predictive Estimation for Reinforcement Learning with Time-Varying Reward Functions.

Abolfazl Hashemi, Antesh Upadhyay

Published in: ACSSC (2023)

Keyphrases

reward function
reinforcement learning
reinforcement learning algorithms
policy search
markov decision processes
optimal policy
state space
markov decision process
inverse reinforcement learning
partially observable
multiple agents
temporal difference
model free
simple examples
function approximation
machine learning
transition probabilities
markov chain
dynamic programming
multi agent
state action
transition model
data mining
particle filter
learning agent
learning algorithm