On the connection between Bregman divergence and value in regularized Markov decision processes.
Brendan O'Donoghue
Published in: CoRR (2022)
Keyphrases
- Markov decision processes
- Bregman divergences
- cost sensitive
- information theoretic
- maximum entropy
- Mahalanobis distance
- policy iteration
- finite state
- theoretical guarantees
- nearest neighbor
- learning theory
- exponential family
- loss function
- reinforcement learning
- state space
- KL divergence
- optimal policy
- dynamic programming
- special case
- nonnegative matrix factorization
- boosting algorithms
- average cost
- reinforcement learning algorithms
- infinite horizon
- data dependent
- average reward
- linear programming
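As context for the paper's topic, a minimal sketch (not taken from the paper itself) of the Bregman divergence generated by a convex function, showing that the negative-entropy generator recovers the KL divergence listed among the keyphrases. The function names here are illustrative, not from any established API.

```python
import numpy as np

def bregman(phi, grad_phi, p, q):
    """Bregman divergence D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>."""
    return phi(p) - phi(q) - np.dot(grad_phi(q), p - q)

# Negative-entropy generator: phi(x) = sum_i x_i log x_i.
neg_entropy = lambda x: np.sum(x * np.log(x))
grad_neg_entropy = lambda x: np.log(x) + 1.0

# Two probability vectors (e.g. two policies' action distributions in an MDP state).
p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])

# For probability vectors, the negative-entropy Bregman divergence
# coincides with the KL divergence sum_i p_i log(p_i / q_i).
d = bregman(neg_entropy, grad_neg_entropy, p, q)
kl = np.sum(p * np.log(p / q))
print(abs(d - kl) < 1e-12)
```

Other generators yield other keyphrase entries as special cases; for instance, a quadratic generator phi(x) = x' A x / 2 with positive-definite A gives a squared Mahalanobis-type distance.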