Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation.

Wooseong Cho, Taehyun Hwang, Joongkyu Lee, Min-hwan Oh

Published in: CoRR (2024)

Keyphrases

function approximation
reinforcement learning
exploration exploitation tradeoff
temporal difference learning
function approximators
action selection
state action space
tile coding
temporal difference
model free
mountain car
radial basis function
reinforcement learning algorithms
temporal difference learning algorithms
learning tasks
state space
machine learning
text categorization
text classification
learning process
learning algorithm
markov decision processes
control problems
optimal policy
td learning
dynamic programming
feature vectors
multi agent