Reward Poisoning Attack Against Offline Reinforcement Learning.

Yinglun Xu, Rohan Gumaste, Gagandeep Singh

Published in: CoRR (2024)

Keyphrases

reinforcement learning
state space
function approximation
eligibility traces
reinforcement learning algorithms
learning algorithm
real time
dynamic programming
temporal difference
model free
partially observable environments
supervised learning
Markov decision processes
function approximators
reward function
action selection
optimal control
optimal policy
total reward
learning capabilities
malicious users
policy gradient
countermeasures
DoS attacks
policy iteration
learning problems
mobile robot
machine learning
neural network