DAC: Quantized Optimal Transport Reward-based Reinforcement Learning Approach to Detoxify Query Auto-Completion.
Aishwarya Maheswaran, Kaushal Kumar Maurya, Manish Gupta, Maunendra Sankar Desarkar
Published in: SIGIR (2024)
Keyphrases
- reinforcement learning
- optimal control
- database
- dynamic programming
- query processing
- response time
- total reward
- control policy
- database queries
- query evaluation
- state space
- markov decision processes
- average reward
- function approximation
- data sources
- action selection
- machine learning
- multi armed bandit
- policy gradient
- reinforcement learning algorithms
- reward function
- query formulation
- multi agent
- data structure
- user queries
- database systems
- keywords
- model free
- temporal difference
- optimal solution
- learning algorithm
- information retrieval
- initially unknown
- user interaction
- eligibility traces
- query expansion