Publication: Hands-on Reinforcement Learning for Recommender Systems - From Bandits to SlateQ to Offline RL with Ray RLlib.