Paavo Parmas
Publication Activity (10 Years)
Years Active: 2018-2023
Publications (10 Years): 10
Top Topics
Reinforcement Learning
Policy Gradient
Stochastic Gradient
Conditionally Independent
Top Venues
CoRR
ICML
NeurIPS
AISTATS
Publications
Paavo Parmas, Takuma Seno, Yuma Aoki
Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators.
ICML
(2023)
Paavo Parmas, Takuma Seno
Proppo: a Message Passing Framework for Customizable and Composable Learning Algorithms.
NeurIPS
(2022)
Paavo Parmas, Masashi Sugiyama
A unified view of likelihood ratio and reparameterization gradients.
AISTATS
(2021)
Paavo Parmas, Masashi Sugiyama
A unified view of likelihood ratio and reparameterization gradients.
CoRR
(2021)
Daniel Hennes, Dustin Morrill, Shayegan Omidshafiei, Rémi Munos, Julien Pérolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Paavo Parmas, Edgar A. Duéñez-Guzmán, Karl Tuyls
Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients.
AAMAS
(2020)
Paavo Parmas, Masashi Sugiyama
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme.
CoRR
(2019)
Paavo Parmas, Carl Edward Rasmussen, Jan Peters, Kenji Doya
PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos.
CoRR
(2019)
Paavo Parmas
Total stochastic gradient algorithms and applications in reinforcement learning.
CoRR
(2019)
Paavo Parmas
Total stochastic gradient algorithms and applications in reinforcement learning.
NeurIPS
(2018)
Paavo Parmas, Carl Edward Rasmussen, Jan Peters, Kenji Doya
PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos.
ICML
(2018)