Kevin Ro Wang
Publication Activity (10 Years)
Years Active: 2022-2023
Publications (10 Years): 3
Top Topics
Multilayer Perceptron
Biologically Plausible
Concept Space
Neural Architecture
Top Venues
EMNLP
CoRR
ICLR
Publications
Kevin Ro Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small.
ICLR (2023)
Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space.
EMNLP (2022)
Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space.
CoRR (2022)