Elvis Rojas
Publication Activity (10 Years)
Years Active: 2019-2022
Publications (10 Years): 8
Top Topics
Data Corruption
Intra Class
Learning Models
Floating Point
Top Venues
CLUSTER
SBAC-PAD
CoRR
Euro-Par
Publications
Elvis Rojas, Diego Pérez, Esteban Meneses
Exploring the Effects of Silent Data Corruption in Distributed Deep Learning Training.
SBAC-PAD (2022)
Elvis Rojas, Michael Knobloch, Nour Daoud, Esteban Meneses, Bernd Mohr
Early Experiences of Noise-Sensitivity Performance Analysis of a Distributed Deep Learning Framework.
CLUSTER (2022)
Elvis Rojas, Esteban Meneses, Terry Jones, Don Maxwell
Understanding failures through the lifetime of a top-level supercomputer.
J. Parallel Distributed Comput. 154 (2021)
Elvis Rojas, Diego Pérez, Jon C. Calhoun, Leonardo Bautista-Gomez, Terry Jones, Esteban Meneses
Understanding Soft Error Sensitivity of Deep Learning Models and Frameworks through Checkpoint Alteration.
CLUSTER (2021)
Elvis Rojas, Fabricio Quirós-Corella, Terry Jones, Esteban Meneses
Large-Scale Distributed Deep Learning: A Study of Mechanisms and Trade-Offs with PyTorch.
CARLA (2021)
Elvis Rojas, Esteban Meneses, Terry Jones, Don Maxwell
Towards a Model to Estimate the Reliability of Large-Scale Hybrid Supercomputers.
Euro-Par (2020)
Elvis Rojas, Albert Njoroge Kahira, Esteban Meneses, Leonardo Bautista-Gomez, Rosa M. Badia
A Study of Checkpointing in Large Scale Training of Deep Neural Networks.
CoRR (2020)
Elvis Rojas, Esteban Meneses, Terry Jones, Don Maxwell
Analyzing a Five-Year Failure Record of a Leadership-Class Supercomputer.
SBAC-PAD (2019)