Bo Wu
Publication Activity (10 Years)
Years Active: 2011-2024
Publications (10 Years): 32
Top Topics
Subgraph Isomorphism
Graph Pattern Matching
Language Understanding
Data Placement
Top Venues
PACT
ICS
CoRR
LCPC
Publications
Akshit Sharma
,
Dinesh P. Mehta
,
Bo Wu
Understanding High-Performance Subgraph Pattern Matching: A Systems Perspective.
GRADES/NDA
(2024)
Benjamin Wagley
,
Pak Markthub
,
James Crea
,
Bo Wu
,
Mehmet Esat Belviranli
Exploring Page-based RDMA for Irregular GPU Workloads. A case study on NVMe-backed GNN Execution.
GPGPU@PPoPP
(2024)
Jin Zhou
,
Sam Silvestro
,
Steven (Jiaxun) Tang
,
Hanmei Yang
,
Hongyu Liu
,
Guangming Zeng
,
Bo Wu
,
Cong Liu
,
Tongping Liu
MemPerf: Profiling Allocator-Induced Performance Slowdowns.
Proc. ACM Program. Lang.
7 (OOPSLA2) (2023)
Connor Holmes
,
Minjia Zhang
,
Yuxiong He
,
Bo Wu
Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding.
CoRR
(2022)
Wei Han
,
Connor Holmes
,
Bo Wu
DGSM: A GPU-Based Subgraph Isomorphism framework with DFS exploration.
RSDHA@SC
(2022)
Daniel Mawhirter
,
Sam Reinehr
,
Wei Han
,
Noah Fields
,
Miles Claver
,
Connor Holmes
,
Jedidiah McClurg
,
Tongping Liu
,
Bo Wu
Dryadic: Flexible and Fast Graph Pattern Matching at Scale.
PACT
(2021)
Daniel Mawhirter
,
Sam Reinehr
,
Connor Holmes
,
Tongping Liu
,
Bo Wu
GraphZero: A High-Performance Subgraph Matching System.
ACM SIGOPS Oper. Syst. Rev.
55 (1) (2021)
Feng Zhang
,
Jidong Zhai
,
Bo Wu
,
Bingsheng He
,
Wenguang Chen
,
Xiaoyong Du
Automatic Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures.
IEEE Trans. Knowl. Data Eng.
33 (3) (2021)
Jordan Schmerge
,
Daniel Mawhirter
,
Connor Holmes
,
Jedidiah McClurg
,
Bo Wu
ELIχR: Eliminating Computation Redundancy in CNN-Based Video Processing.
RSDHA@SC
(2021)
Connor Holmes
,
Minjia Zhang
,
Yuxiong He
,
Bo Wu
NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM.
CoRR
(2021)
Connor Holmes
,
Minjia Zhang
,
Yuxiong He
,
Bo Wu
NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM.
NeurIPS
(2021)
Connor Holmes
,
Daniel Mawhirter
,
Yuxiong He
,
Feng Yan
,
Bo Wu
GRNN: Low-Latency and Scalable RNN Inference on GPUs.
EuroSys
(2019)
Wei Han
,
Daniel Mawhirter
,
Bo Wu
,
Lin Ma
,
Chen Tian
FLARE: Flexibly Sharing Commodity GPUs to Enforce QoS and Improve Utilization.
LCPC
(2019)
Daniel Mawhirter
,
Sam Reinehr
,
Connor Holmes
,
Tongping Liu
,
Bo Wu
GraphZero: Breaking Symmetry for Efficient Graph Mining.
CoRR
(2019)
Wei Zhang
,
Weihao Cui
,
Kaihua Fu
,
Quan Chen
,
Daniel Edward Mawhirter
,
Bo Wu
,
Chao Li
,
Minyi Guo
Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters.
ICS
(2019)
Daniel Mawhirter
,
Bo Wu
AutoMine: harmonizing high-level abstraction and high performance for graph mining.
SOSP
(2019)
Qi Zhu
,
Bo Wu
,
Xipeng Shen
,
Kai Shen
,
Li Shen
,
Zhiying Wang
Resolving the GPU responsiveness dilemma through program transformations.
Frontiers Comput. Sci.
12 (3) (2018)
Zhen Peng
,
Alexander Powell
,
Bo Wu
,
Tekin Bicer
,
Bin Ren
Graphphi: efficient parallel graph processing on emerging throughput-oriented architectures.
PACT
(2018)
Daniel Mawhirter
,
Bo Wu
,
Dinesh P. Mehta
,
Chao Ai
ApproxG: Fast Approximate Parallel Graphlet Counting Through Accuracy Control.
CCGrid
(2018)
Junqiao Qiu
,
Zhijia Zhao
,
Bo Wu
,
Abhinav Vishnu
,
Shuaiwen Leon Song
Enabling scalability-sensitive speculative parallelization for FSM computations.
ICS
(2017)
Wei Han
,
Daniel Mawhirter
,
Bo Wu
,
Matthew Buland
Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU.
PACT
(2017)
Qi Zhu
,
Bo Wu
,
Xipeng Shen
,
Li Shen
,
Zhiying Wang
Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems.
IPDPS
(2017)
Feng Zhang
,
Bo Wu
,
Jidong Zhai
,
Bingsheng He
,
Wenguang Chen
FinePar: irregularity-aware fine-grained workload partitioning on integrated architectures.
CGO
(2017)
Jing Zheng
,
Jianhua Sun
,
Kun Sun
,
Bo Wu
,
Qi Li
Cookie-based amplification repression protocol.
IPCCC
(2017)
Bo Wu
,
Xu Liu
,
Xiaobo Zhou
,
Changjun Jiang
FLEP: Enabling Flexible and Efficient Preemption on GPUs.
ASPLOS
(2017)
Guoyang Chen
,
Xipeng Shen
,
Bo Wu
,
Dong Li
Optimizing Data Placement on GPU Memory: A Portable Approach.
IEEE Trans. Computers
66 (3) (2017)
Qi Zhu
,
Bo Wu
,
Xipeng Shen
,
Kai Shen
,
Li Shen
,
Zhiying Wang
Understanding co-run performance on CPU-GPU integrated processors: observations, insights, directions.
Frontiers Comput. Sci.
11 (1) (2017)
Mingzhou Zhou
,
Bo Wu
,
Xipeng Shen
,
Yaoqing Gao
,
Graham Yiu
Examining and Reducing the Influence of Sampling Errors on Feedback-Driven Optimizations.
ACM Trans. Archit. Code Optim.
13 (1) (2016)
Xu Liu
,
Bo Wu
ScaAnalyzer: a tool to identify memory scalability bottlenecks in parallel programs.
SC
(2015)
Guoyang Chen
,
Bo Wu
,
Dong Li
,
Xipeng Shen
Enabling Portable Optimizations of Data Placement on GPU.
IEEE Micro
35 (4) (2015)
Qi Zhu
,
Meng Zhu
,
Bo Wu
,
Xipeng Shen
,
Kai Shen
,
Zhiying Wang
Software Engagement with Sleeping CPUs.
HotOS
(2015)
Bo Wu
,
Guoyang Chen
,
Dong Li
,
Xipeng Shen
,
Jeffrey S. Vetter
Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations.
ICS
(2015)
Guoyang Chen
,
Bo Wu
,
Dong Li
,
Xipeng Shen
PORPLE: An Extensible Optimizer for Portable Data Placement on GPU.
MICRO
(2014)
Bo Wu
,
Guoyang Chen
,
Dong Li
,
Xipeng Shen
,
Jeffrey S. Vetter
SM-centric transformation: circumventing hardware restrictions for flexible GPU scheduling.
PACT
(2014)
Zhijia Zhao
,
Bo Wu
,
Xipeng Shen
Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation.
ASPLOS
(2014)
Qi Zhu
,
Bo Wu
,
Xipeng Shen
,
Li Shen
,
Zhiying Wang
Understanding Co-run Degradations on Integrated Heterogeneous Processors.
LCPC
(2014)
Zhijia Zhao
,
Bo Wu
,
Mingzhou Zhou
,
Yufei Ding
,
Jianhua Sun
,
Xipeng Shen
,
Youfeng Wu
Call sequence prediction through probabilistic calling automata.
OOPSLA
(2014)
Bo Wu
,
Weilin Wang
,
Xipeng Shen
Software-level scheduling to exploit non-uniformly shared data cache on GPGPU.
MSPC@PLDI
(2013)
Bo Wu
,
Mingzhou Zhou
,
Xipeng Shen
,
Yaoqing Gao
,
Raúl Silvera
,
Graham Yiu
Simple Profile Rectifications Go a Long Way - Statistically Exploring and Alleviating the Effects of Sampling Errors for Program Optimizations.
ECOOP
(2013)
Bo Wu
,
Zhijia Zhao
,
Eddy Zheng Zhang
,
Yunlian Jiang
,
Xipeng Shen
Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU.
PPOPP
(2013)
Bin Wang
,
Bo Wu
,
Dong Li
,
Xipeng Shen
,
Weikuan Yu
,
Yizheng Jiao
,
Jeffrey S. Vetter
Exploring hybrid memory for GPU energy efficiency through software-hardware co-design.
PACT
(2013)
Mingzhou Zhou
,
Bo Wu
,
Yufei Ding
,
Xipeng Shen
Profmig: A framework for flexible migration of program profiles across software versions.
CGO
(2013)
Zhijia Zhao
,
Bo Wu
,
Xipeng Shen
Speculative parallelization needs rigor: probabilistic analysis for optimal speculation of finite-state machine applications.
PACT
(2012)
Bo Wu
,
Zhijia Zhao
,
Xipeng Shen
,
Yunlian Jiang
,
Yaoqing Gao
,
Raúl Silvera
Exploiting inter-sequence correlations for program behavior prediction.
OOPSLA
(2012)
Ziyu Guo
,
Bo Wu
,
Xipeng Shen
One stone two birds: synchronization relaxation and redundancy removal in GPU-CPU translation.
ICS
(2012)
Bo Wu
,
Eddy Z. Zhang
,
Xipeng Shen
Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control.
PACT
(2011)