Bochuan Cao
Publication Activity (10 Years)
Years Active: 2022-2024
Publications (10 Years): 17
Top Topics
Ad Hoc Information Retrieval
Diffusion Models
Language Model
Query Terms
Top Venues
CoRR
ACL (1)
NeurIPS
USENIX Security Symposium
Publications
Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen: Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM. ACL (1) (2024)
Tianrong Zhang, Bochuan Cao, Yuanpu Cao, Lu Lin, Prasenjit Mitra, Jinghui Chen: WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response. CoRR (2024)
Changjiang Li, Ren Pang, Bochuan Cao, Zhaohan Xi, Jinghui Chen, Shouling Ji, Ting Wang: On the Difficulty of Defending Contrastive Learning against Backdoor Attacks. USENIX Security Symposium (2024)
Yuanpu Cao, Tianrong Zhang, Bochuan Cao, Ziyi Yin, Lu Lin, Fenglong Ma, Jinghui Chen: Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization. CoRR (2024)
Yurui Chang, Bochuan Cao, Yujia Wang, Jinghui Chen, Lu Lin: XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution. CoRR (2024)
Guangliang Liu, Haitao Mao, Bochuan Cao, Zhiyu Xue, Kristen Marie Johnson, Jiliang Tang, Rongrong Wang: On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept. CoRR (2024)
Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu: Jailbreak Open-Sourced Large Language Models via Enforced Decoding. ACL (1) (2024)
Yuanpu Cao, Bochuan Cao, Jinghui Chen: Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections. NAACL-HLT (2024)
Changjiang Li, Ren Pang, Bochuan Cao, Jinghui Chen, Fenglong Ma, Shouling Ji, Ting Wang: Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models. CoRR (2024)
Yuanpu Cao, Bochuan Cao, Jinghui Chen: Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections. CoRR (2023)
Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen: Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM. CoRR (2023)
Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen: IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI. NeurIPS (2023)
Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen: IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI. CoRR (2023)
Changjiang Li, Ren Pang, Bochuan Cao, Zhaohan Xi, Jinghui Chen, Shouling Ji, Ting Wang: On the Difficulty of Defending Contrastive Learning against Backdoor Attacks. CoRR (2023)
Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu: On the Safety of Open-Sourced Large Language Models: Does Alignment Really Prevent Them From Being Misused? CoRR (2023)
Huaxiu Yao, Caroline Choi, Bochuan Cao, Yoonho Lee, Pang Wei Koh, Chelsea Finn: Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time. NeurIPS (2022)
Huaxiu Yao, Caroline Choi, Bochuan Cao, Yoonho Lee, Pang Wei Koh, Chelsea Finn: Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time. CoRR (2022)