Cong Han
ORCID
Publication Activity (10 Years)
Years Active: 2007-2024
Publications (10 Years): 61
Top Topics
Speech Synthesis
Language Model
Group Communication
Lightweight
Top Venues
CoRR
ICASSP
SLT
Interspeech
Publications
Hui Yu, Dongjie Peng, Cai Chen, Dongyan Chen, Cong Han
Probability-guaranteed distributed set-membership filtering over sensor networks: A stochastic communication protocol case.
Syst. Control. Lett.
188 (2024)
Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey
Unsupervised Multi-Channel Separation And Adaptation.
ICASSP
(2024)
Xilin Jiang, Cong Han, Nima Mesgarani
Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation.
CoRR
(2024)
Kang Zhao, Xinyu Zhao, Zhipeng Jin, Yi Yang, Wen Tao, Cong Han, Shuanglong Li, Lin Liu
Enhancing Baidu Multimodal Advertisement with Chinese Text-to-Image Generation via Bilingual Alignment and Caption Synthesis.
SIGIR
(2024)
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani
Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation.
ICASSP
(2024)
Xilin Jiang, Yinghao Aaron Li, Adrian Nicolas Florea, Cong Han, Nima Mesgarani
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis.
CoRR
(2024)
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani
Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience.
CoRR
(2024)
Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.
NeurIPS
(2023)
Cong Han, Nima Mesgarani
Online Binaural Speech Separation Of Moving Speakers With A Wavesplit Network.
ICASSP
(2023)
Weiwei Deng, Wei Du, Cong Han
Incorporating heterogeneous information in deep learning with informative meta-paths for community recommendations.
J. Inf. Sci.
49 (5) (2023)
Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani
Exploring Self-Supervised Contrastive Learning of Spatial Sound Event Representation.
CoRR
(2023)
Jilei Zhou, Guanran Jiang, Wei Du, Cong Han
Profiling temporal learning interests with time-aware transformers and knowledge graph for online course recommendation.
Electron. Commer. Res.
23 (4) (2023)
Bingxin Luo, Ziming Kou, Cong Han, Juan Wu, Shaowei Liu
A Faster and Lighter Detection Method for Foreign Objects in Coal Mine Belt Conveyors.
Sensors
23 (14) (2023)
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions.
CoRR
(2023)
Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey
Unsupervised Multi-channel Separation and Adaptation.
CoRR
(2023)
Cong Han, Nima Mesgarani
Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.
CoRR
(2023)
Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani
Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.
CoRR
(2023)
Cong Han, Yujie Zhong, Dengjie Li, Kai Han, Lin Ma
Zero-Shot Semantic Segmentation with Decoupled One-Pass Network.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform.
CoRR
(2023)
Cong Han, Yujie Zhong, Dengjie Li, Kai Han, Lin Ma
Open-Vocabulary Semantic Segmentation with Decoupled One-Pass Network.
ICCV
(2023)
Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani
Phoneme-Level BERT for Enhanced Prosody of Text-To-Speech with Grapheme Predictions.
ICASSP
(2023)
Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani
Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.
EMBC
(2023)
Cong Han, Bin Wen, Xin Wan
Research on Named Entity Recognition of Laboratory Safety Knowledge based on Deep Learning.
ICBASE
(2023)
Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.
CoRR
(2023)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.
WASPAA
(2023)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer From Style-Based TTS Models.
SLT
(2022)
Yuhong Li, Jiajie Li, Cong Han, Pan Li, Jinjun Xiong, Deming Chen
Extensible Proxy for Efficient NAS.
CoRR
(2022)
Cong Han, Emine Merve Kaya, Kyle Hoefer, Malcolm Slaney, Simon Carlile
Multi-Channel Speech Denoising for Machine Ears.
CoRR
(2022)
Cong Han, Emine Merve Kaya, Kyle Hoefer, Malcolm Slaney, Simon Carlile
Multi-Channel Speech Denoising for Machine Ears.
ICASSP
(2022)
Bowen Yang, Cong Han, Yu Li, Lei Zuo, Zhou Yu
Improving Conversational Recommendation Systems' Quality with Context-Aware Item Meta-Information.
NAACL-HLT (Findings)
(2022)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis.
CoRR
(2022)
Yinghao Aaron Li, Cong Han, Nima Mesgarani
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models.
CoRR
(2022)
Yi Luo, Cong Han, Nima Mesgarani
Group Communication With Context Codec for Lightweight Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process.
29 (2021)
Yi Luo, Zhuo Chen, Cong Han, Chenda Li, Tianyan Zhou, Nima Mesgarani
Rethinking The Separation Layers In Speech Separation Networks.
ICASSP
(2021)
Bowen Yang, Cong Han, Yu Li, Lei Zuo, Zhou Yu
Improving Conversational Recommendation Systems' Quality with Context-Aware Item Meta Information.
CoRR
(2021)
Yi Zhang, Keren Fu, Cong Han, Peng Cheng
Identity-and-pose-guided generative adversarial network for face rotation.
Neurocomputing
450 (2021)
Cong Han, Yi Luo, Nima Mesgarani
Binaural Speech Separation of Moving Speakers With Preserved Spatial Cues.
Interspeech
(2021)
Yi Luo, Cong Han, Nima Mesgarani
Distortion-Controlled Training for end-to-end Reverberant Speech Separation with Auxiliary Autoencoding Loss.
SLT
(2021)
Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian
Dual-Path Modeling for Long Recording Speech Separation in Meetings.
ICASSP
(2021)
Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian
Dual-Path Modeling for Long Recording Speech Separation in Meetings.
CoRR
(2021)
Yi Zhang, Keren Fu, Cong Han, Peng Cheng, Shanmin Yang, Xiao Yang
PGM-face: Pose-guided margin loss for cross-pose face recognition.
Neurocomputing
460 (2021)
Yi Luo, Cong Han, Nima Mesgarani
Empirical Analysis of Generalized Iterative Speech Separation Networks.
Interspeech
(2021)
Yi Luo, Cong Han, Nima Mesgarani
Ultra-Lightweight Speech Separation Via Group Communication.
ICASSP
(2021)
Chenda Li, Yi Luo, Cong Han, Jinyu Li, Takuya Yoshioka, Tianyan Zhou, Marc Delcroix, Keisuke Kinoshita, Christoph Böddeker, Yanmin Qian, Shinji Watanabe, Zhuo Chen
Dual-Path RNN for Long Recording Speech Separation.
SLT
(2021)
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen
Continuous Speech Separation Using Speaker Inventory for Long Recording.
Interspeech
(2021)
Siya Xu, Boxian Liao, Bo Hu, Cong Han, Chao Yang, Zhili Wang, Ao Xiong
A Reliability-and-Energy-Balanced Service Function Chain Mapping and Migration Method for Internet of Things.
IEEE Access
8 (2020)
Yi Luo, Cong Han, Nima Mesgarani
Ultra-Lightweight Speech Separation via Group Communication.
CoRR
(2020)
Bin Li, Xiao Yang, Daren Sun, Zhi Ji, Zhen Jiang, Cong Han, Dong Hao
Incentive Mechanism Design for ROI-constrained Auto-bidding.
CoRR
(2020)
Cong Han, Yi Luo, Nima Mesgarani
Real-time binaural speech separation with preserved spatial cues.
CoRR
(2020)
Yi Luo, Cong Han, Nima Mesgarani
Group Communication with Context Codec for Ultra-Lightweight Source Separation.
CoRR
(2020)
Yi Luo, Zhuo Chen, Cong Han, Chenda Li, Tianyan Zhou, Nima Mesgarani
Rethinking the Separation Layers in Speech Separation Networks.
CoRR
(2020)
Cong Han, Yi Luo, Nima Mesgarani
Real-Time Binaural Speech Separation with Preserved Spatial Cues.
ICASSP
(2020)
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording.
CoRR
(2020)
Cong Han, Siya Xu, Shaoyong Guo, Xuesong Qiu, Ao Xiong, Peng Yu, Kunya Guo, Dong Guo
A Multi-objective Service Function Chain Mapping Mechanism for IoT networks.
IWCMC
(2019)
Wenchao Zuo, Hongbin Ma, Xin Wang, Cong Han, Zhuang Li
Co-simulation of Omnidirectional Mobile Platform Based on Fuzzy Control.
ICIRA (2)
(2019)
Liwei Guo, Zeyan Li, Jinhao Lyu, Yuqian Mei, John C. Vardakis, Duanduan Chen, Cong Han, Xin Lou, Yiannis Ventikos
On the Validation of a Multiple-Network Poroelastic Model Using Arterial Spin Labeling MRI Data.
Frontiers Comput. Neurosci.
13 (2019)
Yi Luo, Enea Ceolini, Cong Han, Shih-Chii Liu, Nima Mesgarani
FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing.
CoRR
(2019)
Xiaoshuai Guo, Yinxiong Lu, Jianye Wang, Cong Han, Guoan Yang
A Research Based on Log Current Spectrum for Experiment Approaches to Diagnosis of Motor Broken Rotor Bar Fault.
CISP-BMEI
(2019)
Cong Han, Yi Luo, Nima Mesgarani
Online Deep Attractor Network for Real-time Single-channel Speech Separation.
ICASSP
(2019)
Yi Luo, Cong Han, Nima Mesgarani, Enea Ceolini, Shih-Chii Liu
FaSNet: Low-Latency Adaptive Beamforming for Multi-Microphone Audio Processing.
ASRU
(2019)
Yunpeng Yan, Zhengmin He, Gang Liu, Yanzuo Wang, Cong Han
The National Environmental and Geological Information System for Remote Sensing Survey and Monitoring.
IGARSS
(2015)
Cong Han
Experimental design for regression analysis when the responses are subject to censoring.
Comput. Methods Programs Biomed.
87 (2) (2007)