Ruibo Fu
Publication Activity (10 Years)
Years Active: 2018-2024
Publications (10 Years): 67
Top Topics
Diffusion Model
Speech Synthesis
Multimedia
Autoregressive
Top Venues
CoRR
ICASSP
INTERSPEECH
IEEE ACM Trans. Audio Speech Lang. Process.
Publications
Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation.
CoRR
(2024)
Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection.
CoRR
(2024)
Cong Cai, Shan Liang, Xuefei Liu, Kang Zhu, Zhengqi Wen, Jianhua Tao, Heng Xie, Jizhou Cui, Yiming Ma, Zhenhua Cheng, Hanzhe Xu, Ruibo Fu, Bin Liu, Yongwei Li
MDPE: A Multimodal Deception Dataset with Personality and Emotional Characteristics.
CoRR
(2024)
Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Zhiyong Wang, Yi Lu, Xiaopeng Wang, Shuchen Shi, Yukun Liu, Xuefei Liu, Shuai Zhang
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation.
CoRR
(2024)
Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li
MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation.
CoRR
(2024)
Yuankun Xie, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Xiaopeng Wang, Haonan Cheng, Long Ye, Jianhua Tao
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy.
CoRR
(2024)
Ruibo Fu, Rui Liu, Chunyu Qiang, Yingming Gao, Yi Lu, Shuchen Shi, Tao Wang, Ya Li, Zhengqi Wen, Chen Zhang, Hui Bu, Yukun Liu, Xin Qi, Guanjun Li
ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024.
CoRR
(2024)
Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi
Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio.
CoRR
(2024)
Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi
Generalized Fake Audio Detection via Deep Stable Learning.
CoRR
(2024)
Ruihan Jin, Ruibo Fu, Zhengqi Wen, Shuai Zhang, Yukun Liu, Jianhua Tao
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models.
CoRR
(2024)
Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv
Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding.
ICASSP
(2024)
Xiaopeng Wang, Yi Lu, Xin Qi, Zhiyong Wang, Yuankun Xie, Shuchen Shi, Ruibo Fu
A Multi-Speaker Multi-Lingual Voice Cloning System Based on VITS2 for LIMMITS 2024 Challenge.
CoRR
(2024)
Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang
Emotion selectable end-to-end text-based speech editing.
Artif. Intell.
329 (2024)
Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio.
CoRR
(2024)
Jiangyan Yi, Chenglong Wang, Jianhua Tao, Chuyuan Zhang, Cunhang Fan, Zhengkun Tian, Haoxin Ma, Ruibo Fu
SceneFake: An initial dataset and benchmarks for scene fake audio detection.
Pattern Recognit.
152 (2024)
Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang
Learning Speech Representation from Contrastive Token-Acoustic Pretraining.
ICASSP
(2024)
Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang
Learning Speech Representation From Contrastive Token-Acoustic Pretraining.
CoRR
(2023)
Andreas Triantafyllopoulos, Björn W. Schuller, Gökçe Iymen, Tevfik Metin Sezgin, Xiangheng He, Zijiang Yang, Panagiotis Tzirakis, Shuo Liu, Silvan Mertes, Elisabeth André, Ruibo Fu, Jianhua Tao
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era.
Proc. IEEE
111 (10) (2023)
Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li
ADD 2023: the Second Audio Deepfake Detection Challenge.
CoRR
(2023)
Chenglong Wang, Jiangyan Yi, Xiaohui Zhang, Jianhua Tao, Xinrui Yan, Le Xu, Ruibo Fu
Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection.
DADA@IJCAI
(2023)
Haogeng Liu, Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion.
CoRR
(2023)
Xiaohui Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Le Xu, Ruibo Fu
Adaptive Fake Audio Detection with Low-Rank Model Squeezing.
DADA@IJCAI
(2023)
Chenglong Wang, Jiangyan Yi, Xiaohui Zhang, Jianhua Tao, Le Xu, Ruibo Fu
Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection.
CoRR
(2023)
Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding.
CoRR
(2023)
Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li
ADD 2023: the Second Audio Deepfake Detection Challenge.
DADA@IJCAI
(2023)
Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv
Learning to Behave Like Clean Speech: Dual-Branch Knowledge Distillation for Noise-Robust Fake Audio Detection.
CoRR
(2023)
Jiangyan Yi, Jianhua Tao, Ruibo Fu, Tao Wang, Chu Yuan Zhang, Chenglong Wang
Adversarial Multi-Task Learning for Mandarin Prosodic Boundary Prediction With Multi-Modal Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Xiaohui Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Le Xu, Ruibo Fu
Adaptive Fake Audio Detection with Low-Rank Model Squeezing.
CoRR
(2023)
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chuyuan Zhang, Shuai Zhang, Ruibo Fu, Xun Chen
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection.
CoRR
(2023)
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Shuai Zhang, Ruibo Fu, Xun Chen
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection.
INTERSPEECH
(2023)
Andreas Triantafyllopoulos, Björn W. Schuller, Gökçe Iymen, Tevfik Metin Sezgin, Xiangheng He, Zijiang Yang, Panagiotis Tzirakis, Shuo Liu, Silvan Mertes, Elisabeth André, Ruibo Fu, Jianhua Tao
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era.
CoRR
(2022)
Xinrui Yan, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Haoxin Ma, Tao Wang, Shiming Wang, Ruibo Fu
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio.
CoRR
(2022)
Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Tao Wang, Jiangyan Yi, Liqun Deng, Ruibo Fu, Jianhua Tao, Zhengqi Wen
Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing.
ICASSP
(2022)
Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao
Singing-Tacotron: Global Duration Control Attention and Dynamic Filter for End-to-end Singing Voice Synthesis.
DDAM@MM
(2022)
Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation.
CoRR
(2022)
Xinrui Yan, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Haoxin Ma, Tao Wang, Shiming Wang, Ruibo Fu
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio.
DDAM@MM
(2022)
Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia.
ACM Multimedia
(2022)
Haoxin Ma, Jiangyan Yi, Chenglong Wang, Xinrui Yan, Jianhua Tao, Tao Wang, Shiming Wang, Le Xu, Ruibo Fu
FAD: A Chinese Dataset for Fake Audio Detection.
CoRR
(2022)
Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen
Singing-Tacotron: Global duration control attention and dynamic filter for End-to-end singing voice synthesis.
CoRR
(2022)
Jiangyan Yi, Chenglong Wang, Jianhua Tao, Zhengkun Tian, Cunhang Fan, Haoxin Ma, Ruibo Fu
SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection.
CoRR
(2022)
Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS.
CoRR
(2022)
Xinrui Yan, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Haoxin Ma, Zhengkun Tian, Ruibo Fu
System Fingerprints Detection for DeepFake Audio: An Initial Dataset and Investigation.
CoRR
(2022)
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu
Fully Automated End-to-End Fake Audio Detection.
DDAM@MM
(2022)
Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang
Emotion Selectable End-to-End Text-based Speech Editing.
CoRR
(2022)
Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li
ADD 2022: the first Audio Deep Synthesis Detection Challenge.
ICASSP
(2022)
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu
Fully Automated End-to-End Fake Audio Detection.
CoRR
(2022)
Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu
ADD 2022: the First Audio Deep Synthesis Detection Challenge.
CoRR
(2022)
Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing.
CoRR
(2022)
Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Chunyu Qiang, Shiming Wang
Prosody and Voice Factorization for Few-Shot Speaker Adaptation in the Challenge M2voc 2021.
ICASSP
(2021)
Jiangyan Yi, Ye Bai, Jianhua Tao, Haoxin Ma, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu
Half-Truth: A Partially Fake Audio Detection Dataset.
INTERSPEECH
(2021)
Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS.
ISCSLP
(2021)
Shiming Wang, Zhenhua Ling, Ruibo Fu, Jiangyan Yi, Jianhua Tao
Patnet: A Phoneme-Level Autoregressive Transformer Network for Speech Synthesis.
ICASSP
(2021)
Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang, Chunyu Qiang
Bi-Level Style and Prosody Decoupling Modeling for Personalized End-to-End Speech Synthesis.
ICASSP
(2021)
Jiangyan Yi, Ye Bai, Jianhua Tao, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu
Half-Truth: A Partially Fake Audio Detection Dataset.
CoRR
(2021)
Tao Wang, Xuefei Liu, Jianhua Tao, Jiangyan Yi, Ruibo Fu, Zhengqi Wen
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding.
INTERSPEECH
(2020)
Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang
Focusing on Attention: Prosody Transfer and Adaptative Optimization Strategy for Multi-Speaker End-to-End Speech Synthesis.
ICASSP
(2020)
Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Chunyu Qiang, Tao Wang
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis.
INTERSPEECH
(2020)
Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Rongxiu Zhong
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation.
INTERSPEECH
(2020)
Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Chunyu Qiang
Bi-Level Speaker Supervision for One-Shot Speech Synthesis.
INTERSPEECH
(2020)
Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang, Chunyu Qiang
Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis.
INTERSPEECH
(2020)
Ruibo Fu, Jianhua Tao, Zhengqi Wen, Yibin Zheng
Phoneme Dependent Speaker Embedding and Model Factorization for Multi-speaker Speech Synthesis and Adaptation.
ICASSP
(2019)
Ruibo Fu, Jianhua Tao, Yibin Zheng, Zhengqi Wen
Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer.
INTERSPEECH
(2018)
Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ruibo Fu
On the Application and Compression of Deep Time Delay Neural Network for Embedded Statistical Parametric Speech Synthesis.
INTERSPEECH
(2018)
Ruibo Fu, Jianhua Tao, Yibin Zheng, Zhengqi Wen
Transfer Learning Based Progressive Neural Networks for Acoustic Modeling in Statistical Parametric Speech Synthesis.
INTERSPEECH
(2018)