Ming Li
Publication Activity (10 Years)
Years Active: 2000-2024
Publications (10 Years): 98
Top Topics
Language Identification
Speaker Verification
Voice Activity Detection
Neural Network
Top Venues
CoRR
ICASSP
INTERSPEECH
IEEE ACM Trans. Audio Speech Lang. Process.
Publications
Danwei Cai
,
Ming Li
Leveraging ASR Pretrained Conformers for Speaker Verification Through Transfer Learning and Knowledge Distillation.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Zexin Cai
,
Ming Li
Integrating frame-level boundary detection and deepfake detection for locating manipulated regions in partially spoofed audio forgery attacks.
Comput. Speech Lang.
85 (2024)
Weiqing Wang
,
Danwei Cai
,
Ming Cheng
,
Ming Li
Joint Inference of Speaker Diarization and ASR with Multi-Stage Information Sharing.
ICASSP
(2024)
Xiaoyi Qin
,
Na Li
,
Shufei Duan
,
Ming Li
Investigating Long-Term and Short-Term Time-Varying Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process.
32 (2024)
Zexin Cai
,
Ming Li
Invertible Voice Conversion with Parallel Data.
ICASSP
(2024)
Rongqi Bei
,
Yajie Liu
,
Yihe Wang
,
Yuxuan Huang
,
Ming Li
,
Yuhang Zhao
,
Xin Tong
StarRescue: the Design and Evaluation of A Turn-Taking Collaborative Game for Facilitating Autistic Children's Social Skills.
CHI
(2024)
Bang Zeng
,
Hongbin Suo
,
Yulong Wan
,
Ming Li
SEF-Net: Speaker Embedding Free Target Speaker Extraction Network.
INTERSPEECH
(2023)
Yu Hou
,
Cong Tran
,
Ming Li
,
Won-Yong Shin
Graph Neural Network-Aided Exploratory Learning for Community Detection with Unknown Topology.
CoRR
(2023)
Haoxu Wang
,
Fan Yu
,
Xian Shi
,
Yuezhang Wang
,
Shiliang Zhang
,
Ming Li
SlideSpeech: A Large-Scale Slide-Enriched Audio-Visual Corpus.
CoRR
(2023)
Danwei Cai
,
Zexin Cai
,
Ming Li
Identifying Source Speakers for Voice Conversion Based Spoofing Attacks on Speaker Verification Systems.
ICASSP
(2023)
Yucong Zhang
,
Hongbin Suo
,
Yulong Wan
,
Ming Li
Outlier-aware Inlier Modeling and Multi-scale Scoring for Anomalous Sound Detection via Multitask Learning.
INTERSPEECH
(2023)
Xiaoyi Qin
,
Xingming Wang
,
Yanli Chen
,
Qinglin Meng
,
Ming Li
From Speaker Verification to Deepfake Algorithm Recognition: Our Learned Lessons from ADD2023 Track 3.
DADA@IJCAI
(2023)
Weicong Chen
,
Dong Zhang
,
Ming Li
,
Dah-Jye Lee
STCAM: Spatial-Temporal and Channel Attention Module for Dynamic Facial Expression Recognition.
IEEE Trans. Affect. Comput.
14 (1) (2023)
Zhesi Zhu
,
Dong Zhang
,
Cailong Chi
,
Ming Li
,
Dah-Jye Lee
A Complementary Dual-Branch Network for Appearance-Based Gaze Estimation From Low-Resolution Facial Image.
IEEE Trans. Cogn. Dev. Syst.
15 (3) (2023)
Xingming Wang
,
Bang Zeng
,
Hongbin Suo
,
Yulong Wan
,
Ming Li
Robust Audio Anti-spoofing Countermeasure with Joint Training of Front-end and Back-end Models.
INTERSPEECH
(2023)
Haoxu Wang
,
Ming Cheng
,
Qiang Fu
,
Ming Li
The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis.
ICASSP
(2023)
Jianing Teng
,
Dong Zhang
,
Wei Zou
,
Ming Li
,
Dah-Jye Lee
Typical Facial Expression Network Using a Facial Feature Decoupler and Spatial-Temporal Learning.
IEEE Trans. Affect. Comput.
14 (2) (2023)
Yucong Zhang
,
Hongbin Suo
,
Yulong Wan
,
Ming Li
Outlier-aware Inlier Modeling and Multi-scale Scoring for Anomalous Sound Detection via Multitask Learning.
CoRR
(2023)
Ming Cheng
,
Yingying Zhang
,
Yixiang Xie
,
Yueran Pan
,
Xiao Li
,
Wenxing Liu
,
Chengyan Yu
,
Dong Zhang
,
Yu Xing
,
Xiaoqian Huang
,
Fang Wang
,
Cong You
,
Yuanyuan Zou
,
Yuchong Liu
,
Fengjing Liang
,
Huilin Zhu
,
Chun Tang
,
Hongzhu Deng
,
Xiaobing Zou
,
Ming Li
Computer-Aided Autism Spectrum Disorder Diagnosis With Behavior Signal Processing.
IEEE Trans. Affect. Comput.
14 (4) (2023)
Zexin Cai
,
Yaogen Yang
,
Ming Li
Cross-lingual multi-speaker speech synthesis with limited bilingual training data.
Comput. Speech Lang.
77 (2023)
Yaogen Yang
,
Haozhe Zhang
,
Zexin Cai
,
Yao Shi
,
Ming Li
,
Dong Zhang
,
Xiaojun Ding
,
Jianhua Deng
,
Jie Wang
Electrolaryngeal speech enhancement based on a two stage framework with bottleneck feature refinement and voice conversion.
Biomed. Signal Process. Control.
80 (Part) (2023)
Xiaoyi Qin
,
Danwei Cai
,
Ming Li
Robust Multi-Channel Far-Field Speaker Verification Under Different In-Domain Data Availability Scenarios.
IEEE ACM Trans. Audio Speech Lang. Process.
31 (2023)
Wenxing Liu
,
Ming Cheng
,
Yueran Pan
,
Lynn Yuan
,
Suxiu Hu
,
Ming Li
,
Songtian Zeng
Assessing the Social Skills of Children with Autism Spectrum Disorder via Language-Image Pre-training Models.
PRCV (13)
(2023)
Xiao Li
,
Dong Zhang
,
Ming Li
,
Dah-Jye Lee
Accurate Head Pose Estimation Using Image Rectification and a Lightweight Convolutional Neural Network.
IEEE Trans. Multim.
25 (2023)
Weiqing Wang
,
Xiaoyi Qin
,
Ming Li
Cross-Channel Attention-Based Target Speaker Voice Activity Detection: Experimental Results for the M2met Challenge.
ICASSP
(2022)
Xiaoyi Qin
,
Na Li
,
Chao Weng
,
Dan Su
,
Ming Li
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings.
CoRR
(2022)
Yikang Wang
,
Xingming Wang
,
Hiromitsu Nishizaki
,
Ming Li
Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities.
CoRR
(2022)
Yucong Zhang
,
Qingjian Lin
,
Weiqing Wang
,
Lin Yang
,
Xuyang Wang
,
Junjie Wang
,
Ming Li
Low-Latency Online Speaker Diarization with Graph-Based Label Generation.
Odyssey
(2022)
Yanze Xu
,
Weiqing Wang
,
Huahua Cui
,
Mingyang Xu
,
Ming Li
Paralinguistic singing attribute recognition using supervised machine learning for describing the classical tenor solo singing voice in vocal pedagogy.
EURASIP J. Audio Speech Music. Process.
2022 (1) (2022)
Hua Hua
,
Ziyi Chen
,
Yuxiang Zhang
,
Ming Li
,
Pengyuan Zhang
Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion.
DDAM@MM
(2022)
Haozhe Zhang
,
Zexin Cai
,
Xiaoyi Qin
,
Ming Li
SIG-VC: A Speaker Information Guided Zero-Shot Voice Conversion System for Both Human Beings and Machines.
ICASSP
(2022)
Zexin Cai
,
Weiqing Wang
,
Ming Li
Waveform Boundary Detection for Partially Spoofed Audio.
CoRR
(2022)
Xiaoyi Qin
,
Na Li
,
Chao Weng
,
Dan Su
,
Ming Li
Simple Attention Module Based Speaker Verification with Iterative Noisy Label Detection.
ICASSP
(2022)
Xingming Wang
,
Xiaoyi Qin
,
Yikang Wang
,
Yunfei Xu
,
Ming Li
The DKU-OPPO System for the 2022 Spoofing-Aware Speaker Verification Challenge.
INTERSPEECH
(2022)
Danwei Cai
,
Weiqing Wang
,
Ming Li
Incorporating Visual Information in Audio Based Self-Supervised Speaker Recognition.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Weiqing Wang
,
Qingjian Lin
,
Danwei Cai
,
Ming Li
Similarity Measurement of Segment-Level Speaker Embeddings in Speaker Diarization.
IEEE ACM Trans. Audio Speech Lang. Process.
30 (2022)
Xiaoyi Qin
,
Na Li
,
Yuke Lin
,
Yiwei Ding
,
Chao Weng
,
Dan Su
,
Ming Li
The DKU-Tencent System for the VoxCeleb Speaker Recognition Challenge 2022.
CoRR
(2022)
Weiqing Wang
,
Qingjian Lin
,
Ming Li
Online Target Speaker Voice Activity Detection for Speaker Diarization.
CoRR
(2022)
Weiqing Wang
,
Ming Li
,
Qingjian Lin
Online Target Speaker Voice Activity Detection for Speaker Diarization.
INTERSPEECH
(2022)
Yikang Wang
,
Xingming Wang
,
Hiromitsu Nishizaki
,
Ming Li
Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities.
ISCSLP
(2022)
Qingjian Lin
,
Lin Yang
,
Xuyang Wang
,
Xiaoyi Qin
,
Junjie Wang
,
Ming Li
Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification.
ICASSP
(2022)
Danwei Cai
,
Zexin Cai
,
Ming Li
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems.
CoRR
(2022)
Haoxu Wang
,
Yan Jia
,
Zeqing Zhao
,
Xuyang Wang
,
Junjie Wang
,
Ming Li
Generating TTS Based Adversarial Samples for Training Wake-Up Word Detection Systems Against Confusing Words.
Odyssey
(2022)
Weiqing Wang
,
Ming Li
Incorporating End-to-End Framework Into Target-Speaker Voice Activity Detection.
ICASSP
(2022)
Yuke Lin
,
Xiaoyi Qin
,
Huahua Cui
,
Zhenyi Zhu
,
Ming Li
Laugh Betrays You? Learning Robust Speaker Representation From Speech Containing Non-Verbal Fragments.
CoRR
(2022)
Xiaoyi Qin
,
Na Li
,
Chao Weng
,
Dan Su
,
Ming Li
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings.
INTERSPEECH
(2022)
Yuxiang Zhang
,
Jingze Lu
,
Xingming Wang
,
Zhuo Li
,
Runqiu Xiao
,
Wenchao Wang
,
Ming Li
,
Pengyuan Zhang
Deepfake Detection System for the ADD Challenge Track 3.2 Based on Score Fusion.
DDAM@MM
(2022)
Ming Cheng
,
Haoxu Wang
,
Yechen Wang
,
Ming Li
The DKU Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge.
ICASSP
(2022)
Xingming Wang
,
Xiaoyi Qin
,
Yikang Wang
,
Yunfei Xu
,
Ming Li
The DKU-OPPO System for the 2022 Spoofing-Aware Speaker Verification Challenge.
CoRR
(2022)
Weiqing Wang
,
Jin Pan
,
Hua Yi
,
Zhanmei Song
,
Ming Li
Audio-Based Piano Performance Evaluation for Beginners With Convolutional Neural Network and Attention Mechanism.
IEEE ACM Trans. Audio Speech Lang. Process.
29 (2021)
Ran Ju
,
Huangrui Chu
,
Yechen Wang
,
Qi Deng
,
Ming Cheng
,
Ming Li
A Multimodal Dynamic Neural Network for Call for Help Recognition in Elevators.
ICMI Companion
(2021)
Huangrui Chu
,
Yechen Wang
,
Ran Ju
,
Yan Jia
,
Haoxu Wang
,
Ming Li
,
Qi Deng
Call For Help Detection In Emergent Situations Using Keyword Spotting And Paralinguistic Analysis.
ICMI Companion
(2021)
Tinglong Zhu
,
Xiaoyi Qin
,
Ming Li
Binary Neural Network for Speaker Verification.
INTERSPEECH
(2021)
Yan Jia
,
Xingming Wang
,
Xiaoyi Qin
,
Yinping Zhang
,
Xuyang Wang
,
Junjie Wang
,
Dong Zhang
,
Ming Li
The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results.
INTERSPEECH
(2021)
Weiqing Wang
,
Danwei Cai
,
Jin Wang
,
Qingjian Lin
,
Xuyang Wang
,
Mi Hong
,
Ming Li
The DKU-Duke-Lenovo System Description for the Fearless Steps Challenge Phase III.
INTERSPEECH
(2021)
Xiaoyi Qin
,
Chao Wang
,
Yong Ma
,
Min Liu
,
Shilei Zhang
,
Ming Li
Our Learned Lessons from Cross-Lingual Speaker Verification: The CRMI-DKU System Description for the Short-Duration Speaker Verification Challenge 2021.
INTERSPEECH
(2021)
Wenbo Liu
,
Ming Li
,
Xiaobing Zou
,
Bhiksha Raj
Discriminative Dictionary Learning for Autism Spectrum Disorder Identification.
Frontiers Comput. Neurosci.
15 (2021)
Ming Li
,
Hao Xu
,
Xingchang Huang
,
Zhanmei Song
,
Xiaolin Liu
,
Xin Li
Facial Expression Recognition with Identity and Emotion Joint Learning.
IEEE Trans. Affect. Comput.
12 (2) (2021)
Xiaoyi Qin
,
Ming Li
,
Hui Bu
,
Rohan Kumar Das
,
Wei Rao
,
Shrikanth Narayanan
,
Haizhou Li
The FFSVC 2020 Evaluation Plan.
CoRR
(2020)
Haiwei Wu
,
Yan Jia
,
Yuanfei Nie
,
Ming Li
Multi-task Learning with Alignment Loss for Far-field Small-Footprint Keyword Spotting.
CoRR
(2020)
Weicheng Cai
,
Jinkun Chen
,
Jun Zhang
,
Ming Li
On-the-Fly Data Loader and Utterance-Level Aggregation for Speaker and Language Recognition.
IEEE ACM Trans. Audio Speech Lang. Process.
28 (2020)
Danwei Cai
,
Weicheng Cai
,
Ming Li
Within-sample variability-invariant loss for robust speaker recognition under noisy environments.
CoRR
(2020)
Sheng Sun
,
Shuangmei Li
,
Wenbo Liu
,
Xiaobing Zou
,
Ming Li
Fixation Based Object Recognition in Autism Clinic Setting.
ICIRA (4)
(2019)
Haiwei Wu
,
Weicheng Cai
,
Ming Li
,
Ji Gao
,
Shanshan Zhang
,
Zhiqiang Lyu
,
Shen Huang
DKU-Tencent Submission to Oriental Language Recognition AP18-OLR Challenge.
APSIPA
(2019)
Weicheng Cai
,
Haiwei Wu
,
Danwei Cai
,
Ming Li
The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion.
CoRR
(2019)
Weicheng Cai
,
Haiwei Wu
,
Danwei Cai
,
Ming Li
The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion.
INTERSPEECH
(2019)
Zexin Cai
,
Zhicheng Xu
,
Ming Li
F0 Contour Estimation Using Phonetic Feature in Electrolaryngeal Speech Enhancement.
ICASSP
(2019)
Weicheng Cai
,
Danwei Cai
,
Shen Huang
,
Ming Li
Utterance-level end-to-end language identification using attention-based CNN-BLSTM.
CoRR
(2019)
Weicheng Cai
,
Danwei Cai
,
Shen Huang
,
Ming Li
Utterance-level End-to-end Language Identification Using Attention-based CNN-BLSTM.
ICASSP
(2019)
Zexin Cai
,
Yaogen Yang
,
Chuxiong Zhang
,
Xiaoyi Qin
,
Ming Li
Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features.
CoRR
(2019)
Weiqing Wang
,
Haiwei Wu
,
Ming Li
Deep Neural Networks with Batch Speaker Normalization for Intoxicated Speech Detection.
APSIPA
(2019)
Zhicheng Li
,
Bin Hu
,
Ming Li
,
Gengnan Luo
String Stability Analysis for Vehicle Platooning Under Unreliable Communication Links With Event-Triggered Strategy.
IEEE Trans. Veh. Technol.
68 (3) (2019)
Danwei Cai
,
Zexin Cai
,
Ming Li
Deep Speaker Embeddings with Convolutional Neural Network on Supervector for Text-Independent Speaker Recognition.
APSIPA
(2018)
Weicheng Cai
,
Zexin Cai
,
Wenbo Liu
,
Xiaoqi Wang
,
Ming Li
Insights into End-to-End Learning Scheme for Language Identification.
ICASSP
(2018)
Weicheng Cai
,
Zexin Cai
,
Xiang Zhang
,
Xiaoqi Wang
,
Ming Li
A Novel Learnable Dictionary Encoding Layer for End-to-End Language Identification.
ICASSP
(2018)
Weicheng Cai
,
Jinkun Chen
,
Ming Li
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System.
CoRR
(2018)
Haiwei Wu
,
Ming Li
,
Zexin Cai
,
Haibin Zhong
Unsupervised query by example spoken term detection using features concatenated with Self-Organizing Map distances.
ISCSLP
(2018)
Jinkun Chen
,
Weicheng Cai
,
Danwei Cai
,
Zexin Cai
,
Haibin Zhong
,
Ming Li
End-to-end Language Identification using NetFV and NetVLAD.
ISCSLP
(2018)
Weicheng Cai
,
Jinkun Chen
,
Ming Li
Analysis of Length Normalization in End-to-End Speaker Verification System.
INTERSPEECH
(2018)
Weicheng Cai
,
Zexin Cai
,
Wenbo Liu
,
Xiaoqi Wang
,
Ming Li
Insights into End-to-End Learning Scheme for Language Identification.
CoRR
(2018)
Weicheng Cai
,
Jinkun Chen
,
Ming Li
Analysis of Length Normalization in End-to-End Speaker Verification System.
CoRR
(2018)
Zexin Cai
,
Xiaoyi Qin
,
Danwei Cai
,
Ming Li
,
Xinzhong Liu
,
Haibin Zhong
The DKU-JNU-EMA Electromagnetic Articulography Database on Mandarin and Chinese Dialects with Tandem Feature based Acoustic-to-Articulatory Inversion.
ISCSLP
(2018)
Kong-Yik Chee
,
Zhe Jin
,
Danwei Cai
,
Ming Li
,
Wun-She Yap
,
Yen-Lung Lai
,
Bok-Min Goi
Cancellable speech template via random binary orthogonal matrices projection hashing.
Pattern Recognit.
76 (2018)
Weicheng Cai
,
Zexin Cai
,
Xiang Zhang
,
Xiaoqi Wang
,
Ming Li
A Novel Learnable Dictionary Encoding Layer for End-to-End Language Identification.
CoRR
(2018)
Jinkun Chen
,
Weicheng Cai
,
Danwei Cai
,
Zexin Cai
,
Haibin Zhong
,
Ming Li
End-to-end Language Identification using NetFV and NetVLAD.
CoRR
(2018)
Weicheng Cai
,
Jinkun Chen
,
Ming Li
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System.
Odyssey
(2018)
Weicheng Cai
,
Danwei Cai
,
Wenbo Liu
,
Gang Li
,
Ming Li
Countermeasures for Automatic Speaker Verification Replay Spoofing Attack: On Data Augmentation, Feature Representation, Classification and Fusion.
INTERSPEECH
(2017)
Danwei Cai
,
Zhidong Ni
,
Wenbo Liu
,
Weicheng Cai
,
Gang Li
,
Ming Li
End-to-End Deep Learning Framework for Speech Paralinguistics Detection Based on Perception Aware Spectrum.
INTERSPEECH
(2017)
Wenbo Liu
,
Tianyan Zhou
,
Chenghao Zhang
,
Xiaobing Zou
,
Ming Li
Response to name: A dataset and a multimodal machine learning framework towards autism study.
ACII
(2017)
Weiyang Liu
,
Yandong Wen
,
Zhiding Yu
,
Ming Li
,
Bhiksha Raj
,
Le Song
SphereFace: Deep Hypersphere Embedding for Face Recognition.
CVPR
(2017)
Weiyang Liu
,
Yandong Wen
,
Zhiding Yu
,
Ming Li
,
Bhiksha Raj
,
Le Song
SphereFace: Deep Hypersphere Embedding for Face Recognition.
CoRR
(2017)
Ming Li
,
Luting Wang
,
Zhicheng Xu
,
Danwei Cai
Mandarin electrolaryngeal voice conversion with combination of Gaussian mixture model and non-negative matrix factorization.
APSIPA
(2017)
Ming Li
,
Lun Liu
,
Weicheng Cai
,
Wenbo Liu
Generalized I-vector Representation with Phonetic Tokenizations and Tandem Features for both Text Independent and Text Dependent Speaker Verification.
J. Signal Process. Syst.
82 (2) (2016)
Ming Li
,
Jangwon Kim
,
Adam C. Lammert
,
Prasanta Kumar Ghosh
,
Vikram Ramanarayanan
,
Shrikanth S. Narayanan
Speaker verification based on the fusion of speech acoustics and inverted articulatory signals.
Comput. Speech Lang.
36 (2016)
Huadi Zheng
,
Weicheng Cai
,
Tianyan Zhou
,
Shilei Zhang
,
Ming Li
Text-independent voice conversion using deep neural network based phonetic level features.
ICPR
(2016)
Tianyan Zhou
,
Weicheng Cai
,
Xiaoyan Chen
,
Xiaobing Zou
,
Shilei Zhang
,
Ming Li
Speaker diarization system for autism children's real-life audio data.
ISCSLP
(2016)
Danwei Cai
,
Weicheng Cai
,
Zhidong Ni
,
Ming Li
Locality sensitive discriminant analysis for speaker verification.
APSIPA
(2016)
Zhiding Yu
,
Weiyang Liu
,
Wenbo Liu
,
Yingzhen Yang
,
Ming Li
,
B. V. K. Vijaya Kumar
On Order-Constrained Transitive Distance Clustering.
AAAI
(2016)
Wenbo Liu
,
Li Yi
,
Zhiding Yu
,
Xiaobing Zou
,
Bhiksha Raj
,
Ming Li
Efficient autism spectrum disorder prediction with eye movement: A machine learning framework.
ACII
(2015)
Yingxue Wang
,
Shenghui Zhao
,
Wenbo Liu
,
Ming Li
,
Jingming Kuang
Speech bandwidth expansion based on deep neural networks.
INTERSPEECH
(2015)