INTERSPEECH
Publications
2023
Chihiro Taguchi, Yusuke Sakai, Parisa Haghani, David Chiang
Universal Automatic Phonetic Transcription into the International Phonetic Alphabet.
INTERSPEECH
(2023)
Hyungshin Ryu, Sunhee Kim, Minhwa Chung
A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning.
INTERSPEECH
(2023)
Mengao Zhang, Ke Xu, Hao Li, Lei Wang, Chengfang Fang, Jie Shi
DoubleDeceiver: Deceiving the Speaker Verification System Protected by Spoofing Countermeasures.
INTERSPEECH
(2023)
Xiaoheng Zhang, Yang Li
A Dual Attention-based Modality-Collaborative Fusion Network for Emotion Recognition.
INTERSPEECH
(2023)
Lantian Li, Xiaolou Li, Haoyu Jiang, Chen Chen, Ruihai Hou, Dong Wang
CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition.
INTERSPEECH
(2023)
Matthew Baas, Benjamin van Niekerk, Herman Kamper
Voice Conversion With Just Nearest Neighbors.
INTERSPEECH
(2023)
Aoi Ito, Shota Horiguchi
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model.
INTERSPEECH
(2023)
Jasmin Pöhnlein, Felicitas Kleber
The emergence of obstruent-intrinsic f0 and VOT as cues to the fortis/lenis contrast in West Central Bavarian.
INTERSPEECH
(2023)
Zhiheng Liao, Feifei Xiong, Juan Luo, Minjie Cai, Eng Siong Chng, Jinwei Feng, Xionghu Zhong
Blind Estimation of Room Impulse Response from Monaural Reverberant Speech with Segmental Generative Neural Network.
INTERSPEECH
(2023)
Wangyou Zhang, Yanmin Qian
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition.
INTERSPEECH
(2023)
Mana Ihori, Hiroshi Sato, Tomohiro Tanaka, Ryo Masumura, Saki Mizuno, Nobukatsu Hojo
Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model.
INTERSPEECH
(2023)
Minghan Wang, Yinglu Li, Jiaxin Guo, Xiaosong Qiao, Zongyao Li, Hengchao Shang, Daimeng Wei, Shimin Tao, Min Zhang, Hao Yang
WhiSLU: End-to-End Spoken Language Understanding with Whisper.
INTERSPEECH
(2023)
Tzu-Han Zoe Cheng, Kuan-Lin Chen, Juliane Schubert, Ya-Ping Chen, Tim Brown, John Iversen
Similar Hierarchical Representation of Speech and Other Complex Sounds In the Brain and Deep Residual Networks: An MEG Study.
INTERSPEECH
(2023)
Fuma Kurata, Mao Saeki, Shinya Fujie, Yoichi Matsuyama
Multimodal Turn-Taking Model Using Visual Cues for End-of-Utterance Prediction in Spoken Dialogue Systems.
INTERSPEECH
(2023)
Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Youngki Kwon, Minjae Lee, Bong-Jin Lee
Encoder-decoder Multimodal Speaker Change Detection.
INTERSPEECH
(2023)
Chu-Xiao Zuo, Jia-Yi Leng, Wu-Jun Li
Fooling Speaker Identification Systems with Adversarial Background Music.
INTERSPEECH
(2023)
Cong-Thanh Do, Rama Doddipatla, Mohan Li, Thomas Hain
Domain Adaptive Self-supervised Training of Automatic Speech Recognition.
INTERSPEECH
(2023)
Vladimir Kondratenko, Nikolay Karpov, Artem Sokolov, Nikita Savushkin, Oleg Kutuzov, Fyodor Minkin
Hybrid Dataset for Speech Emotion Recognition in Russian Language.
INTERSPEECH
(2023)
Denise Moussa, Germans Hirsch, Sebastian Wankerl, Christian Riess
Point to the Hidden: Exposing Speech Audio Splicing via Signal Pointer Nets.
INTERSPEECH
(2023)
Mohammad Arvan, A. Seza Dogruöz, Natalie Parde
Investigating Reproducibility at Interspeech Conferences: A Longitudinal and Comparative Perspective.
INTERSPEECH
(2023)
Yong-Hyeok Lee, Namhyun Cho
PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords.
INTERSPEECH
(2023)
Shu-Chuan Tseng, Yi-Fen Liu, Xiang-Li Lu
Model-assisted Lexical Tone Evaluation of three-year-old Chinese-speaking Children by also Considering Segment Production.
INTERSPEECH
(2023)
Hiuching Hung, Paula Andrea Pérez-Toro, Tomás Arias-Vergara, Andreas Maier, Elmar Nöth
Speaking Clearly, Understanding Better: Predicting the L2 Narrative Comprehension of Chinese Bilingual Kindergarten Children Based on Speech Intelligibility Using a Machine Learning Approach.
INTERSPEECH
(2023)
Vishwanath Pratap Singh, Md. Sahidullah, Tomi Kinnunen
Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech.
INTERSPEECH
(2023)
Hokuto Munakata, Ryu Takeda, Kazunori Komatani
Recursive Sound Source Separation with Deep Learning-based Beamforming for Unknown Number of Sources.
INTERSPEECH
(2023)
Shun Takahashi, Sakriani Sakti
Unsupervised Learning of Discrete Latent Representations with Data-Adaptive Dimensionality from Continuous Speech Streams.
INTERSPEECH
(2023)
Shashi Kant Gupta, Sushant Hiray, Prashant Kukde
Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech.
INTERSPEECH
(2023)
Sun-Kyung Lee, Jong-Hwan Kim
Video Multimodal Emotion Recognition System for Real World Applications.
INTERSPEECH
(2023)
Gaëlle Laperrière, Ha Nguyen, Sahar Ghannay, Bassam Jabaian, Yannick Estève
Semantic Enrichment Towards Efficient Speech Representations.
INTERSPEECH
(2023)
Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman
eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer.
INTERSPEECH
(2023)
Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator.
INTERSPEECH
(2023)
Vahid Ahmadi Kalkhorani, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang
Time-domain Transformer-based Audiovisual Speaker Separation.
INTERSPEECH
(2023)
Mísa Hejná, Adèle Jatteau
Aberystwyth English Pre-aspiration in Apparent Time.
INTERSPEECH
(2023)
Mahsa Kadkhodaei Elyaderani, Shahram Shirani
Sequence-to-Sequence Multi-Modal Speech In-Painting.
INTERSPEECH
(2023)
Ying Shi, Dong Wang, Lantian Li, Jiqing Han, Shi Yin
Spot Keywords From Very Noisy and Mixed Speech.
INTERSPEECH
(2023)
Delphine Charuau, Béatrice Vaxelaire, Rudolph Sock
Speech Breathing Behavior During Pauses in Children.
INTERSPEECH
(2023)
Ranzo Huang, Brian Mak
wav2vec 2.0 ASR for Cantonese-Speaking Older Adults in a Clinical Setting.
INTERSPEECH
(2023)
Soroosh Tayebi Arasteh, Cristian David Ríos-Urrego, Elmar Nöth, Andreas Maier, Seung Hee Yang, Jan Rusz, Juan Rafael Orozco-Arroyave
Federated Learning for Secure Development of AI Models for Parkinson's Disease Detection Using Speech from Different Languages.
INTERSPEECH
(2023)
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Shuai Zhang, Xun Chen
Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features.
INTERSPEECH
(2023)
Yuki Okamoto, Kanta Shimonishi, Keisuke Imoto, Kota Dohi, Shota Horiguchi, Yohei Kawaguchi
CAPTDURE: Captioned Sound Dataset of Single Sources.
INTERSPEECH
(2023)
Zhengyang Li, Chenwei Liang, Timo Lohrenz, Marvin Sach, Björn Möller, Tim Fingscheidt
An Efficient and Noise-Robust Audiovisual Encoder for Audiovisual Speech Recognition.
INTERSPEECH
(2023)
Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model.
INTERSPEECH
(2023)
Kaushal Santosh Bhogale, Sai Sundaresan, Abhigyan Raman, Tahir Javed, Mitesh M. Khapra, Pratyush Kumar
Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR.
INTERSPEECH
(2023)
Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath
How to Estimate Model Transferability of Pre-Trained Speech Models?
INTERSPEECH
(2023)
Oleg Rybakov, Phoenix Meadowlark, Shaojin Ding, David Qiu, Jian Li, David Rim, Yanzhang He
2-bit Conformer quantization for automatic speech recognition.
INTERSPEECH
(2023)
Wenxuan Wang, Guodong Ma, Yuke Li, Binbin Du
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition.
INTERSPEECH
(2023)
Caitlin Richter, Ragnar Pálsson, Luke O'Brien, Kolbrún Friðriksdóttir, Branislav Bédi, Eydís Huld Magnúsdóttir, Jón Guðnason
Orthography-based Pronunciation Scoring for Better CAPT Feedback.
INTERSPEECH
(2023)
Catarina Botelho, Alberto Abad, Tanja Schultz, Isabel Trancoso
Towards Reference Speech Characterization for Health Applications.
INTERSPEECH
(2023)
Lucía Gómez-Zaragozá, Simone Wills, Cristian Tejedor García, Javier Marín-Morales, Mariano Alcañiz, Helmer Strik
Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses.
INTERSPEECH
(2023)
Jared Sharp, Matthew Faytak, Hasutai Fei Xiong Liu
Coarticulation of Sibe Vowels and Dorsal Fricatives in Spontaneous Speech: An Acoustic Study.
INTERSPEECH
(2023)