Yao Qian
Publication Activity (10 Years)
Years Active: 2001-2024
Publications (10 Years): 89
Top Topics
Hierarchically Structured
Speech Recognition
Spoken Language
Neural Network
Top Venues
CoRR
ICASSP
INTERSPEECH
ASRU
Publications
Leying Zhang
,
Yao Qian
,
Long Zhou
,
Shujie Liu
,
Dongmei Wang
,
Xiaofei Wang
,
Midia Yousefi
,
Yanmin Qian
,
Jinyu Li
,
Lei He
,
Sheng Zhao
,
Michael Zeng
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.
CoRR
(2024)
Shaoshi Ling
,
Yuxuan Hu
,
Shuangbei Qian
,
Guoli Ye
,
Yao Qian
,
Yifan Gong
,
Ed Lin
,
Michael Zeng
Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition.
ICASSP
(2024)
Haojie Tang
,
Gang Liu
,
Yao Qian
,
Jiebang Wang
,
Jinxin Xiong
EgeFusion: Towards Edge Gradient Enhancement in Infrared and Visible Image Fusion With Multi-Scale Transform.
IEEE Trans. Computational Imaging
10 (2024)
Jiacheng Wu
,
Gang Liu
,
Xiao Wang
,
Haojie Tang
,
Yao Qian
GAN-GA: infrared and visible image fusion generative adversarial network based on global awareness.
Appl. Intell.
54 (13-14) (2024)
Sanyuan Chen
,
Shujie Liu
,
Long Zhou
,
Yanqing Liu
,
Xu Tan
,
Jinyu Li
,
Sheng Zhao
,
Yao Qian
,
Furu Wei
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers.
CoRR
(2024)
Chenyang Le
,
Yao Qian
,
Dongmei Wang
,
Long Zhou
,
Shujie Liu
,
Xiaofei Wang
,
Midia Yousefi
,
Yanmin Qian
,
Jinyu Li
,
Sheng Zhao
,
Michael Zeng
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation.
CoRR
(2024)
Kaixin Li
,
Gang Liu
,
Xinjie Gu
,
Haojie Tang
,
Jinxin Xiong
,
Yao Qian
DANT-GAN: A dual attention-based of nested training network for infrared and visible image fusion.
Digit. Signal Process.
145 (2024)
Rui Chang
,
Gang Liu
,
Haojie Tang
,
Yao Qian
,
Jianchao Tang
RDGMEF: a multi-exposure image fusion framework based on Retinex decomposition and guided filter.
Neural Comput. Appl.
36 (20) (2024)
Mengliang Xing
,
Gang Liu
,
Haojie Tang
,
Yao Qian
,
Jun Zhang
CFNet: An infrared and visible image compression fusion network.
Pattern Recognit.
156 (2024)
Chenyang Le
,
Yao Qian
,
Long Zhou
,
Shujie Liu
,
Yanmin Qian
,
Michael Zeng
,
Xuedong Huang
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation.
NeurIPS
(2023)
Chenda Li
,
Yao Qian
,
Zhuo Chen
,
Naoyuki Kanda
,
Dongmei Wang
,
Takuya Yoshioka
,
Yanmin Qian
,
Michael Zeng
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers.
CoRR
(2023)
Heming Wang
,
Yao Qian
,
Hemin Yang
,
Naoyuki Kanda
,
Peidong Wang
,
Takuya Yoshioka
,
Xiaofei Wang
,
Yiming Wang
,
Shujie Liu
,
Zhuo Chen
,
DeLiang Wang
,
Michael Zeng
DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks.
ICASSP
(2023)
Chenda Li
,
Yao Qian
,
Zhuo Chen
,
Naoyuki Kanda
,
Dongmei Wang
,
Takuya Yoshioka
,
Yanmin Qian
,
Michael Zeng
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers.
INTERSPEECH
(2023)
Haibin Yu
,
Yuxuan Hu
,
Yao Qian
,
Ma Jin
,
Linquan Liu
,
Shujie Liu
,
Yu Shi
,
Yanmin Qian
,
Edward Lin
,
Michael Zeng
Code-Switching Text Generation and Injection in Mandarin-English ASR.
ICASSP
(2023)
Leying Zhang
,
Yao Qian
,
Linfeng Yu
,
Heming Wang
,
Xinkai Wang
,
Hemin Yang
,
Long Zhou
,
Shujie Liu
,
Yanmin Qian
,
Michael Zeng
Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction.
CoRR
(2023)
Chenda Li
,
Yao Qian
,
Zhuo Chen
,
Dongmei Wang
,
Takuya Yoshioka
,
Shujie Liu
,
Yanmin Qian
,
Michael Zeng
Target Sound Extraction with Variable Cross-Modality Clues.
ICASSP
(2023)
Haibin Yu
,
Yuxuan Hu
,
Yao Qian
,
Ma Jin
,
Linquan Liu
,
Shujie Liu
,
Yu Shi
,
Yanmin Qian
,
Edward Lin
,
Michael Zeng
Code-Switching Text Generation and Injection in Mandarin-English ASR.
CoRR
(2023)
Ziyi Yang
,
Yuwei Fang
,
Chenguang Zhu
,
Reid Pryzant
,
Dongdong Chen
,
Yu Shi
,
Yichong Xu
,
Yao Qian
,
Mei Gao
,
Yi-Ling Chen
,
Liyang Lu
,
Yujia Xie
,
Robert Gmyr
,
Noel Codella
,
Naoyuki Kanda
,
Bin Xiao
,
Lu Yuan
,
Takuya Yoshioka
,
Michael Zeng
,
Xuedong Huang
i-Code: An Integrative and Composable Multimodal Learning Framework.
AAAI
(2023)
Ziyi Yang
,
Mahmoud Khademi
,
Yichong Xu
,
Reid Pryzant
,
Yuwei Fang
,
Chenguang Zhu
,
Dongdong Chen
,
Yao Qian
,
Mei Gao
,
Yi-Ling Chen
,
Robert Gmyr
,
Naoyuki Kanda
,
Noel Codella
,
Bin Xiao
,
Yu Shi
,
Lu Yuan
,
Takuya Yoshioka
,
Michael Zeng
,
Xuedong Huang
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
CoRR
(2023)
Yuwei Fang
,
Mahmoud Khademi
,
Chenguang Zhu
,
Ziyi Yang
,
Reid Pryzant
,
Yichong Xu
,
Yao Qian
,
Takuya Yoshioka
,
Lu Yuan
,
Michael Zeng
,
Xuedong Huang
i-Code Studio: A Configurable and Composable Framework for Integrative AI.
CoRR
(2023)
Chenyang Le
,
Yao Qian
,
Long Zhou
,
Shujie Liu
,
Michael Zeng
,
Xuedong Huang
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation.
CoRR
(2023)
Chenda Li
,
Yao Qian
,
Zhuo Chen
,
Dongmei Wang
,
Takuya Yoshioka
,
Shujie Liu
,
Yanmin Qian
,
Michael Zeng
Target Sound Extraction with Variable Cross-modality Clues.
CoRR
(2023)
Zhengyang Chen
,
Yao Qian
,
Bing Han
,
Yanmin Qian
,
Michael Zeng
A comprehensive study on self-supervised distillation for speaker representation learning.
CoRR
(2022)
Zhengyang Chen
,
Sanyuan Chen
,
Yu Wu
,
Yao Qian
,
Chengyi Wang
,
Shujie Liu
,
Yanmin Qian
,
Michael Zeng
Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification.
ICASSP
(2022)
Junyi Ao
,
Ziqiang Zhang
,
Long Zhou
,
Shujie Liu
,
Haizhou Li
,
Tom Ko
,
Lirong Dai
,
Jinyu Li
,
Yao Qian
,
Furu Wei
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data.
INTERSPEECH
(2022)
Mostafa Karimi
,
Changliang Liu
,
Ken'ichi Kumatani
,
Yao Qian
,
Tianyu Wu
,
Jian Wu
Deploying self-supervised learning in the wild for hybrid automatic speech recognition.
CoRR
(2022)
Ziyi Yang
,
Yuwei Fang
,
Chenguang Zhu
,
Reid Pryzant
,
Dongdong Chen
,
Yu Shi
,
Yichong Xu
,
Yao Qian
,
Mei Gao
,
Yi-Ling Chen
,
Liyang Lu
,
Yujia Xie
,
Robert Gmyr
,
Noel Codella
,
Naoyuki Kanda
,
Bin Xiao
,
Lu Yuan
,
Takuya Yoshioka
,
Michael Zeng
,
Xuedong Huang
i-Code: An Integrative and Composable Multimodal Learning Framework.
CoRR
(2022)
Heming Wang
,
Yao Qian
,
Xiaofei Wang
,
Yiming Wang
,
Chengyi Wang
,
Shujie Liu
,
Takuya Yoshioka
,
Jinyu Li
,
DeLiang Wang
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction.
ICASSP
(2022)
Junyi Ao
,
Ziqiang Zhang
,
Long Zhou
,
Shujie Liu
,
Haizhou Li
,
Tom Ko
,
Lirong Dai
,
Jinyu Li
,
Yao Qian
,
Furu Wei
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data.
CoRR
(2022)
Junyi Ao
,
Rui Wang
,
Long Zhou
,
Chengyi Wang
,
Shuo Ren
,
Yu Wu
,
Shujie Liu
,
Tom Ko
,
Qing Li
,
Yu Zhang
,
Zhihua Wei
,
Yao Qian
,
Jinyu Li
,
Furu Wei
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing.
ACL (1)
(2022)
Sanyuan Chen
,
Chengyi Wang
,
Zhengyang Chen
,
Yu Wu
,
Shujie Liu
,
Zhuo Chen
,
Jinyu Li
,
Naoyuki Kanda
,
Takuya Yoshioka
,
Xiong Xiao
,
Jian Wu
,
Long Zhou
,
Shuo Ren
,
Yanmin Qian
,
Yao Qian
,
Jian Wu
,
Michael Zeng
,
Xiangzhan Yu
,
Furu Wei
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.
IEEE J. Sel. Top. Signal Process.
16 (6) (2022)
Gang Liu
,
Tianyan Zhou
,
Yong Zhao
,
Yu Wu
,
Zhuo Chen
,
Yao Qian
,
Jian Wu
The Microsoft System for VoxCeleb Speaker Recognition Challenge 2022.
CoRR
(2022)
Chengyi Wang
,
Yu Wu
,
Sanyuan Chen
,
Shujie Liu
,
Jinyu Li
,
Yao Qian
,
Zhenglu Yang
Improving Self-Supervised Learning for Speech Recognition with Intermediate Layer Supervision.
ICASSP
(2022)
Wei Wang
,
Shuo Ren
,
Yao Qian
,
Shujie Liu
,
Yu Shi
,
Yanmin Qian
,
Michael Zeng
Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding.
ICASSP
(2022)
Sanyuan Chen
,
Yu Wu
,
Chengyi Wang
,
Zhengyang Chen
,
Zhuo Chen
,
Shujie Liu
,
Jian Wu
,
Yao Qian
,
Furu Wei
,
Jinyu Li
,
Xiangzhan Yu
Unispeech-Sat: Universal Speech Representation Learning With Speaker Aware Pre-Training.
ICASSP
(2022)
Zhengyang Chen
,
Yao Qian
,
Bing Han
,
Yanmin Qian
,
Michael Zeng
A Comprehensive Study on Self-Supervised Distillation for Speaker Representation Learning.
SLT
(2022)
Yiming Wang
,
Jinyu Li
,
Heming Wang
,
Yao Qian
,
Chengyi Wang
,
Yu Wu
Wav2vec-Switch: Contrastive Learning from Original-Noisy Speech Pairs for Robust Speech Recognition.
ICASSP
(2022)
Rimita Lahiri
,
Ken'ichi Kumatani
,
Eric Sun
,
Yao Qian
Multilingual Speech Recognition using Knowledge Transfer across Learning Processes.
CoRR
(2021)
Yao Qian
,
Ximo Bian
,
Yu Shi
,
Naoyuki Kanda
,
Leo Shen
,
Zhen Xiao
,
Michael Zeng
Speech-Language Pre-Training for End-to-End Spoken Language Understanding.
ICASSP
(2021)
Chengyi Wang
,
Yu Wu
,
Yao Qian
,
Ken'ichi Kumatani
,
Shujie Liu
,
Furu Wei
,
Michael Zeng
,
Xuedong Huang
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data.
CoRR
(2021)
Xinhao Wang
,
Keelan Evanini
,
Yao Qian
,
Matthew Mulholland
Automated Scoring of Spontaneous Speech from Young Learners of English Using Transformers.
SLT
(2021)
Zhengyang Chen
,
Sanyuan Chen
,
Yu Wu
,
Yao Qian
,
Chengyi Wang
,
Shujie Liu
,
Yanmin Qian
,
Michael Zeng
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification.
CoRR
(2021)
Sanyuan Chen
,
Chengyi Wang
,
Zhengyang Chen
,
Yu Wu
,
Shujie Liu
,
Zhuo Chen
,
Jinyu Li
,
Naoyuki Kanda
,
Takuya Yoshioka
,
Xiong Xiao
,
Jian Wu
,
Long Zhou
,
Shuo Ren
,
Yanmin Qian
,
Yao Qian
,
Jian Wu
,
Michael Zeng
,
Furu Wei
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.
CoRR
(2021)
Sanyuan Chen
,
Yu Wu
,
Chengyi Wang
,
Zhengyang Chen
,
Zhuo Chen
,
Shujie Liu
,
Jian Wu
,
Yao Qian
,
Furu Wei
,
Jinyu Li
,
Xiangzhan Yu
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training.
CoRR
(2021)
Wei Wang
,
Shuo Ren
,
Yao Qian
,
Shujie Liu
,
Yu Shi
,
Yanmin Qian
,
Michael Zeng
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding.
CoRR
(2021)
Yao Qian
,
Ximo Bian
,
Yu Shi
,
Naoyuki Kanda
,
Leo Shen
,
Zhen Xiao
,
Michael Zeng
Speech-language Pre-training for End-to-end Spoken Language Understanding.
CoRR
(2021)
Heming Wang
,
Yao Qian
,
Xiaofei Wang
,
Yiming Wang
,
Chengyi Wang
,
Shujie Liu
,
Takuya Yoshioka
,
Jinyu Li
,
DeLiang Wang
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction.
CoRR
(2021)
Chengyi Wang
,
Yu Wu
,
Yao Qian
,
Ken'ichi Kumatani
,
Shujie Liu
,
Furu Wei
,
Michael Zeng
,
Xuedong Huang
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data.
ICML
(2021)
Yiming Wang
,
Jinyu Li
,
Heming Wang
,
Yao Qian
,
Chengyi Wang
,
Yu Wu
Wav2vec-Switch: Contrastive Learning from Original-noisy Speech Pairs for Robust Speech Recognition.
CoRR
(2021)
Chengyi Wang
,
Yu Wu
,
Sanyuan Chen
,
Shujie Liu
,
Jinyu Li
,
Yao Qian
,
Zhenglu Yang
Self-Supervised Learning for speech recognition with Intermediate layer supervision.
CoRR
(2021)
Ying Qin
,
Yao Qian
,
Anastassia Loukina
,
Patrick L. Lange
,
Abhinav Misra
,
Keelan Evanini
,
Tan Lee
Automatic Detection of Word-Level Reading Errors in Non-native English Speech Based on ASR Output.
ISCSLP
(2021)
Junyi Ao
,
Rui Wang
,
Long Zhou
,
Shujie Liu
,
Shuo Ren
,
Yu Wu
,
Tom Ko
,
Qing Li
,
Yu Zhang
,
Zhihua Wei
,
Yao Qian
,
Jinyu Li
,
Furu Wei
SpeechT5: Unified-Modal Encoder-Decoder Pre-training for Spoken Language Processing.
CoRR
(2021)
Yao Qian
,
Yu Shi
,
Michael Zeng
Discriminative Transfer Learning for Optimizing ASR and Semantic Labeling in Task-Oriented Spoken Dialog.
INTERSPEECH
(2020)
Yao Qian
,
Rutuja Ubale
,
Patrick L. Lange
,
Keelan Evanini
,
Vikram Ramanarayanan
,
Frank K. Soong
Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications.
J. Signal Process. Syst.
92 (8) (2020)
Xinhao Wang
,
Keelan Evanini
,
Matthew Mulholland
,
Yao Qian
,
James V. Bruno
Application of an Automatic Plagiarism Detection System in a Large-scale Assessment of English Speaking Proficiency.
BEA@ACL
(2019)
Xinhao Wang
,
Keelan Evanini
,
Yao Qian
,
Klaus Zechner
Using Very Deep Convolutional Neural Networks to Automatically Detect Plagiarized Spoken Responses.
ASRU
(2019)
Peng Cao
,
Yao Qian
,
Pan Xue
,
Danzhu Lu
,
Jie He
,
Zhiliang Hong
A Bipolar-Input Thermoelectric Energy-Harvesting Interface With Boost/Flyback Hybrid Converter and On-Chip Cold Starter.
IEEE J. Solid State Circuits
54 (12) (2019)
Rutuja Ubale
,
Vikram Ramanarayanan
,
Yao Qian
,
Keelan Evanini
,
Chee Wee Leong
,
Chong Min Lee
Native Language Identification from Raw Waveforms Using Deep Convolutional Neural Networks with Attentive Pooling.
ASRU
(2019)
Chee Wee Leong
,
Katrina Roohr
,
Vikram Ramanarayanan
,
Michelle P. Martin-Raugh
,
Harrison Kell
,
Rutuja Ubale
,
Yao Qian
,
Zydrune Mladineo
,
Laura McCulla
To Trust, or Not to Trust? A Study of Human Bias in Automated Video Interview Assessments.
CoRR
(2019)
Xinhao Wang
,
Su-Youn Yoon
,
Keelan Evanini
,
Klaus Zechner
,
Yao Qian
Automatic Detection of Off-Topic Spoken Responses Using Very Deep Convolutional Neural Networks.
INTERSPEECH
(2019)
Anastassia Loukina
,
Beata Beigman Klebanov
,
Patrick L. Lange
,
Yao Qian
,
Binod Gyawali
,
Nitin Madnani
,
Abhinav Misra
,
Klaus Zechner
,
Zuowei Wang
,
John Sabatini
Automated Estimation of Oral Reading Fluency During Summer Camp e-Book Reading with MyTurnToRead.
INTERSPEECH
(2019)
Chee Wee Leong
,
Katrina Roohr
,
Vikram Ramanarayanan
,
Michelle P. Martin-Raugh
,
Harrison Kell
,
Rutuja Ubale
,
Yao Qian
,
Zydrune Mladineo
,
Laura McCulla
Are Humans Biased in Assessment of Video Interviews?
ICMI (Adjunct)
(2019)
Vikram Ramanarayanan
,
Matthew Mulholland
,
Yao Qian
Scoring Interactional Aspects of Human-Machine Dialog for Language Learning and Assessment using Text Features.
SIGdial
(2019)
Peng Cao
,
Yao Qian
,
Pan Xue
,
Danzhu Lu
,
Jie He
,
Zhiliang Hong
An 84% Peak Efficiency Bipolar-Input Boost/Flyback Hybrid Converter With MPPT and on-Chip Cold Starter for Thermoelectric Energy Harvesting.
ISSCC
(2019)
Yao Qian
,
Patrick L. Lange
,
Keelan Evanini
,
Robert A. Pugh
,
Rutuja Ubale
,
Matthew Mulholland
,
Xinhao Wang
Neural Approaches to Automated Speech Scoring of Monologue and Dialogue Responses.
ICASSP
(2019)
Keelan Evanini
,
Matthew Mulholland
,
Rutuja Ubale
,
Yao Qian
,
Robert A. Pugh
,
Vikram Ramanarayanan
,
Aoife Cahill
Improvements to an Automated Content Scoring System for Spoken CALL Responses: the ETS Submission to the Second Spoken CALL Shared Task.
INTERSPEECH
(2018)
Yao Qian
,
Danzhu Lu
,
Jie He
,
Zhiliang Hong
An On-Chip Transformer-Based Self-Startup Hybrid SIDITO Converter for Thermoelectric Energy Harvesting.
IEEE Trans. Circuits Syst. II Express Briefs
(11) (2018)
Rutuja Ubale
,
Yao Qian
,
Keelan Evanini
Exploring End-To-End Attention-Based Neural Networks For Native Language Identification.
SLT
(2018)
Yao Qian
,
Rutuja Ubale
,
Matthew Mulholland
,
Keelan Evanini
,
Xinhao Wang
A Prompt-Aware Neural Network Approach to Content-Based Scoring of Non-Native Spontaneous Speech.
SLT
(2018)
Zhaoheng Ni
,
Rutuja Ubale
,
Yao Qian
,
Michael I. Mandel
,
Su-Youn Yoon
,
Abhinav Misra
,
David Suendermann-Oeft
Unusable Spoken Response Detection with BLSTM Neural Networks.
ISCSLP
(2018)
Yao Qian
,
Rutuja Ubale
,
Patrick L. Lange
,
Keelan Evanini
,
Frank K. Soong
From Speech Signals to Semantics - Tagging Performance at Acoustic, Phonetic and Word Levels.
ISCSLP
(2018)
Vikram Ramanarayanan
,
Robert Pugh
,
Yao Qian
,
David Suendermann-Oeft
Automatic Turn-Level Language Identification for Code-Switched Spanish-English Dialog.
IWSDS
(2018)
Lei Chen
,
Jidong Tao
,
Shabnam Ghaffarzadegan
,
Yao Qian
End-to-End Neural Network Based Automated Speech Scoring.
ICASSP
(2018)
Shervin Malmasi
,
Keelan Evanini
,
Aoife Cahill
,
Joel R. Tetreault
,
Robert A. Pugh
,
Christopher Hamill
,
Diane Napolitano
,
Yao Qian
A Report on the 2017 Native Language Identification Shared Task.
BEA@EMNLP
(2017)
Yao Qian
,
Rutuja Ubale
,
Vikram Ramanarayanan
,
Patrick L. Lange
,
David Suendermann-Oeft
,
Keelan Evanini
,
Eugene Tsuprun
Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system.
ASRU
(2017)
Yao Qian
,
Hongguang Zhang
,
Yanqin Chen
,
Yajie Qin
,
Danzhu Lu
,
Zhiliang Hong
A SIDIDO DC-DC Converter With Dual-Mode and Programmable-Capacitor-Array MPPT Control for Thermoelectric Energy Harvesting.
IEEE Trans. Circuits Syst. II Express Briefs
(8) (2017)
Yao Qian
,
Keelan Evanini
,
Patrick L. Lange
,
Robert A. Pugh
,
Rutuja Ubale
,
Frank K. Soong
Improving native language (L1) identification with better VAD and TDNN trained separately on native and non-native English corpora.
ASRU
(2017)
Keelan Evanini
,
Matthew Mulholland
,
Eugene Tsuprun
,
Yao Qian
Using an Automated Content Scoring Engine for Spoken CALL Responses: The ETS submission for the Spoken CALL Challenge.
SLaTE
(2017)
Anastassia Loukina
,
Beata Beigman Klebanov
,
Patrick L. Lange
,
Binod Gyawali
,
Yao Qian
Developing speech processing technologies for shared book reading with a computer.
WOCCI
(2017)
Yao Qian
,
Keelan Evanini
,
Xinhao Wang
,
Chong Min Lee
,
Matthew Mulholland
Bidirectional LSTM-RNN for Improving Automated Assessment of Non-Native Children's Speech.
INTERSPEECH
(2017)
Yao Qian
,
Keelan Evanini
,
Xinhao Wang
,
David Suendermann-Oeft
,
Robert A. Pugh
,
Patrick L. Lange
,
Hillary R. Molloy
,
Frank K. Soong
Improving Sub-Phone Modeling for Better Native Language Identification with Non-Native English Speech.
INTERSPEECH
(2017)
Yao Qian
,
Xinhao Wang
,
Keelan Evanini
,
David Suendermann-Oeft
Self-Adaptive DNN for Improving Spoken Language Proficiency Assessment.
INTERSPEECH
(2016)
Yuchen Fan
,
Yao Qian
,
Frank K. Soong
,
Lei He
Speaker and language factorization in DNN-based TTS synthesis.
ICASSP
(2016)
Yao Qian
,
Jidong Tao
,
David Suendermann-Oeft
,
Keelan Evanini
,
Alexei V. Ivanov
,
Vikram Ramanarayanan
Noise and Metadata Sensitive Bottleneck Features for Improving Speaker Recognition with Non-Native Speech Input.
INTERSPEECH
(2016)
Peilu Wang
,
Yao Qian
,
Frank K. Soong
,
Lei He
,
Hai Zhao
Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network.
HLT-NAACL
(2016)
Xiang Yin
,
Ming Lei
,
Yao Qian
,
Frank K. Soong
,
Lei He
,
Zhen-Hua Ling
,
Li-Rong Dai
Modeling F0 trajectories in hierarchically structured deep neural networks.
Speech Commun.
76 (2016)
Matthew Mulholland
,
Melissa Lopez
,
Keelan Evanini
,
Anastassia Loukina
,
Yao Qian
A comparison of ASR and human errors for transcription of non-native spontaneous speech.
ICASSP
(2016)
Yao Qian
,
Xinhao Wang
,
Keelan Evanini
,
David Suendermann-Oeft
Improving DNN-Based Automatic Recognition of Non-native Children Speech with Adult Speech.
WOCCI
(2016)
Yuchen Fan
,
Yao Qian
,
Frank K. Soong
,
Lei He
Unsupervised speaker adaptation for DNN-based TTS synthesis.
ICASSP
(2016)
Peilu Wang
,
Yao Qian
,
Frank K. Soong
,
Lei He
,
Hai Zhao
Word embedding for recurrent neural network based TTS synthesis.
ICASSP
(2015)
Wenping Hu
,
Yao Qian
,
Frank K. Soong
,
Yong Wang
Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers.
Speech Commun.
67 (2015)
Peilu Wang
,
Yao Qian
,
Frank K. Soong
,
Lei He
,
Hai Zhao
A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding.
CoRR
(2015)
Zhou Yu
,
Vikram Ramanarayanan
,
David Suendermann-Oeft
,
Xinhao Wang
,
Klaus Zechner
,
Lei Chen
,
Jidong Tao
,
Aliaksei Ivanou
,
Yao Qian
Using bidirectional LSTM recurrent neural networks to learn high-level abstractions of sequential features for automated scoring of non-native spontaneous speech.
ASRU
(2015)
Yuchen Fan
,
Yao Qian
,
Frank K. Soong
,
Lei He
Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis.
ICASSP
(2015)
Peilu Wang
,
Yao Qian
,
Frank K. Soong
,
Lei He
,
Hai Zhao
Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network.
CoRR
(2015)
Yuchen Fan
,
Yao Qian
,
Frank K. Soong
,
Lei He
Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis.
INTERSPEECH
(2015)
Wenping Hu
,
Yao Qian
,
Frank K. Soong
An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners' speech.
SLaTE
(2015)
Yuchen Fan
,
Yao Qian
,
Feng-Long Xie
,
Frank K. Soong
TTS synthesis with bidirectional LSTM based recurrent neural networks.
INTERSPEECH
(2014)
Yao Qian
,
Yuchen Fan
,
Wenping Hu
,
Frank K. Soong
On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis.
ICASSP
(2014)
Changqin Quan
,
Yao Qian
,
Fuji Ren
Dynamic facial expression recognition based on K-order emotional intensity model.
ROBIO
(2014)