Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration.
Shigeki Karita, Nelson Enrique Yalta Soplin, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani
Published in: INTERSPEECH (2019)
Keyphrases
- speech recognition
- end-to-end
- language model
- language modeling
- pattern recognition
- speech signal
- n-gram
- probabilistic model
- automatic speech recognition
- document retrieval
- word error rate
- retrieval model
- mixture model
- image classification
- neural network
- feature vectors
- noisy environments
- text localization and recognition
- query expansion
- feature selection
- feature extraction
- test collection
- query terms
- classification accuracy
- machine learning
- hidden markov models
- image processing
- speech recognition systems
- model selection
- support vector machine
- feature space
- information retrieval systems
- relevance model
- text classification
- handwriting recognition
- speaker identification
- information retrieval