Improving Speech Translation Accuracy and Time Efficiency With Fine-Tuned wav2vec 2.0-Based Speech Segmentation.
Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2024)
Keyphrases
- speech recognition
- speech signal
- fine tuned
- computational efficiency
- spoken language
- text to speech
- automatic speech recognition
- prediction accuracy
- segmentation algorithm
- high accuracy
- recognition engine
- computational complexity
- broadcast news
- endpoint detection
- speech synthesis
- segmentation accuracy
- image segmentation
- error rate
- classification accuracy
- object segmentation
- shape prior
- dialogue system
- machine translation
- segmentation method
- human computer interaction
- multimodal interfaces
- level set
- edge detection
- computational cost