StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection.
Sara Papi
Marco Gaido
Matteo Negri
Luisa Bentivogli
Published in:
CoRR (2024)
Keyphrases
media streams
multimedia
machine translation
visual information
visual attention
neural network
real time
selection strategy
data sets
multi modal
streaming data
query translation
digital video
speaker identification
historical information