Continuous Sign Language Recognition Through Cross-Modal Alignment of Video and Text Embeddings in a Joint-Latent Space.
Ilias Papastratis, Kosmas Dimitropoulos, Dimitrios Konstantinidis, Petros Daras
Published in: IEEE Access (2020)
Keyphrases
- latent space
- low dimensional
- latent variables
- dimensionality reduction
- manifold learning
- generative model
- high dimensional
- feature space
- video sequences
- multi modal
- visual data
- sign language
- multimedia
- high dimensional data
- video data
- distance metric
- text retrieval
- gesture recognition
- information retrieval
- transfer learning
- text documents
- video frames
- text mining
- probabilistic model
- space time
- human activities
- distance measure
- keywords
- hand tracking