Taja Kuzman
Publication Activity (10 Years)
Years Active: 2017-2023
Publications (10 Years): 8
Top Topics: Training Data
Top Venues: CoRR, VarDial@EACL, EAMT, LREC
Publications
Peter Rupnik, Taja Kuzman, Nikola Ljubesic
BENCHić-lang: A Benchmark for Discriminating between Bosnian, Croatian, Montenegrin and Serbian.
VarDial@EACL (2023)
Taja Kuzman, Igor Mozetic, Nikola Ljubesic
ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification.
CoRR (2023)
Taja Kuzman, Peter Rupnik, Nikola Ljubesic
Get to Know Your Parallel Data: Performing English Variety and Genre Classification over MaCoCu Corpora.
VarDial@EACL (2023)
Marta Bañón, Malina Chichirau, Miquel Esplà-Gomis, Mikel L. Forcada, Aarón Galiano Jiménez, Taja Kuzman, Nikola Ljubesic, Rik van Noord, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Peter Rupnik, Vít Suchomel, Antonio Toral, Jaume Zaragoza-Bernabeu
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages.
EAMT (2023)
Taja Kuzman, Peter Rupnik, Nikola Ljubesic
The GINCO Training Dataset for Web Genre Identification of Documents Out in the Wild.
CoRR (2022)
Marta Bañón, Miquel Esplà-Gomis, Mikel L. Forcada, Cristian García-Romero, Taja Kuzman, Nikola Ljubesic, Rik van Noord, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Peter Rupnik, Vít Suchomel, Antonio Toral, Tobias van der Werff, Jaume Zaragoza
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages.
EAMT (2022)
Taja Kuzman, Peter Rupnik, Nikola Ljubesic
The GINCO Training Dataset for Web Genre Identification of Documents Out in the Wild.
LREC (2022)
Polona Gantar, Simon Krek, Taja Kuzman
Verbal Multiword Expressions in Slovene.
Europhras (2017)