Vít Suchomel
Publication Activity (10 Years)
Years Active: 2018-2024
Publications (10 Years): 8
Top Topics
Language Model
Ontology Engineering
Website
Web Corpora
Top Venues
RASLAN
CoRR
EAMT
Int. J. Artif. Intell. Tools
Publications
Nikola Ljubesic, Vít Suchomel, Peter Rupnik, Taja Kuzman, Rik van Noord
Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining.
CoRR (2024)
Marta Bañón, Miquel Esplà-Gomis, Mikel L. Forcada, Cristian García-Romero, Taja Kuzman, Nikola Ljubesic, Rik van Noord, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Peter Rupnik, Vít Suchomel, Antonio Toral, Tobias van der Werff, Jaume Zaragoza
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages.
EAMT (2022)
Vít Suchomel, Jan Kraus
Semi-Manual Annotation of Topics and Genres in Web Corpora, The Cheap and Fast Way.
RASLAN (2022)
Vít Suchomel, Jan Kraus
Website Properties in Relation to the Quality of Text Extracted for Web Corpora.
RASLAN (2021)
Vít Suchomel
Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites.
RASLAN (2020)
Ales Horák, Vít Baisa, Adam Rambousek, Vít Suchomel
A New Approach for Semi-Automatic Building and Extending a Multilingual Terminology Thesaurus.
Int. J. Artif. Intell. Tools 28 (2) (2019)
Vít Suchomel
Discriminating Between Similar Languages Using Large Web Corpora.
RASLAN (2019)
Vít Suchomel
csTenTen17, a Recent Czech Web Corpus.
RASLAN (2018)