Master Theses
Human Language Technology
- Sidi Wang (2024) LLMs as annotators for machine translation quality estimation (thesis)
- Rorick Terlou (2023) Increasing Readability with Disfluency Removal in Automatic Dutch Transcriptions (thesis)
- Marcel Feteke (2022) Cross-lingual Transfer Using Stacked Language Adapters (thesis ♦ internship at https://www.taus.net/) cum laude
- Eliza Hobo (2022) Simply accessible: Contextualized Lexical Simplification for Accessibility of Dutch Texts (thesis ♦ internship at Amsterdam Intelligence team) cum laude
- Sanne Hoeken (2022) Using Language Models for Analyzing Semantic Variation between Dutch Social Communities (thesis) cum laude
- Yilmaz Polat (2022) The Hallucinatory World of Automatic Text Generation (thesis)
- Alessandra Polimeno (2022) Diversifying News Recommendation Systems by Detecting Fragmentation in News Story Chains (thesis)
- Charlotte Pouw (2022) Cross-lingual Transfer of Correlations between Linguistic Complexity and Human Reading Behaviour (thesis)
- Adrielli Lopez Rego (2022) Matching Ontologies in the Education Domain with Semantic Similarity (thesis ♦ internship at wizenoze)
- Vivian Claes (2021) ECBERT: Applying BERT to European Central Bank Communication to Predict Market Response (thesis ♦ internship at DNB)
- Søren K. Fomsgaard (2021) In the eye of the storm with style – Investigating style features in the language of QAnon on Twitter (thesis ♦ internship at TextGain)
- Sophie Neutel (2021) Towards automatic ontology alignment using BERT (thesis ♦ internship at TNO)
- Nathan van der Molen – Pater (2021) Information Usage in Coreference Resolution (thesis)
- András Aponyi (2020) Estimating Translation Quality Using Distributed Representations of Words and Sentences (thesis ♦ thesis github ♦ internship at https://www.taus.net/)
- Klaudia Bartosiak (2020) Towards Formalizing Eligibility Criteria of Clinical Trials: Biomedical Entity Linking (thesis not available ♦ thesis github ♦ internship at https://mytomorrows.com/)
- Suzana Bašic (2020) Color as a Discriminative Property for Establishing Object Identity in Human-Robot Communication (thesis not available ♦ thesis github ♦ research project: CLTL-make robots talk and think)
- Lauren Green (2020) Semi-supervised Classification of Occupations using Pseudo-Labelling and Information Extraction (thesis not available ♦ internship at: https://greple.de/)
- Ngan Nguyen (2020) Clickbait anatomy: Identifying clickbait with machine learning (thesis)
- Jonathan Schaller (2020) Cross-domain evaluation of a question-answering classifier (thesis not available)
- Lisa Vasileva (2020) Machine Translation Detection for Neural Machine Translation Scenario (thesis ♦ internship at https://www.taus.net/)
- Karen Goes (2019) Exploring text mining techniques to structure a digitised catalogue (thesis ♦ internship at: https://www.kb.nl/)
- Liza King (2018) Modals and Measles: Computational linguistic investigations into modal use in the vaccination debate (thesis)
- Benedetta Torsi (2018) Detecting claims in a cross-register corpus (thesis)
- Pia Sommerauer (2017) From old to new racism? Investigating known dangers in distributional semantic approaches to conceptual change (thesis)
- Chantal van Son (2015) Towards a Dutch frame-semantic parser (thesis ♦ research project: CLTL-newsreader)
- Femke Klaver (2014) Authorship attribution of forum posts (thesis ♦ internship at: TNO)
Text Mining
- Selin Açikel (2024) Lost in Translation: Analyzing Machine Translation Quality Estimation with Synthetic Challenges (thesis)
- Yijing Zhang (2024) Usage of Generative Models to Ask Follow-up Questions for Health Monitoring (thesis)
- Natalia Khaidanova (2023) Machine-Translation Evaluation: Comparing Traditional and Neural Machine-Translation Evaluation Metrics for English→Russian (thesis)
- Quincy Liem (2023) On the limits of entity linking on domain-specific data (thesis)
- Konstantina Andronikou (2022) Automatic Retrieval of Topics Using Topic Modeling Techniques from Customer Conversations in the Airline Domain (thesis ♦ internship at Underlined)
- Ellemijn Galjaard (2022) Evaluating Transfer of a Functional Level Classifier from Secondary to Primary Healthcare Notes (thesis ♦ internship at VU Medical center)
- Lahorka Nikolovski (2022) Synthetic Data for Domain Adaptation in Neural Machine Translation (thesis ♦ internship at www.taus.net/) cum laude
- Myrthe Buckens (2022) Comparing and Evaluating Language Models for Conversational Data from the Medical Domain. (thesis ♦ internship at Autoscriber)
- Michiel van Nederpelt (2022) Evaluating a transformer-based language model under increasingly challenging conditions for the task of offensive language detection (thesis)
- Sharona Badloe (2022) MedRoBERTa.nl: Transfer Learning From COVID-19 to Cancer Patients (thesis ♦ internship at VU Medical center)
- Shuyi Shen (2022) Data to text generation with a joint entity and relation based method for a job advertisement (thesis ♦ internship at TextMetrics)
- Tessel Wisman (2022) Domain adaptation of end-to-end ASR via n-gram language modelling. (thesis ♦ internship at Amberscript) cum laude
- Sylvia Pronk (2022) A detailed comparison between two coreference systems and their effect on key-sentence extraction (thesis ♦ internship at DNB)
- Mira Reisiger (2022) Context-based entity linking of biomedical text (thesis ♦ internship at Elsevier)
- Jingyue Zhang (2022) Mapping text to learning objectives: A keyword-based text classification method (thesis ♦ internship at Edia)
- Yan Chung Li (2022) A Challenge Set for Natural Language Inference on but-inferred propositions (thesis) cum laude
- Elena Weber (2022) Automatic Topic Classification of Customer Feedback in the Banking Domain (thesis ♦ internship at Underlined)
- Anouk Twilt (2022) Sustainability in action: exploring automatically extracting actions from news-articles (thesis)
- Lois Rink (2022) Automatic Classification of Speech Acts in tax service letters (thesis ♦ internship at Belastingdienst)
- Giorgio Malinverni (2022) Analysing the Influence of Morphological Characteristics on the Performance of Few-Shot Prompting for Natural Language Inference in Cross-Lingual Settings (thesis)
- Eva den Uijl (2021) Detecting Discriminatory Language in Job Advertising Texts (thesis ♦ internship at TextMetrics)
- Guido Ansem (2021) The Effect of Auxiliary Data on Low Resource Languages in Aspect Extraction (thesis)
- Michelle Chan (2021) An Empirical Framework for Topic Modelling for Dutch Texts based on Newspaper Articles on Soil Pollution
- Melisha Lemain – van der Nest (2021) Named Entity Recognition: identifying NER Indicators in Dutch Police Reports (thesis ♦ internship at CBS)
- Dyon van der Ende (2021) Text Mining for Sustainability: Detecting Corporate Greenwashing with the Sustainable Development Goals (thesis)
- Gabriele Catanese (2021) A Transfer Learning approach to Aspect Based Sentiment Analysis for airline customer feedbacks (thesis ♦ internship at Underlined) cum laude; nominated for the Faculty of Humanities thesis prize 2021
- Stan Frinking (2021) Using Text Mining Techniques to Detect Fall Events in Medical Patient Notes (thesis ♦ internship at VU Medical center)
- Jasmine van Vugt (2021) Two Dutch fine-tuned BERT models: Named Entity Recognition and Named Entity Linking to increase findability of local geographical information. (thesis ♦ internship at CBS)
- Sanne Hamersma (2021) Explorative analysis of precursors of physical aggression in a health care institute: a Text Mining approach (thesis ♦ internship at: GGZ)
- Aju Shreshta (2021) BERTje-based Automatic Anonymisation of Dutch Police Reports (thesis ♦ internship at: CBS)
- Breta Micha (2021) Automatic Terminology Extraction in domain specific texts: a comparison between a rule-based system and a BERT-based system. (thesis)
- Jan van Casteren (2020) Automatic Attribution Extraction From Dutch News Articles: A Beginning (thesis ♦ thesis github ♦ research at: eScience center – inside the filter bubble)
- Peter Caine (2020) Mind the gap: A comparison of linguistic vs deep-learning approaches to aspect extraction and aspect category detection (thesis ♦ thesis github)
- Luca Meima (2020) Finding potentially HIV defining conditions in medical reports (thesis ♦ thesis github ♦ internship at https://mytomorrows.com/)
- Eva Zegelaar (2020) An Automatic Emotion & Purpose Classifier for Dutch Tweets Written by Members of the Dutch Parliament (thesis ♦ thesis github ♦ internship at: https://reddata.nl/)