CLTL

Master Theses

Research Master Human Language Technology

Sidi Wang, 2024. LLMs as annotators for machine translation quality estimation (pdf)
Yee Man Ng, 2024. A Comparative Study of Open-Source and Closed-Source Large Language Models for Native Language Identification (pdf)
Agnieszka Kluska, 2023. Adapting Microportrait Extraction for Queer Stereotype Identification in Polish Online News (pdf)
Bas Diender, 2023. Random Seed Influence on Language Model Generalizability (pdf)
Dorien Renting, 2023. Multi-task fine-tuning for hate speech detection (pdf)
Marije Brandsma, 2023. Decoding Populism: Analyzing Lexical Choice and Linguistic Simplicity in Tweets (pdf)
Mekselina Doğanç, 2023. Automatic Generation of Personalized Counter Narratives Based on User Profile (pdf)
Mojca Kloos, 2023. Mitigating Gender Bias with Deep Reinforcement Learning (pdf)
Rorick Terlou, 2023. Increasing Readability with Disfluency Removal in Automatic Dutch Transcriptions (pdf)
Vasiliki Kyrmanidi, 2023. Exploring the Impact of Structured Dialogue Representation on Neural Dialogue Response Generation (pdf)
Adrielli Lopez Rego, 2022. Matching Ontologies in the Education Domain with Semantic Similarity
Alessandra Polimeno, 2022. Diversifying News Recommendation Systems by Detecting Fragmentation in News Story Chains (pdf)
Charlotte Pouw, 2022. Cross-lingual Transfer of Correlations between Linguistic Complexity and Human Reading Behaviour (pdf)
Eliza Hobo, 2022. Simply accessible: Contextualized Lexical Simplification for Accessibility of Dutch Texts
Marcel Feteke, 2022. Cross-lingual Transfer Using Stacked Language Adapters
Sanne Hoeken, 2022. Using Language Models for Analyzing Semantic Variation between Dutch Social Communities (pdf)
Yilmaz Polat, 2022. The Hallucinatory World of Automatic Text Generation
Nathan van der Molen-Pater, 2021. Information Usage in Coreference Resolution
Sophie Neutel, 2021. Towards automatic ontology alignment using BERT
Søren K. Fomsgaard, 2021. In the eye of the storm with style – Investigating style features in the language of QAnon on Twitter
Vivian Claes, 2021. ECBERT: Applying BERT to European Central Bank Communication to Predict Market Response
András Aponyi, 2020. Estimating Translation Quality Using Distributed Representations of Words and Sentences
Jonathan Schaller, 2020. Cross-domain evaluation of a question-answering classifier
Klaudia Bartosiak, 2020. Towards Formalizing Eligibility Criteria of Clinical Trials: Biomedical Entity Linking
Lauren Green, 2020. Semi-supervised Classification of Occupations using Pseudo-Labelling and Information Extraction (pdf)
Lisa Vasileva, 2020. Machine Translation Detection for Neural Machine Translation Scenario
Ngan Nguyen, 2020. Clickbait anatomy: Identifying clickbait with machine learning
Suzana Bašic, 2020. Color as a Discriminative Property for Establishing Object Identity in Human-Robot Communication
Karen Goes, 2019. Exploring text mining techniques to structure a digitised catalogue
Benedetta Torsi, 2018. Detecting claims in a cross-register corpus
Liza King, 2018. Modals and Measles: Computational linguistic investigations into modal use in the vaccination debate
Pia Sommerauer, 2017. From old to new racism? Investigating known dangers in distributional semantic approaches to conceptual change
Chantal van Son, 2015. Towards a Dutch frame-semantic parser
Femke Klaver, 2014. Authorship attribution of forum posts

Master Language and AI (formerly Text Mining)

Furong Zou, 2024. Exploring An Existing ASR Model for a Binary Classification of Intelligibility on MOOC English Speech Data (pdf)
Irma Tuinenga, 2024. Words Made Easy: a Comparative Study of Methods for English Lexical Simplification (pdf)
Chuqiao Guo, 2024. Extracting Activity Information with LLMs Using GPT-Generated Data (pdf)
Alyssa MacGregor-Hastie, 2024. Chats, Agents and Lyrics (pdf)
Long Ma, 2024. Chinese Healthcare Named Entity Recognition (CHNER) Using BiLSTM-CRF Classifiers (pdf)
Selin Açikel, 2024. Lost in Translation: Analyzing Machine Translation Quality Estimation with Synthetic Challenges (pdf)
Yijing Zhang, 2024. Usage of Generative Models to Ask Follow-up Questions for Health Monitoring (pdf)
Csenge Szabó, 2024. Multi-Label Topic Classification of Client Feedback in the Governance Domain (pdf)
Payam Fakhraei, 2024. Context-Aware Hate Speech Detection using BERT: An Investigation with the Contextual Abuse Dataset (pdf)
Murat Ertas, 2024. Improving Medical Text Classifiers with Balanced Datasets (pdf)
Adam Tucker, 2023. An investigation of complex word identification (CWI) systems for English (pdf)
Cecilia Schramm, 2023. Using Semi-supervised Learning to Automatically Annotate Dutch Medical Notes for Patients’ Functioning Levels (pdf)
Hasan Shahoud, 2023. Discovering Hidden Cues using TF-IDF and their Relevance on Cultural Inter-dependency (pdf)
Ajda Efendi, 2023. Document Classification on EQF levels with Multilingual datasets in English (pdf)
Saloni Singh, 2023. Leveraging university curricula and course descriptions to augment a knowledge graph with degree-skill relationships (pdf)
Natalia Khaidanova, 2023. Machine-Translation Evaluation: Comparing Traditional and Neural Machine-Translation Evaluation Metrics for English→Russian (pdf)
Noah-Manuel Michael, 2023. Automated Verb Order Error Detection for Learners of Dutch as a Second Language (pdf)
Quincy Liem, 2023. On the limits of entity linking on domain-specific data (pdf)
Siti Nurhalimah, 2023. Enhancing Wordnet Bahasa through Multilingual Sense Intersection (pdf)
Sofia Lee, 2023. Incident in Zagreb: self-supervised task adaptation performed: Impact of Task Adapting on Transformer Models for Targeted Sentiment Analysis in Croatian Headlines (pdf)
Swarupa Hardikar, 2023. Exploring Open-source Generative Models for Lexical Simplification through Prompt Learning (pdf)
Anouk Twilt, 2022. Sustainability in action: exploring automatically extracting actions from news-articles
Elena Weber, 2022. Automatic Topic Classification of Customer Feedback in the Banking Domain (pdf)
Ellemijn Galjaard, 2022. Evaluating Transfer of a Functional Level Classifier from Secondary to Primary Healthcare Notes
Felix den Heijer, 2022. NER Classification for old and modern Dutch biographies: A comparative study of finetuned BERT models and out-of-the-box tools (pdf)
Giorgio Malinverni, 2022. Analysing the Influence of Morphological Characteristics on the Performance of Few-Shot Prompting for Natural Language Inference in Cross-Lingual Settings
Jingyue Zhang, 2022. Mapping text to learning objectives: A keyword-based text classification method
Konstantina Andronikou, 2022. Automatic Retrieval of Topics Using Topic Modeling Techniques from Customer Conversations in the Airline Domain
Lahorka Nikolovski, 2022. Synthetic Data for Domain Adaptation in Neural Machine Translation (pdf)
Lois Rink, 2022. Automatic Classification of Speech Acts in tax service letters
Michiel van Nederpelt, 2022. Evaluating a transformer-based language model under increasingly challenging conditions for the task of offensive language detection
Mira Reisiger, 2022. Context-based entity linking of biomedical text
Myrthe Buckens, 2022. Comparing and Evaluating Language Models for Conversational Data from the Medical Domain
Sharona Badloe, 2022. MedRoBERTa.nl: Transfer Learning From COVID-19 to Cancer Patients
Shuyi Shen, 2022. Data to text generation with a joint entity and relation based method for a job advertisement
Sylvia Pronk, 2022. A detailed comparison between two coreference systems and their effect on key-sentence extraction
Tessel Wisman, 2022. Domain adaptation of end-to-end ASR via n-gram language modelling (pdf)
Yan Chung Li, 2022. A Challenge Set for Natural Language Inference on but-inferred propositions
Aju Shreshta, 2021. BERTje-based Automatic Anonymisation of Dutch Police Reports (pdf)
Breta Micha, 2021. Automatic Terminology Extraction in domain specific texts: a comparison between a rule-based system and a BERT-based system
Dyon van der Ende, 2021. Text Mining for Sustainability: Detecting Corporate Greenwashing with the Sustainable Development Goals
Eva den Uijl, 2021. Detecting Discriminatory Language in Job Advertising Texts
Gabriele Catanese, 2021. A Transfer Learning approach to Aspect Based Sentiment Analysis for airline customer feedbacks
Guido Ansem, 2021. The Effect of Auxiliary Data on Low Resource Languages in Aspect Extraction
Jasmine van Vugt, 2021. Two Dutch fine-tuned BERT models: Named Entity Recognition and Named Entity Linking to increase findability of local geographical information (pdf)
Melisha Lemain-van der Nest, 2021. Named Entity Recognition: identifying NER Indicators in Dutch Police Reports
Michelle Chan, 2021. An Empirical Framework for Topic Modelling for Dutch Texts based on Newspaper Articles on Soil Pollution
Sanne Hamersma, 2021. Explorative analysis of precursors of physical aggression in a health care institute: a Text Mining approach
Stan Frinking, 2021. Using Text Mining Techniques to Detect Fall Events in Medical Patient Notes
Eva Zegelaar, 2020. An Automatic Emotion & Purpose Classifier for Dutch Tweets Written by Members of the Dutch Parliament
Jan van Casteren, 2020. Automatic Attribution Extraction From Dutch News Articles: A Beginning
Luca Meima, 2020. Finding potentially HIV defining conditions in medical reports
Peter Caine, 2020. Mind the gap: A comparison of linguistic vs deep-learning approaches to aspect extraction and aspect category detection