Abstract
This dissertation develops and validates an end-to-end Natural Language Processing (NLP) and Machine Learning (ML) pipeline for Topic Classification (TC) and Topic-Based Sentiment Analysis (TBSA) on Modern Greek interview data collected during Emergency Remote Teaching (ERT). The research focuses on K-12 education and emphasizes on students with functional diversity, incorporating perspectives from parents of students with functional diversity, school directors, and teachers.The study follows a mixed approach. First, it conducts qualitative descriptive and semantic-content analysis to preserve the richness of human narratives. Then, it implements a computational pipeline that includes corpus construction, sentence-level annotation into topics, sentiment labeling, Greek-specific text preprocessing, feature representation, and supervised modeling. The pipeline supports both classical learners and state of the art transformer-based language models and is designed to be reproducible and ex ...
This dissertation develops and validates an end-to-end Natural Language Processing (NLP) and Machine Learning (ML) pipeline for Topic Classification (TC) and Topic-Based Sentiment Analysis (TBSA) on Modern Greek interview data collected during Emergency Remote Teaching (ERT). The research focuses on K-12 education and emphasizes on students with functional diversity, incorporating perspectives from parents of students with functional diversity, school directors, and teachers.The study follows a mixed approach. First, it conducts qualitative descriptive and semantic-content analysis to preserve the richness of human narratives. Then, it implements a computational pipeline that includes corpus construction, sentence-level annotation into topics, sentiment labeling, Greek-specific text preprocessing, feature representation, and supervised modeling. The pipeline supports both classical learners and state of the art transformer-based language models and is designed to be reproducible and extensible. Particular attention is paid to the linguistic characteristics of Modern Greek and to the principles of transparent annotation with clear guidelines and adjudication. The contributions are threefold. It provides original, topic and sentiment annotated datasets in Modern Greek across educational stakeholder groups. It proposes a practical, end-to-end pipeline that integrates qualitative inquiry with automated analysis forlow-resource languages. It offers empirical evidence that topic-aware modeling yields actionable insights about the educational, technical, and psychological dimensions of ERT for inclusive education. The pipeline is presented as a coherent sequence of stages that, when followed, enables robust topic identification and topic-based sentiment interpretation at scale. The work aims to inform evidence-based decisions on digital preparedness, pedagogical design, and support policies for learners with diverse needs. Each stage of the methodology has been disseminated through peer-reviewed publications by the Candidate, and the dissertation consolidates these advances into a unified framework.Building on this foundation, the study articulated focused research questions that mapped the educational, technical, and affective topics of ERT, and operationalized them through an integrated NLP and ML strategy suited to a low-resource language setting. The corpus consisted of Modern Greek interview transcripts that were carefully transcribed, anonymized, and segmented so that the text body preserved narrative nuance while enabling sentence-level modeling. Annotation guidelines were drafted with explicit decision rules and adjudication, and inter-annotator agreement was assessed to support reliability in both TC and TBSA. Greek-oriented preprocessing ensured compatibility with thelinguistic structure of Modern Greek, and feature representations supported both classical and contextual modeling approaches. The supervised modeling layer combined classicalML (e.g., linear classifiers and tree-based learners) with Deep Learning via transformer language models suitable for Greek, enabling a comparative evaluation of interpretability and predictive performance. Evaluation adopted standard splits and cross-validated estimates, and reported metrics for both TC and TBSA, complemented by Confusion Matrices and threshold-aware diagnostics that clarified error patterns. Results showed that topic-based sentiment modeling improved the interpretability of stakeholder narratives and yielded stable gains for the corpus, particularly in segments where educational practices, digitalinfrastructure, and psychological burden intersected during ERT. The analysis highlighted needs for accessible platforms, differentiated pedagogical design, and targeted support for learners with functional diversity, while also surfacing the role of teacher training and family mediation as recurrent themes. The study documented all preprocessing choices, model configurations, and evaluation artifacts to promote reproducibility and extension to adjacent educational corpora. Ethical safeguards guided all stages of the study: informed consent, strict anonymization, secure storage, and transparent documentation of the annotation process. In sum, the dissertation delivered a reproducible, extensible pipeline that bridged qualitative inquiry with automated analysis, advanced TC and TBSA for Modern Greek educational interviews, and provided evidence for inclusive education and digital preparedness. By consolidating peer-reviewed components into a single framework and releasing topic and sentiment annotated datasets, the work created a practical path from corpus construction to deployable insight for policy and pedagogy in K-12 education.
show more