
A Transformer Model for Manifesto Classification Using Cross-Context Training: An Ecuadorian Case Study

  • Universidad San Francisco de Quito
  • University of South Florida

Research output: Contribution to journal › Article › peer-review

Abstract

Content analysis of political manifestos is necessary to understand a party's policies and proposed actions. However, manually labeling political texts is time-consuming and labor-intensive. Transformer networks have become essential tools for automating this task, but they require extensive datasets to perform well. This is a limitation in manifesto classification, where publicly available labeled datasets are scarce. To address this challenge, we developed a Transformer network for manifesto classification using a cross-context training strategy. Using the Comparative Manifesto Project database, we implemented a fractional factorial experimental design to determine which Spanish-language manifestos form the best training set for labeling Ecuadorian manifestos. We also statistically analyzed which Transformer architectures and preprocessing operations improve model accuracy. The results indicate that a training set built from the manifestos of Spain and Uruguay, combined with stemming and lemmatization preprocessing, produces the highest classification accuracy. In addition, the DistilBERT and RoBERTa Transformer networks perform statistically similarly and consistently well in manifesto classification. With the cross-context training strategy, DistilBERT and RoBERTa achieve 60.05% and 57.64% accuracy, respectively, in classifying the Ecuadorian manifesto. Finally, we investigated how the composition of the training set affects performance. The experiments show that training DistilBERT solely on Ecuadorian manifestos achieves the highest accuracy and F1-score, and that, in the absence of the Ecuadorian dataset, training on the datasets from Spain and Uruguay achieves competitive performance.
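
As an illustration of the fractional factorial idea mentioned in the abstract, the following minimal Python sketch builds a two-level half-fraction design in which each factor is the inclusion or exclusion of one country's manifestos in the training set. It is not the authors' design: the number of factors, the placeholder names `country_C` and `country_D`, the helper `half_fraction`, and the choice of generator are all assumptions made for the example.

```python
from itertools import product


def half_fraction(factors):
    """Build a 2^(k-1) half-fraction design.

    The first k-1 factors take every combination of -1 (exclude) and +1
    (include); the last factor is set to the product of the others, which
    corresponds to the defining relation I = (product of all factors).
    """
    *base, last = factors
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        generated = 1
        for level in levels:
            generated *= level
        run = dict(zip(base, levels))
        run[last] = generated
        runs.append(run)
    return runs


# Hypothetical factor list: Spain and Uruguay appear in the abstract,
# country_C and country_D are placeholders for other candidate sources.
factors = ["Spain", "Uruguay", "country_C", "country_D"]
design = half_fraction(factors)

for run in design:
    # Each run names the candidate training set to fit and evaluate.
    included = [name for name, level in run.items() if level == 1]
    print(included)
```

With four factors this yields 8 candidate training sets instead of the 16 required by a full factorial design, at the cost of confounding some interaction effects with one another.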

Original language: English
Pages (from-to): 578-603
Number of pages: 26
Journal: Social Science Computer Review
Volume: 43
Issue number: 3
State: Published - Jun 2025

Keywords

  • DistilBERT
  • RoBERTa
  • Transformer networks
  • cross-context training
  • manifesto classification
  • natural language processing
