Abstract
Content analysis of political manifestos is necessary to understand a party's policies and proposed actions. However, manually labeling political texts is time-consuming and labor-intensive. Transformer networks have become essential tools for automating this task, but these models require extensive datasets to achieve good performance. This is a limitation in manifesto classification, where publicly available labeled datasets are scarce. To address this challenge, we developed a Transformer network for manifesto classification using a cross-context training strategy. Using the Comparative Manifesto Project database, we implemented a fractional factorial experimental design to determine which Spanish-language manifestos form the best training set for Ecuadorian manifesto labeling. We also statistically analyzed which Transformer architecture and preprocessing operations improve model accuracy. The results indicate that a training set built from the manifestos of Spain and Uruguay, combined with stemming and lemmatization as preprocessing operations, produces the highest classification accuracy. In addition, we found that the DistilBERT and RoBERTa Transformer networks perform statistically similarly and consistently well in manifesto classification. Using the cross-context training strategy, DistilBERT and RoBERTa achieve 60.05% and 57.64% accuracy, respectively, in the classification of the Ecuadorian manifesto. Finally, we investigated the effect of training set composition on performance. The experiments demonstrate that training DistilBERT solely with Ecuadorian manifestos achieves the highest accuracy and F1-score. In the absence of the Ecuadorian dataset, competitive performance is achieved by training the model with datasets from Spain and Uruguay.
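To illustrate what a fractional factorial design looks like in this setting, the following minimal sketch builds a two-level half-fraction design in Python. The four factors (whether to include the Spanish and Uruguayan manifestos, and whether to apply stemming and lemmatization) are hypothetical stand-ins drawn from terms in the abstract; the study's actual factor list, levels, and design generator are not stated here, so the defining relation used below is an assumption.

```python
from itertools import product

# Hypothetical two-level factors (-1 = off/excluded, +1 = on/included).
# These names are illustrative assumptions, not the study's real factor list.
BASE_FACTORS = ["spain", "uruguay", "stemming"]
GENERATED_FACTOR = "lemmatization"  # assumed generator: product of the base factors


def half_fraction_design():
    """Build a 2^(4-1) design: a full factorial in the three base factors,
    with the fourth factor set to the product of the other three."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE_FACTORS)):
        run = dict(zip(BASE_FACTORS, levels))
        generated = 1
        for level in levels:
            generated *= level
        run[GENERATED_FACTOR] = generated
        runs.append(run)
    return runs


design = half_fraction_design()
for run in design:
    print(run)
```

Each printed row is one experimental configuration. The design covers four factors in 8 runs instead of the 16 a full factorial would need, at the cost of confounding the fourth factor's main effect with the three-way interaction of the others.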
| Original language | English |
|---|---|
| Pages (from-to) | 578-603 |
| Number of pages | 26 |
| Journal | Social Science Computer Review |
| Volume | 43 |
| Issue number | 3 |
| State | Published - Jun 2025 |
Keywords
- DistilBERT
- RoBERTa
- Transformer networks
- cross-context training
- manifesto classification
- natural language processing
Press/Media
- Universidad San Francisco de Quito Researcher Updates Current Data on Social Science and Computers (A Transformer Model for Manifesto Classification Using Cross-Context Training: An Ecuadorian Case Study)
  Riofrío, D., Benítez, D., Baldeón Calisto, M., Navarrete, D., Flores Moyano, R., Medina Pérez, P. & Pérez, N.
  9/08/24