TY - GEN
T1 - A Zoom into Ecuadorian Politics
T2 - 13th IEEE International Conference on Pattern Recognition Systems, ICPRS 2023
AU - Barzallo, Fernanda
AU - Moscoso, María Emilia
AU - Pérez, Margorie
AU - Baldeon-Calisto, María
AU - Navarrete, Danny
AU - Riofrío, Daniel
AU - Medina-Pérez, Pablo
AU - Lai-Yuen, Susana K.
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023/7/4
Y1 - 2023/7/4
N2 - Political science research on party manifestos helps to understand a candidate’s strategies and proposed actions during electoral campaigns. To achieve this, political scientists classify a manifesto’s sentences and quasi-sentences into seven main domains established by the Comparative Manifesto Project. However, manual coding is a time-consuming and labor-intensive task that can lead to biases. Automatic manifesto classification has been shown to produce good and reproducible annotations. In Ecuador, research on automatic manifesto analysis has been limited, and there is no large labeled Ecuadorian corpus available for training. Therefore, in this work we develop a Transformer network for automatically analyzing Ecuadorian manifestos using a cross-domain training approach. We implement a fractional factorial experimental design to determine which Transformer model, type of pre-processing operations, and Spanish text data should be used to maximize the accuracy of the classification model. The results show that the DistilBERT architecture trained with Mexico’s and Argentina’s manifestos increases the classification accuracy. Without using an Ecuadorian corpus for training, the implemented DistilBERT achieves 44% accuracy on the Ecuadorian test set, a performance comparable to other models in the literature trained with a Spanish corpus.
AB - Political science research on party manifestos helps to understand a candidate’s strategies and proposed actions during electoral campaigns. To achieve this, political scientists classify a manifesto’s sentences and quasi-sentences into seven main domains established by the Comparative Manifesto Project. However, manual coding is a time-consuming and labor-intensive task that can lead to biases. Automatic manifesto classification has been shown to produce good and reproducible annotations. In Ecuador, research on automatic manifesto analysis has been limited, and there is no large labeled Ecuadorian corpus available for training. Therefore, in this work we develop a Transformer network for automatically analyzing Ecuadorian manifestos using a cross-domain training approach. We implement a fractional factorial experimental design to determine which Transformer model, type of pre-processing operations, and Spanish text data should be used to maximize the accuracy of the classification model. The results show that the DistilBERT architecture trained with Mexico’s and Argentina’s manifestos increases the classification accuracy. Without using an Ecuadorian corpus for training, the implemented DistilBERT achieves 44% accuracy on the Ecuadorian test set, a performance comparable to other models in the literature trained with a Spanish corpus.
KW - DistilBERT
KW - Manifesto Text Classification
KW - Natural Language Processing
KW - RoBERTa
KW - Transformer Networks
UR - http://www.scopus.com/inward/record.url?scp=85170827637&partnerID=8YFLogxK
U2 - 10.1109/ICPRS58416.2023.10213340
DO - 10.1109/ICPRS58416.2023.10213340
M3 - Conference contribution
AN - SCOPUS:85170827637
T3 - 2023 IEEE 13th International Conference on Pattern Recognition Systems (ICPRS)
BT - 2023 IEEE 13th International Conference on Pattern Recognition Systems, ICPRS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 4 July 2023 through 7 July 2023
ER -