Towards a Mixed Learning Strategy for Discovering New Gene Signatures in Breast Cancer Prognosis

Cristhian Cola-Pilicita, Mateo Martínez-Mejía, Eduardo Alba, Yovani Marrero-Ponce, Noel Pérez-Pérez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This work develops a mixed-learning method that combines a filter-based metaheuristic searcher with a shallow learning classifier to reduce the feature space while maximizing breast cancer prognosis classification performance. The searcher uses a genetic algorithm with averaged versions of the symmetrical uncertainty (aSU) and ReliefF (aReliefF) filter functions as objective functions. Averaging allowed us to measure the per-capita relevance of a group of features (genes). The proposed method was validated on a data set with 396 instances. The most effective classification scheme was the random forest model with 60 tree predictors, using the aReliefF objective function. This configuration achieved an average area under the receiver operating characteristic curve (AUC) of 0.854 and 0.874 in the training and test stages, respectively, making it the best breast cancer prognosis classification strategy among those evaluated. In addition, we identified a set of master genes as the intersection of the features deemed relevant by both objective functions. However, evaluating this subset on the test set with the top-performing classification scheme yielded a lower performance (AUC = 0.829), underscoring the need for additional genes to maximize classification effectiveness.
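As an illustration of the per-capita relevance idea, the following is a minimal sketch of an averaged symmetrical uncertainty score that a genetic algorithm could use as a fitness function. The abstract does not give the exact aSU formula, so the plain mean over the selected features is an assumption, as are the function names (`entropy`, `symmetrical_uncertainty`, `average_su`), the binary-mask chromosome encoding, the use of already-discretized expression values, and the toy data.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy
    return 2.0 * mutual_info / (h_x + h_y)


def average_su(features, labels, mask):
    """Mean SU between each selected feature and the class labels.

    ASSUMPTION: aSU is taken here as a plain mean; the source does not
    state the formula. `mask` is a hypothetical GA-style binary
    chromosome in which mask[i] == 1 selects feature i.
    """
    selected = [f for f, bit in zip(features, mask) if bit]
    if not selected:
        return 0.0  # an empty subset carries no relevance
    total = sum(symmetrical_uncertainty(f, labels) for f in selected)
    return total / len(selected)


# Toy, made-up data: three discretized "genes" over six samples.
labels = [0, 0, 0, 1, 1, 1]
features = [
    [0, 0, 0, 1, 1, 1],  # perfectly informative about the label
    [0, 1, 0, 1, 0, 1],  # weakly informative
    [1, 1, 1, 1, 1, 1],  # constant, uninformative
]

print(average_su(features, labels, [1, 0, 0]))
print(average_su(features, labels, [1, 0, 1]))
```

Because the score is a mean rather than a sum, adding an uninformative gene to a subset lowers the fitness instead of leaving it unchanged, which is one way a searcher could be steered toward smaller subsets.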

Original language: English
Title of host publication: 2024 7th IEEE Biennial Congress of Argentina, ARGENCON 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350365931
State: Published - 2024
Event: 7th IEEE Biennial Congress of Argentina, ARGENCON 2024 - San Nicolas de los Arroyos, Argentina
Duration: 18 Sep 2024 - 20 Sep 2024

Publication series

Name: 2024 7th IEEE Biennial Congress of Argentina, ARGENCON 2024

Conference

Conference: 7th IEEE Biennial Congress of Argentina, ARGENCON 2024
Country/Territory: Argentina
City: San Nicolas de los Arroyos
Period: 18/09/24 - 20/09/24

Keywords

  • Genetic algorithm
  • Metaheuristics
  • Naive Bayes
  • Random forest
  • ReliefF
  • Shallow learning
  • Symmetrical uncertainty
  • k-nearest neighbors
