IMMAN: free software for information theory-based chemometric analysis

  • Ricardo W. Pino Urias
  • Stephen J. Barigye
  • Yovani Marrero-Ponce*
  • César R. García-Jacas
  • José R. Valdes-Martiní
  • Facundo Perez-Gimenez
  • *Corresponding author for this work
  • Unit of Computer-Aided Molecular “Biosilico” Discovery and Bioinformatic Research (CAMD-BIR International)
  • Universidad Central Marta Abreu de Las Villas
  • Universidade Federal de Lavras
  • Universitat de València
  • Universidad Tecnológica de Bolívar
  • Universidad de las Ciencias Informáticas

Research output: Contribution to journal › Article › peer-review

54 Scopus citations

Abstract

The features and theoretical background of a new, free computational program for chemometric analysis named IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches each. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. In addition, a generalization scheme for the previously defined differential Shannon's entropy is discussed, and the Jeffreys information measure is introduced for supervised feature selection. Well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty, are also incorporated into the IMMAN software (http://mobiosd-hub.com/imman-soft/), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing. Single-parameter or ensemble (multi-criteria) ranking options are also provided. Consequently, this software is suitable for tasks such as dimensionality reduction, feature ranking, and comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between the IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA supervised algorithms.
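
To make the supervised parameters named in the abstract concrete, the following minimal Python sketch computes information gain, gain ratio, and symmetrical uncertainty for one numeric feature after equal-interval discretization. It uses the standard textbook definitions and is not IMMAN's implementation (IMMAN itself is written in Java). The function names, the `n_bins` parameter, and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter


def equal_interval_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def conditional_entropy(labels, bins):
    """H(labels | bins): entropy of the labels within each bin, weighted by bin size."""
    n = len(labels)
    groups = {}
    for b, y in zip(bins, labels):
        groups.setdefault(b, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def supervised_scores(values, labels, n_bins=2):
    """Information gain, gain ratio, and symmetrical uncertainty for one feature."""
    bins = equal_interval_bins(values, n_bins)  # hypothetical default of 2 bins
    h_y = entropy(labels)
    h_x = entropy(bins)
    ig = h_y - conditional_entropy(labels, bins)
    gain_ratio = ig / h_x if h_x > 0 else 0.0
    sym_unc = 2 * ig / (h_x + h_y) if (h_x + h_y) > 0 else 0.0
    return {
        "information_gain": ig,
        "gain_ratio": gain_ratio,
        "symmetrical_uncertainty": sym_unc,
    }


# Toy data (made up): one feature that separates the classes, one that does not.
labels = [0, 0, 1, 1]
informative = supervised_scores([0.1, 0.2, 0.8, 0.9], labels)
uninformative = supervised_scores([0.1, 0.9, 0.1, 0.9], labels)
```

Ranking features by any one of these scores, or by a combination of several, corresponds to the single-parameter and ensemble ranking options the abstract describes, although the exact combination rule used by IMMAN is not specified here.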

Original language: English
Pages (from-to): 305-319
Number of pages: 15
Journal: Molecular Diversity
Volume: 19
Issue number: 2
State: Published - 1 May 2015
Externally published: Yes

Keywords

  • Chemometric analysis
  • Classification
  • Computational program
  • Feature selection
  • IMMAN
  • Information-theoretic function
