TY - JOUR
T1 - Shannon's, mutual, conditional and joint entropy information indices
T2 - Generalization of global indices defined from local vertex invariants
AU - Barigye, Stephen J.
AU - Marrero-Ponce, Yovani
AU - Martínez Santiago, Oscar
AU - Martínez López, Yoan
AU - Pérez-Giménez, Facundo
AU - Torrens, Francisco
PY - 2013
Y1 - 2013
N2 - A new mathematical approach is proposed in the definition of molecular descriptors (MDs) based on the application of information theory concepts. This approach stems from a new matrix representation of a molecular graph (G) which is derived from the generalization of an incidence matrix whose row entries correspond to connected subgraphs of a given G, and the calculation of the Shannon's entropy, the negentropy and the standardized information content, plus for the first time, the mutual, conditional and joint entropy-based MDs associated with G. We also define strategies that generalize the definition of global or local invariants from atomic contributions (local vertex invariants, LOVIs), introducing related metrics (norms), means and statistical invariants. These invariants are applied to a vector whose components express the atomic information content calculated using the Shannon's, mutual, conditional and joint entropy-based atomic information indices. The novel information indices (IFIs) are implemented in the program TOMOCOMD-CARDD. A principal component analysis reveals that the novel IFIs are capable of capturing structural information not codified by IFIs implemented in the software DRAGON. A comparative study of the different parameters (e.g. subgraph orders and/or types, invariants and class of MDs) used in the definition of these IFIs reveals several interesting results. The mutual entropy-based indices give the best correlation results in modeling of a physicochemical property, namely the partition coefficient of the 34 derivatives of 2-furylethylenes, among the classes of indices investigated in this study. In a comparison with classical MDs it is demonstrated that the new IFIs give good results for various QSPR models.
AB - A new mathematical approach is proposed in the definition of molecular descriptors (MDs) based on the application of information theory concepts. This approach stems from a new matrix representation of a molecular graph (G) which is derived from the generalization of an incidence matrix whose row entries correspond to connected subgraphs of a given G, and the calculation of the Shannon's entropy, the negentropy and the standardized information content, plus for the first time, the mutual, conditional and joint entropy-based MDs associated with G. We also define strategies that generalize the definition of global or local invariants from atomic contributions (local vertex invariants, LOVIs), introducing related metrics (norms), means and statistical invariants. These invariants are applied to a vector whose components express the atomic information content calculated using the Shannon's, mutual, conditional and joint entropy-based atomic information indices. The novel information indices (IFIs) are implemented in the program TOMOCOMD-CARDD. A principal component analysis reveals that the novel IFIs are capable of capturing structural information not codified by IFIs implemented in the software DRAGON. A comparative study of the different parameters (e.g. subgraph orders and/or types, invariants and class of MDs) used in the definition of these IFIs reveals several interesting results. The mutual entropy-based indices give the best correlation results in modeling of a physicochemical property, namely the partition coefficient of the 34 derivatives of 2-furylethylenes, among the classes of indices investigated in this study. In a comparison with classical MDs it is demonstrated that the new IFIs give good results for various QSPR models.
KW - Conditional entropy
KW - Frequency matrix
KW - Joint entropy
KW - Mutual entropy
KW - Principal component analysis
KW - QSPR
KW - Shannon's entropy
KW - Structural descriptor
KW - Subgraph
UR - http://www.scopus.com/inward/record.url?scp=84888047264&partnerID=8YFLogxK
U2 - 10.2174/1573409911309020003
DO - 10.2174/1573409911309020003
M3 - Article
C2 - 23700990
AN - SCOPUS:84888047264
SN - 1573-4099
VL - 9
SP - 164
EP - 183
JO - Current Computer-Aided Drug Design
JF - Current Computer-Aided Drug Design
IS - 2
ER -