TY - JOUR
T1 - N-linear algebraic maps for chemical structure codification
T2 - A suitable generalization for atom-pair approaches?
AU - García-Jacas, César R.
AU - Marrero-Ponce, Yovani
AU - Barigye, Stephen J.
AU - Valdés-Martiní, José R.
AU - Rivera-Borroto, Oscar M.
AU - Olivero-Verbel, Jesús
N1 - Publisher Copyright:
© 2014 Bentham Science Publishers.
PY - 2014/1/1
Y1 - 2014/1/1
N2 - This manuscript introduces, for the first time, a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS) based on tensor concepts, using the three-linear and four-linear algebraic forms as specific cases of n-linear maps. To this end, the kth three-tuple and four-tuple spatial-(dis)similarity matrices are defined as tensors of order 3 and 4, respectively, to represent 3D information among "three and four" atoms of the molecular structures. Several measures (multi-metrics) for establishing (dis)similarity relations among "three and four" atoms are discussed, along with normalization schemes for the n-tuple spatial-(dis)similarity matrices based on the simple-stochastic and mutual probability algebraic transformations. To consider specific interactions among atoms, for both the global and local indices, n-tuple path and length cut-off constraints are introduced. This algebraic scaffold can also be seen as a generalization of the vector-matrix-vector multiplication procedure (which is a matrix representation of the traditional linear, quadratic and bilinear forms) for the calculation of molecular descriptors, and is thus a new theoretical approach with a methodological contribution. A variability analysis based on Shannon's entropy reveals that the best distributions are achieved with the ternary and quaternary measures corresponding to the bond and dihedral angles. In addition, the proposed indices show better entropy behavior than the descriptors calculated by other programs used in chemo-informatics studies, such as DRAGON, PADEL, and Mold2. A principal component analysis shows that the novel 3D n-tuple indices codify the same information captured by the DRAGON 3D-indices, as well as information not codified by the latter. To further assess the contribution of the novel molecular parameters, a QSAR study was performed on the binding affinity to the corticosteroid-binding globulin, using Cramer's steroid database. 
The results reveal superior statistical parameters for the Bond Angle and Dihedral Angle approaches, consistent with the variability analysis. Finally, the QuBiLS-MIDAS models outperform all 3D-QSAR methods reported in the literature that use the 31 steroids as the training set; for the popular division of Cramer's database into training (1-21) and test (22-31) sets, comparable-to-superior results are obtained in predicting the activity of the steroids. These results suggest that the proposed QuBiLS-MIDAS N-tuples indices are a useful tool for chemo-informatics studies.
AB - This manuscript introduces, for the first time, a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS) based on tensor concepts, using the three-linear and four-linear algebraic forms as specific cases of n-linear maps. To this end, the kth three-tuple and four-tuple spatial-(dis)similarity matrices are defined as tensors of order 3 and 4, respectively, to represent 3D information among "three and four" atoms of the molecular structures. Several measures (multi-metrics) for establishing (dis)similarity relations among "three and four" atoms are discussed, along with normalization schemes for the n-tuple spatial-(dis)similarity matrices based on the simple-stochastic and mutual probability algebraic transformations. To consider specific interactions among atoms, for both the global and local indices, n-tuple path and length cut-off constraints are introduced. This algebraic scaffold can also be seen as a generalization of the vector-matrix-vector multiplication procedure (which is a matrix representation of the traditional linear, quadratic and bilinear forms) for the calculation of molecular descriptors, and is thus a new theoretical approach with a methodological contribution. A variability analysis based on Shannon's entropy reveals that the best distributions are achieved with the ternary and quaternary measures corresponding to the bond and dihedral angles. In addition, the proposed indices show better entropy behavior than the descriptors calculated by other programs used in chemo-informatics studies, such as DRAGON, PADEL, and Mold2. A principal component analysis shows that the novel 3D n-tuple indices codify the same information captured by the DRAGON 3D-indices, as well as information not codified by the latter. To further assess the contribution of the novel molecular parameters, a QSAR study was performed on the binding affinity to the corticosteroid-binding globulin, using Cramer's steroid database. 
The results reveal superior statistical parameters for the Bond Angle and Dihedral Angle approaches, consistent with the variability analysis. Finally, the QuBiLS-MIDAS models outperform all 3D-QSAR methods reported in the literature that use the 31 steroids as the training set; for the popular division of Cramer's database into training (1-21) and test (22-31) sets, comparable-to-superior results are obtained in predicting the activity of the steroids. These results suggest that the proposed QuBiLS-MIDAS N-tuples indices are a useful tool for chemo-informatics studies.
KW - 3D Three-linear and four-linear indices
KW - Aggregation operator
KW - Cramer's steroid
KW - N-tuple simple stochastic and mutual probability matrices
KW - N-tuple spatial-(Dis)similarity matrix
KW - Principal component analysis
KW - QSAR
KW - QuBiLS-MIDAS N-tuples
KW - Shannon entropy
KW - TOMOCOMD-CARDD
KW - Variability analysis
UR - http://www.scopus.com/inward/record.url?scp=84921059417&partnerID=8YFLogxK
U2 - 10.2174/1389200215666140605124506
DO - 10.2174/1389200215666140605124506
M3 - Article
C2 - 24909423
AN - SCOPUS:84921059417
SN - 1389-2002
VL - 15
SP - 441
EP - 469
JO - Current Drug Metabolism
JF - Current Drug Metabolism
IS - 4
ER -