TY - JOUR
T1 - Optimum search strategies or novel 3D molecular descriptors
T2 - Is there a stalemate?
AU - Marrero-Ponce, Yovani
AU - García-Jacas, César R.
AU - Barigye, Stephen J.
AU - Valdés-Martiní, José R.
AU - Rivera-Borroto, Oscar Miguel
AU - Pino-Urias, Ricardo W.
AU - Cubillán, Néstor
AU - Alvarado, Ysaías J.
AU - Le-Thi-Thu, Huong
N1 - Publisher Copyright: © 2015 Bentham Science Publishers.
PY - 2015/12/1
Y1 - 2015/12/1
N2 - The present manuscript describes a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS Duplex) based on algebraic bilinear, quadratic and linear forms on the kth two-tuple spatial-(dis)similarity matrix. Generalization schemes for the inter-atomic spatial distance using diverse (dis)similarity measures are discussed. In addition, normalization approaches for the two-tuple spatial-(dis)similarity matrix using simple- and double-stochastic and mutual probability schemes are introduced. To take particular inter-atomic interactions into account in total or local-fragment indices, path and length cut-off constraints are used. Also, in order to generalize the use of the linear combination of atom-level indices to yield global (molecular) definitions, a set of aggregation operators (invariants) is applied. A Shannon’s entropy-based variability study of the proposed 3D algebraic form-based indices and the DRAGON molecular descriptor families demonstrates superior performance for the former. A principal component analysis reveals that the novel indices codify structural information orthogonal to that captured by the DRAGON indices. Finally, a QSAR study of the binding affinity to the corticosteroid-binding globulin using Cramer’s steroid database is performed. This study reveals that the QuBiLS-MIDAS Duplex approach yields performance statistics similar or superior to those of all the 3D-QSAR methods reported in the literature so far, even with fewer degrees of freedom, using both the 31 steroids as the training set and the popular division of Cramer’s database into training [1-21] and test [22-31] sets. It is thus expected that this methodology provides useful tools for the diversity analysis of compound datasets and high-throughput screening structure–activity data.
AB - The present manuscript describes a novel alignment-free 3D-QSAR method (QuBiLS-MIDAS Duplex) based on algebraic bilinear, quadratic and linear forms on the kth two-tuple spatial-(dis)similarity matrix. Generalization schemes for the inter-atomic spatial distance using diverse (dis)similarity measures are discussed. In addition, normalization approaches for the two-tuple spatial-(dis)similarity matrix using simple- and double-stochastic and mutual probability schemes are introduced. To take particular inter-atomic interactions into account in total or local-fragment indices, path and length cut-off constraints are used. Also, in order to generalize the use of the linear combination of atom-level indices to yield global (molecular) definitions, a set of aggregation operators (invariants) is applied. A Shannon’s entropy-based variability study of the proposed 3D algebraic form-based indices and the DRAGON molecular descriptor families demonstrates superior performance for the former. A principal component analysis reveals that the novel indices codify structural information orthogonal to that captured by the DRAGON indices. Finally, a QSAR study of the binding affinity to the corticosteroid-binding globulin using Cramer’s steroid database is performed. This study reveals that the QuBiLS-MIDAS Duplex approach yields performance statistics similar or superior to those of all the 3D-QSAR methods reported in the literature so far, even with fewer degrees of freedom, using both the 31 steroids as the training set and the popular division of Cramer’s database into training [1-21] and test [22-31] sets. It is thus expected that this methodology provides useful tools for the diversity analysis of compound datasets and high-throughput screening structure–activity data.
KW - 3D-QSAR
KW - Aggregation operator
KW - Alignment free method
KW - Minkowski distance matrix
KW - Principal component analysis
KW - QuBiLS-MIDAS
KW - TOMOCOMD-CARDD
KW - Two-tuple spatial-(dis)similarity matrix
KW - Variability analysis
UR - http://www.scopus.com/inward/record.url?scp=84927733368&partnerID=8YFLogxK
U2 - 10.2174/1574893610666151008011457
DO - 10.2174/1574893610666151008011457
M3 - Article
AN - SCOPUS:84927733368
SN - 1574-8936
VL - 10
SP - 533
EP - 564
JO - Current Bioinformatics
JF - Current Bioinformatics
IS - 5
ER -