TY - JOUR
T1 - A novel approach to predict aquatic toxicity from molecular structure
AU - Castillo-Garit, Juan A.
AU - Marrero-Ponce, Yovani
AU - Escobar, Jeanette
AU - Torrens, Francisco
AU - Rotondo, Richard
N1 - Funding Information:
We sincerely thank Dr. T.W. Schultz for providing reprints of some of his manuscripts, which contributed significantly to the development of this paper. Castillo-Garit thanks the program ‘Estades Temporals per a Investigadors Convidats’ for a fellowship to work at Valencia University in 2008.
PY - 2008/9
Y1 - 2008/9
N2 - The main aim of the study was to develop quantitative structure-activity relationship (QSAR) models for predicting aquatic toxicity using atom-based non-stochastic and stochastic linear indices. The dataset consisted of 392 benzene derivatives, separated into training and test sets, for which toxicity data for the ciliate Tetrahymena pyriformis were available. Using multiple linear regression, two statistically significant QSAR models were obtained with non-stochastic (R² = 0.791 and s = 0.344) and stochastic (R² = 0.799 and s = 0.343) linear indices. Leave-one-out (LOO) cross-validation gave q² = 0.781 (scv = 0.348) and q² = 0.786 (scv = 0.350), respectively. In addition, validation against an external test set yielded significant R²pred values of 0.762 and 0.797. A brief study of the influence of statistical outliers on QSAR model development was also carried out. Finally, our method was compared with other approaches implemented in the Dragon software and achieved better results. The non-stochastic and stochastic linear indices appear to provide an interesting alternative to costly and time-consuming experiments for determining toxicity.
AB - The main aim of the study was to develop quantitative structure-activity relationship (QSAR) models for predicting aquatic toxicity using atom-based non-stochastic and stochastic linear indices. The dataset consisted of 392 benzene derivatives, separated into training and test sets, for which toxicity data for the ciliate Tetrahymena pyriformis were available. Using multiple linear regression, two statistically significant QSAR models were obtained with non-stochastic (R² = 0.791 and s = 0.344) and stochastic (R² = 0.799 and s = 0.343) linear indices. Leave-one-out (LOO) cross-validation gave q² = 0.781 (scv = 0.348) and q² = 0.786 (scv = 0.350), respectively. In addition, validation against an external test set yielded significant R²pred values of 0.762 and 0.797. A brief study of the influence of statistical outliers on QSAR model development was also carried out. Finally, our method was compared with other approaches implemented in the Dragon software and achieved better results. The non-stochastic and stochastic linear indices appear to provide an interesting alternative to costly and time-consuming experiments for determining toxicity.
KW - Atom-based non-stochastic and stochastic linear index
KW - Multiple linear regression
KW - Program TOMOCOMD-CARDD
KW - QSAR
KW - Tetrahymena pyriformis
UR - http://www.scopus.com/inward/record.url?scp=50049105424&partnerID=8YFLogxK
U2 - 10.1016/j.chemosphere.2008.05.024
DO - 10.1016/j.chemosphere.2008.05.024
M3 - Article
C2 - 18597811
AN - SCOPUS:50049105424
SN - 0045-6535
VL - 73
SP - 415
EP - 427
JO - Chemosphere
JF - Chemosphere
IS - 3
ER -