Affiliation: Chemical Engineering Division, CSIR-National Chemical Laboratory, Pune – 411008, India.
An accurate prediction of the pharmacokinetic properties of orally administered drugs is of paramount importance in the pharmaceutical industry. Caco-2 cell permeability is a well-established parameter for assessing the absorption profiles of lead molecules. Because of restrictions on animal testing, prohibitive in situ models and ethical issues, the development of predictive models is essential. Genetic programming (GP) is an artificial intelligence (AI)-based, exclusively data-driven modeling paradigm. Given example input-output data, it searches for and optimizes both the structure and the parameters of a well-fitting linear or non-linear input-output model. Despite this novelty, GP has not been widely exploited in drug design. Accordingly, in this study we propose a GP-based approach for the in silico prediction of Caco-2 cell permeability using a diverse set of molecules. The GP model yielded high correlation coefficients and low root-mean-square errors (RMSE) on both the training and test sets, indicating accurate Caco-2 permeability prediction and good generalization performance. Its predictions were better than or comparable to those of artificial neural network (ANN) and support vector regression (SVR) methods. The GP-based modeling approach illustrated here will find diverse applications in QSAR, QSPR and QSTR modeling for the virtual screening of large libraries.
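To make concrete the idea of searching both the structure and the parameters of a model from example input-output data, the following is a minimal sketch in Python. It is not the authors' implementation. It is a simplified, mutation-only, elitist search over expression trees, and it omits the crossover operator that a full GP run would normally include. All names (random_tree, evaluate, rmse, mutate, evolve), the three-operator function set, the mutation rates and the population settings are assumptions chosen for illustration. The data are synthetic and unrelated to Caco-2 permeability.

```python
import math
import random

# Binary function set (an illustrative assumption; real GP runs often use more).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
OP_NAMES = sorted(OPS)


def random_tree(rng, n_inputs, depth):
    """Grow a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ("x", rng.randrange(n_inputs))  # input variable (structure)
        return ("c", rng.uniform(-2.0, 2.0))       # numeric constant (parameter)
    return (rng.choice(OP_NAMES),
            random_tree(rng, n_inputs, depth - 1),
            random_tree(rng, n_inputs, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree on one input vector x."""
    kind = tree[0]
    if kind == "x":
        return x[tree[1]]
    if kind == "c":
        return tree[1]
    return OPS[kind](evaluate(tree[1], x), evaluate(tree[2], x))


def rmse(tree, X, y):
    """Root-mean-square error; non-finite results are scored as infinitely bad."""
    total = 0.0
    for xi, yi in zip(X, y):
        d = evaluate(tree, xi) - yi
        total += d * d
    err = math.sqrt(total / len(y))
    return err if math.isfinite(err) else math.inf


def mutate(rng, tree, n_inputs):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if tree[0] in ("x", "c") or rng.random() < 0.3:
        return random_tree(rng, n_inputs, 2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, n_inputs), right)
    return (op, left, mutate(rng, right, n_inputs))


def evolve(X, y, generations=40, pop_size=60, seed=0):
    """Elitist, mutation-only search; returns (best_tree, best_rmse)."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    pop = [random_tree(rng, n_inputs, 3) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: rmse(t, X, y))
        survivors = pop[: pop_size // 2]           # keep the better half
        children = [mutate(rng, rng.choice(survivors), n_inputs)
                    for _ in survivors]
        pop = survivors + children
    best = min(pop, key=lambda t: rmse(t, X, y))
    return best, rmse(best, X, y)


# Synthetic toy data (not permeability data): y = x0 * x1 + 0.5 * x0.
data_rng = random.Random(1)
X = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(30)]
y = [a * b + 0.5 * a for a, b in X]

best_tree, best_err = evolve(X, y)
print(best_tree, best_err)
```

Because the better half of each generation is carried over unchanged, the best training RMSE cannot get worse from one generation to the next. In a real study the selected model would also be scored on a held-out test set, as the abstract describes, to check generalization.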