Measurement and ANN prediction of pH-dependent solubility of nitrogen-heterocyclic compounds.

PMID 25985098


Based on the solubility of 25 nitrogen-heterocyclic compounds (NHCs) measured by the saturation shake-flask method, an artificial neural network (ANN) was employed to study the quantitative relationship between the structure and pH-dependent solubility of NHCs. Using a genetic algorithm-multivariate linear regression (GA-MLR) approach, five of the 1497 molecular descriptors computed by the Dragon software were selected to describe the molecular structures of the NHCs. With these five descriptors, pH, and the partial charge on the nitrogen atom of the NHCs (QN) as ANN inputs, a quantitative structure-property relationship (QSPR) model that does not rely on the Henderson-Hasselbalch (HH) equation was developed to predict the solubility of NHCs in aqueous solutions of different pH. The model performed well on the 25 model NHCs, with an absolute average relative deviation (AARD) of 5.9%, whereas the HH approach gave an AARD of 36.9% for the same compounds. QN was found to play a very important role in describing the NHCs; with QN included, the ANN is a potential tool for predicting the pH-dependent solubility of NHCs.
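
To make the two quantities compared above concrete, the following is a minimal sketch, not the authors' code. It assumes the common textbook form of the HH solubility relation for a monoprotic base, S = S0 * (1 + 10^(pKa - pH)), and the usual definition of AARD as the mean of |predicted - measured| / measured, expressed in percent. The abstract does not state which HH form or which exact AARD formula was used, so both are assumptions, and the function names and numbers are hypothetical.

```python
def hh_solubility_monoprotic_base(s0: float, pka: float, ph: float) -> float:
    """Total solubility of a monoprotic base from the Henderson-Hasselbalch relation.

    Assumption (not from the abstract): S = S0 * (1 + 10**(pKa - pH)),
    where S0 is the intrinsic solubility of the neutral form and pKa
    refers to the conjugate acid.
    """
    return s0 * (1.0 + 10.0 ** (pka - ph))


def aard_percent(predicted: list[float], measured: list[float]) -> float:
    """Absolute average relative deviation, in percent.

    Assumption (not from the abstract): AARD = 100/N * sum(|pred - meas| / |meas|).
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Hypothetical values for illustration only; they are not data from the study.
    s0, pka = 1.0e-3, 5.0
    ph_values = [3.0, 5.0, 7.0]
    hh_pred = [hh_solubility_monoprotic_base(s0, pka, ph) for ph in ph_values]
    measured = [0.08, 2.1e-3, 1.0e-3]
    print(f"AARD of HH prediction: {aard_percent(hh_pred, measured):.1f}%")
```

Under this form, the predicted solubility at pH = pKa is twice the intrinsic solubility, and it approaches S0 as pH rises well above pKa. An ANN-based QSPR model would replace `hh_solubility_monoprotic_base` with a trained regressor taking the descriptors, pH, and QN as inputs, while `aard_percent` would be applied unchanged to either set of predictions.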