Elsevier

Computers & Chemistry

Volume 26, Issue 4, June 2002, Pages 327-332
Improved QSPR analysis of standard entropy of acyclic and aromatic compounds using optimized correlation weights of linear graph invariants

https://doi.org/10.1016/S0097-8485(01)00121-8

Abstract

Entropy is calculated for a representative set of acyclic and aromatic compounds within the realm of QSAR/QSPR theory. Flexible topological molecular indices are chosen as independent variables in the fitting equations. The comparison with results derived from quantum mechanical calculations shows that the present approach gives better predictions. Some possible future extensions are pointed out.

Introduction

The study of Quantitative Structure Activity Relationships (QSAR) and Quantitative Structure Property Relationships (QSPR) continues to attract considerable attention in the chemical literature. Various statistical methods have proved useful in such studies, including Multiple Regression Analysis (Malinowski, 1991), Principal Component Analysis (Hotelling, 1933), Pattern Recognition (Wold and Sjostrom, 1977), the Partial Least Squares method (Wold et al., 1998), and Artificial Neural Networks (Zupan, 1998). Since Hansch and Fujita performed their pioneering studies on QSAR/QSPR (Hansch and Fujita, 1964), advances in the field have not ceased. The predictive capabilities of the earliest models improved substantially when new topological descriptors were introduced, providing powerful alternatives to the use of extrathermodynamic parameters in QSAR/QSPR studies (Randic, 1998).

A topological index merely represents a mathematical property of a structure. The real challenge in QSAR/QSPR studies, and the place where ingenuity is required, is to find, design, or recognize descriptors that parallel molecular properties of interest in chemistry and drug design. Topological indices should have a structural interpretation, should have discriminating power, and should either correlate with some molecular property or considerably improve correlations when combined with other descriptors. The alternatives to mathematical descriptors derived from molecular graphs are the traditional physical chemistry descriptors, such as the partition coefficient, the Hammett sigma value, Taft's parameters, the steric parameter for the proximity of substituents on reaction sites, molar refractivity, polarizability, density, etc., and quantum chemically computed parameters, such as electronic charge indices, frontier orbital energies (HOMO–LUMO), bond orders, etc. It must be pointed out that quantum chemical properties are also molecular descriptors.
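As an illustration of what "a mathematical property of a structure" means in practice, the following minimal sketch computes two classical topological indices, the Wiener index and the Randic connectivity index, for the hydrogen-suppressed graphs of n-butane and isobutane. These two indices are chosen here only as familiar examples and are not the descriptor used in this paper; the function names and the adjacency-list encoding are assumptions made for the sketch.

```python
from collections import deque
from math import sqrt


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for source in adj:
        # Breadth-first search gives topological distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


def randic_index(adj):
    """Sum over edges (u, v) of 1 / sqrt(deg(u) * deg(v))."""
    return sum(
        1.0 / sqrt(len(adj[u]) * len(adj[v]))
        for u in adj
        for v in adj[u]
        if u < v  # visit each edge once
    )


# Hydrogen-suppressed carbon skeletons as adjacency lists (illustrative encoding).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

for name, graph in (("n-butane", n_butane), ("isobutane", isobutane)):
    print(name, wiener_index(graph), round(randic_index(graph), 4))
```

Both indices take different values for the two isomers, which is the kind of discriminating power the text asks of a topological index.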

Today there are more than 1000 molecular descriptors available (Todeschini, 2000, Basak et al., 2001, Katritzky et al., 1995), so one is immediately faced with the problem of selecting descriptors. Which ones, and how many, are the obvious questions. There are many issues concerning these points (Randic, 1991a), but basically there are three options: (a) select a subset of the best descriptors from a large pool of available descriptors, (b) use a limited set of descriptors, and (c) use as few descriptors as possible, suitably optimized for each particular application. The first option is based on the fact that no single descriptor embodies all the information related to the activity/property under study. The second and third options resort to a more ‘economic’ choice. However, one is immediately confronted with the question: which are the best descriptors? Evidently, the answer depends upon the property/activity to be analyzed. Option (c) tries to answer it by means of so-called ‘flexible descriptors’, which are chosen so as to be specific for each property/activity. The next section deals with this point.
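To make option (c) more concrete, here is a minimal sketch of a flexible descriptor, assuming a simple form in which each atom contributes a tunable weight selected by its vertex degree, and the weights are adjusted by a random search to maximize the correlation with the target property. This is not the authors' algorithm or data: the function names, the search settings, and the target values are all hypothetical, and the target is a synthetic quantity generated from hidden weights rather than measured entropies.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def descriptor(degrees, weights):
    """Flexible descriptor: sum of the weights assigned to each vertex degree."""
    return sum(weights[d] for d in degrees)


def optimize_weights(molecules, target, n_steps=5000, step=0.1, seed=0):
    """Random search: keep a perturbed weight only if the correlation improves."""
    rng = random.Random(seed)
    weights = {d: 1.0 for d in (1, 2, 3, 4)}
    best = pearson_r([descriptor(m, weights) for m in molecules], target)
    for _ in range(n_steps):
        d = rng.choice((1, 2, 3, 4))
        old = weights[d]
        weights[d] = old + rng.uniform(-step, step)
        r = pearson_r([descriptor(m, weights) for m in molecules], target)
        if r > best:
            best = r
        else:
            weights[d] = old  # reject the move
    return weights, best


# Vertex degrees of hydrogen-suppressed alkane skeletons (ethane to hexane isomers).
MOLECULES = [
    [1, 1], [1, 2, 1], [1, 2, 2, 1], [1, 1, 1, 3], [1, 2, 2, 2, 1],
    [1, 1, 3, 2, 1], [1, 1, 1, 1, 4], [1, 2, 2, 2, 2, 1],
    [1, 1, 3, 3, 1, 1], [1, 1, 1, 4, 2, 1],
]

# SYNTHETIC target built from hidden weights; these are not experimental values.
HIDDEN = {1: 2.0, 2: 1.0, 3: -0.5, 4: -2.0}
TARGET = [descriptor(m, HIDDEN) for m in MOLECULES]

# With all weights equal to 1 the descriptor is just the number of carbon atoms.
initial_r = pearson_r([len(m) for m in MOLECULES], TARGET)
weights, best_r = optimize_weights(MOLECULES, TARGET)
print("initial r:", initial_r)
print("optimized r:", best_r)
```

The design point is that the same graph invariant (here, vertex degree) yields a different descriptor for each property, because the weights are refitted every time; a fixed index such as the atom count cannot adapt in this way.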

It is well known that the standard entropy of substances and molecules is one of the most important thermodynamic properties in chemistry and related sciences (environmental sciences, chemical engineering, biochemistry, etc.). In some cases the measurement of this property involves experimental difficulties. Consequently, a reliable theoretical tool for predicting standard entropies is needed. However, the study of this quantity within the realm of QSPR theory has not received wide attention, although other closely related thermodynamic quantities (such as enthalpies of formation and free energies) have been studied to a large extent. The aim of this work is to make an improved QSPR analysis of the standard entropy of acyclic and aromatic compounds using optimized correlation weights of linear graph invariants, comparing the results with those obtained via quantum chemical evaluation.

The paper is organized as follows. The next section deals with the concept of flexible descriptors in general and, in particular, with the specific descriptor we have used to calculate the standard entropy of acyclic and aromatic molecules (i.e. optimized correlation weights of linear graph invariants). We then present the results obtained in three different ways and compare them with those arising from other approximation schemes. After that, we discuss the results and state the conclusions, pointing out the relative advantages of employing this sort of descriptor. Finally, some possible future developments and extensions of this sort of analysis are highlighted.

Section snippets

Flexible topological descriptors

Most difficulties one encounters with regression analysis result from either the use of too many descriptors or the use of descriptors that are highly interrelated. These include the ‘nightmares’ of the regression equations, the ‘nightmares’ of chaotic selection of descriptors, as well as ambiguities in the criteria employed to select optimal descriptors and uncertainties when choosing the order in which descriptors are to be orthogonalized (Randic, 1991b, Randic, 1991c, Randic, 1993, Randic, 1994, Randic,

Results and discussion

Entropy is a fundamental physicochemical characteristic of substances and molecules. However, its measurement sometimes involves experimental difficulties and is not always feasible, and the standard methods have substantial restrictions (Vilkov and Pentin, 1987, Stull et al., 1969). Consequently, there is a demand for a theoretical estimation of this property. In a recent paper, Pankratov (1999) reported the semiempirical quantum chemical evaluation of thermodynamic and

Conclusions

The results presented in this paper clearly show the very good outcomes arising from the use of the CWLIMG based on the concept of ‘flexible topological descriptors’. The average deviations are rather low, and lower than those arising from other similar methods of computing entropies. Moreover, although higher order polynomial relationships give better results than the usual linear equations, it is not necessary to resort to them in order to obtain sensible

Acknowledgements

The authors are grateful for the referee's useful suggestions and comments, which helped to improve the final version of the paper.

References (40)

  • G Krenkel et al., J. Mol. Struct. Theochem. (2001)
  • A Mercader et al., Chem. Phys. Lett. (2000)
  • M Randic, Chem. Intel. Lab. Syst. (1991)
  • M Randic et al., J. Mol. Struct. (1993)
  • D Amic et al., J. Chem. Inf. Comp. Sci. (1998)
  • Basak, S.C., Harris, D.K., Magnuson, V.R., 2001. POLLY (version 2.3), Copyright of the University of...
  • B Bogdanov et al., J. Math. Chem. (1988)
  • B Bogdanov et al., J. Math. Chem. (1990)
  • E Estrada, J. Chem. Inf. Comp. Sci. (1995)
  • C Hansch et al., J. Am. Chem. Soc. (1964)
  • H Hotelling, J. Educ. Psychol. (1933)
  • A.R Katritzky et al., Chem. Soc. Rev. (1995)
  • L.B Kier et al., Molecular Connectivity in Chemistry and Drug Research (1976)
  • G Krenkel et al., Int. J. Mol. Sci. (2001)
  • E.R Malinowski, Factor Analysis in Chemistry (1991)
  • A Mercader et al., J. Mol. Model. (2001)
  • A.N Pankratov, Afinidad (1999)
  • M Randic, Int. J. Quantum Chem. Quantum Biol. Symp. (1984)
  • M Randic, New J. Chem. (1991)
  • M Randic, J. Chem. Inf. Comput. Sci. (1991)