Title: General Protocol for the Accurate Prediction of Molecular 13C/1H NMR Chemical Shifts via Machine Learning Augmented DFT

Journal Article · Journal of Chemical Information and Modeling
[1]; [2]; [3]; [4]; [2]
  1. Univ. of Wollongong, NSW (Australia)
  2. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
  3. Nankai Univ., Tianjin (China). State Key Lab. of Elemento-Organic Chemistry
  4. Guangdong Lab., Guangzhou (China). Center of Chemistry and Chemical Biology

Accurate prediction of NMR chemical shifts at affordable computational cost is important for structural assignments in experimental studies. Density functional theory (DFT) and the gauge-including atomic orbital (GIAO) method are two of the most popular computational approaches for NMR calculations, yet they often fail to resolve ambiguities in structural assignments. In this work, we present a new method (DFT + ML) that combines DFT with machine learning (ML) techniques and significantly increases the accuracy of 13C/1H NMR chemical shift prediction for a variety of organic molecules. The input of the generalizable DFT + ML model contains two critical parts: a vector describing the chemical environment, which can be evaluated without knowing the exact geometry of the molecule, and the DFT-calculated isotropic shielding constant. The DFT + ML model was trained with a data set containing 476 13C and 270 1H experimental chemical shifts. For the DFT methods used here, the root mean square deviations (RMSDs) between predicted and experimental 13C/1H chemical shifts can be as small as 2.10/0.18 ppm, much lower than those from simple DFT (5.54/0.25 ppm) or DFT + linear regression (LR) (4.77/0.23 ppm) approaches. The model also has a smaller maximum absolute error than two previously proposed NMR-predicting ML models. The robustness of the DFT + ML model was tested on two classes of organic molecules (TIC10 and hyacinthacines), where the correct isomers were unambiguously assigned to the experimental ones. Overall, the DFT + ML model shows promise for structural assignments in a variety of systems, including stereoisomers, that are often challenging to determine experimentally.
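
The abstract compares approaches by RMSD and uses DFT + linear regression (LR) as one baseline. The sketch below illustrates only those two ideas: an RMSD function, and a least-squares line mapping computed isotropic shieldings to chemical shifts. It is not the authors' code. The function names, the reference shielding of 186.0 ppm, and all numeric values are hypothetical and chosen purely for illustration. The DFT + ML model itself, which also takes a chemical-environment vector as input, is not reproduced here.

```python
import math


def rmsd(predicted, experimental):
    """Root mean square deviation between two equal-length lists (ppm)."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def fit_linear_scaling(shieldings, shifts):
    """Least-squares fit of: shift = slope * shielding + intercept."""
    n = len(shieldings)
    mean_x = sum(shieldings) / n
    mean_y = sum(shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in shieldings)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(shieldings, shifts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical numbers for illustration only (not data from the paper).
shieldings = [160.0, 120.0, 60.0, 20.0]    # computed isotropic shieldings, ppm
experimental = [24.0, 66.0, 124.0, 166.0]  # made-up "experimental" shifts, ppm
reference_shielding = 186.0                # assumed reference value, ppm

# "Simple DFT" baseline: shift = reference shielding - computed shielding.
raw_shifts = [reference_shielding - s for s in shieldings]

# "DFT + LR" baseline: rescale shieldings with a fitted line.
slope, intercept = fit_linear_scaling(shieldings, experimental)
lr_shifts = [slope * s + intercept for s in shieldings]

print("RMSD, simple DFT:", rmsd(raw_shifts, experimental))
print("RMSD, DFT + LR:  ", rmsd(lr_shifts, experimental))
```

Because the line is fitted to the same points it is scored on, its RMSD here cannot exceed that of the fixed-reference baseline. A real comparison would score predictions on molecules held out from fitting.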

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES); National Natural Science Foundation of China (NSFC); Fundamental Research Funds of the Central Universities; Natural Science Foundation of Tianjin City
Grant/Contract Number:
AC05-76RL01830; 72353; 21890722; 21702109; 11811530637; 18JCYBJC2140; 63191515; 63196021; 631915230
OSTI ID:
1668320
Report Number(s):
PNNL-SA-143566
Journal Information:
Journal of Chemical Information and Modeling, Vol. 60, Issue 8; ISSN 1549-9596
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
