In silico methods and tools for drug discovery
Introduction
Conventional drug discovery and development is a risky, time-consuming process that includes target identification and validation, lead compound discovery and optimization, and preclinical and clinical trials [1]. In recent years, the estimated cost of bringing a new drug to market has reached about US$1.8 billion [2], and the attrition rate of drug candidates is as high as 96% [2]. The main reasons for this high attrition rate are poor drug efficacy, deficient absorption, distribution, metabolism, and excretion, and toxicity (collectively, ADME-Tox) [3]. Typically, in vivo and in vitro techniques are employed to examine drug safety, including adverse effects and toxicity. Recent advancements in in vitro models, such as organ-on-chip technology, have accelerated ADME-Tox assessments [4]. However, these approaches remain time-consuming, labor-intensive, and costly. High-throughput screening (HTS) methods have been developed to accelerate the identification of pharmacologically active chemical compounds from large numbers of molecules using automated assays [5]. Although automated HTS systems reduce the need for human intervention, the number of compounds that HTS can cover remains small compared with the diversity of possible chemical structures. In addition, automated instruments remain expensive.
Recently, computer-aided drug discovery (CADD) approaches have attracted increasing attention because they can help mitigate the scale, time, and cost issues faced by conventional experimental approaches. CADD includes computational identification of potential drug targets, virtual screening of large chemical libraries for effective drug candidates, further optimization of candidate compounds, and in silico assessment of their potential toxicity. After these steps are carried out computationally, candidate compounds are subjected to in vitro/in vivo experiments for confirmation. Thus, CADD approaches can reduce the number of chemical compounds that must be evaluated experimentally while increasing the success rate by removing ineffective and toxic chemical compounds from consideration [6]. To date, CADD has been successfully employed to bring new drug compounds to market for diverse diseases, including human immunodeficiency virus (HIV)-1-inhibiting drugs (atazanavir [7], saquinavir [8], indinavir [9], and ritonavir [10]), anti-cancer drugs (raltitrexed [11]), and antibiotics (norfloxacin [12]).
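The triage idea in the preceding paragraph, removing compounds predicted to be ineffective or toxic before any experiment is run, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the 0.7 activity cutoff, and the compound names are hypothetical, and the scores stand in for the output of whatever prediction tools are actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                  # hypothetical compound identifier
    predicted_activity: float  # assumed score in [0, 1] from some screening model
    predicted_toxic: bool      # assumed flag from some in silico toxicity model


def triage(library, activity_cutoff=0.7):
    """Keep compounds predicted active and non-toxic, best score first."""
    kept = [
        c for c in library
        if c.predicted_activity >= activity_cutoff and not c.predicted_toxic
    ]
    return sorted(kept, key=lambda c: c.predicted_activity, reverse=True)


# Toy library with made-up predictions (not real data).
library = [
    Candidate("cmpd-A", 0.91, False),
    Candidate("cmpd-B", 0.85, True),   # dropped: predicted toxic
    Candidate("cmpd-C", 0.40, False),  # dropped: below activity cutoff
    Candidate("cmpd-D", 0.78, False),
]

# Only the shortlist would proceed to in vitro/in vivo confirmation.
shortlist = triage(library)
```

In this toy run, two of the four compounds survive, so only half of the library would need experimental evaluation.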
Several CADD approaches have been developed and integrated with machine learning techniques to improve the accuracy and efficiency of CADD methods [13]. Structure-based drug discovery (SBDD) [14] and ligand-based drug discovery (LBDD) [15] are the two main approaches taken in CADD. The choice between them depends on the availability of structural information on the target protein. The SBDD approach requires the structure of the target protein, which is usually obtained experimentally by nuclear magnetic resonance or X-ray crystallography [14]. When no experimental structure is available, in silico prediction methods such as homology modeling [16] or ab initio modeling [17] can be used to predict the 3D structure of the target protein. Once the structure is available, structure-based virtual screening and molecular docking are possible [18]. When the structure is not available and a high-quality structure cannot be predicted using in silico methods, the LBDD approach is often taken as an alternative. Although this approach requires prior information on known active molecules for the target protein, such compounds have already been discovered for many targets and are compiled in public databases, unless the target is novel [19], [20], [21]. These approaches are introduced in Section 4.
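The selection logic described above can be summarized as a small decision function. This is a sketch under stated assumptions rather than a procedure from the review: the function name, the three boolean inputs, and the `NONE` fallback for a novel target without a usable structure or known actives are all introduced here for illustration.

```python
from enum import Enum


class Approach(Enum):
    SBDD = "structure-based drug discovery"
    LBDD = "ligand-based drug discovery"
    NONE = "neither approach directly applicable"  # assumed fallback label


def choose_approach(has_experimental_structure,
                    has_reliable_predicted_structure,
                    has_known_actives):
    """Pick a CADD approach from what is known about the target."""
    # An experimental (NMR or X-ray) structure, or a high-quality model from
    # homology or ab initio modeling, enables docking and structure-based
    # virtual screening.
    if has_experimental_structure or has_reliable_predicted_structure:
        return Approach.SBDD
    # Without a usable structure, fall back on known active molecules.
    if has_known_actives:
        return Approach.LBDD
    # Novel target with no structure and no known actives (assumption).
    return Approach.NONE
```

For example, `choose_approach(False, False, True)` selects the ligand-based route, matching the case where no high-quality structure can be obtained but active molecules are already compiled in public databases.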
The field of CADD is advancing rapidly, and its techniques and methods are under active development. Over the past few years, the integration of biological big data and machine learning approaches has opened new possibilities to increase the accuracy and efficiency of in silico drug discovery. This review introduces the overall procedures and methodologies behind in silico drug discovery, including target protein identification, chemical library screening, and toxicity assessment using machine learning approaches; summarizes available prediction tools and databases; and lists Food and Drug Administration (FDA)-approved and reported drug compounds developed using CADD techniques.
Increase in biological data on chemical molecules for drug discovery
Over the past few decades, large-scale data have been generated on hundreds of thousands of small molecules through biological screening, and these data are compiled in online repositories that are available for research. For example, owing to advancements in HTS techniques, large-scale experiments covering more than 1 million chemicals have been performed [22]. In addition, these biological assay data have been compiled in chemical library databases, and the amount of data is increasing rapidly due to advancements
Target identification
A drug target is defined as a biological entity, usually a protein, that can modulate disease phenotypes [29]. Thus, the identification of prime drug targets is the first and most important step in drug discovery. Conventional drug target identification strategies are performed experimentally, such as identifying differentially expressed genes between normal and diseased cells or tissues and proteins that are highly interconnected with disease-related proteins.
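As a rough illustration of the differential-expression strategy mentioned above, the sketch below ranks genes by the log2 fold change between diseased and normal samples. It is a simplification under stated assumptions: the gene names and expression values are made up, the pseudocount of 1 and the cutoff of 1.0 are arbitrary choices, and a real analysis would also apply statistical testing across replicates.

```python
import math


def log2_fold_change(normal, diseased, pseudocount=1.0):
    """log2 ratio of mean diseased to mean normal expression."""
    mean_normal = sum(normal) / len(normal)
    mean_diseased = sum(diseased) / len(diseased)
    # The pseudocount avoids division by zero for unexpressed genes.
    return math.log2((mean_diseased + pseudocount) / (mean_normal + pseudocount))


def rank_genes(expression, min_abs_lfc=1.0):
    """Return (gene, log2 fold change) pairs passing the cutoff, largest change first."""
    scored = [
        (gene, log2_fold_change(normal, diseased))
        for gene, (normal, diseased) in expression.items()
    ]
    hits = [(gene, lfc) for gene, lfc in scored if abs(lfc) >= min_abs_lfc]
    return sorted(hits, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical expression values: gene -> (normal samples, diseased samples).
expression = {
    "GENE1": ([7.0, 7.0], [31.0, 31.0]),    # up-regulated in disease
    "GENE2": ([15.0, 15.0], [15.0, 15.0]),  # unchanged
    "GENE3": ([63.0, 63.0], [7.0, 7.0]),    # down-regulated in disease
}

ranked = rank_genes(expression)
```

Genes with the largest absolute change come first, giving a crude shortlist of candidates for follow-up as potential targets.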
In silico methods for drug screening
The goal of drug discovery is to find small molecules that can modulate the function of an identified target protein and thereby modulate the disease phenotype. Furthermore, it is necessary to identify small molecules that possess effective pharmacokinetic properties and low toxicity. Drug discovery involves a long, expensive, and risky cascade of complicated steps, including drug candidate identification, candidate validation, pharmacokinetics, and preclinical toxicity assessments. Traditional
ADME-Tox assessment
Once drug candidates are discovered, the next step is to assess their pharmacokinetic properties, such as ADME-Tox. Due to advances in machine learning algorithms and accumulated datasets, ADME-Tox can also be predicted using computational methods.
It is estimated that 40%–60% of drug candidates are withdrawn in preclinical tests because of ADME-Tox concerns [85]. Drug compounds must cross various physiological barriers, such as the gastrointestinal barrier, the blood-brain barrier, and
Successful applications of in silico drug design
The development of new therapeutic drugs is an expensive and time-consuming process. In silico technology has become essential in the contemporary pharmaceutical industry because it can reduce the time and resources required for drug discovery. Due to advancements in computational algorithms and accumulated knowledge databases, computational prediction tools have now been integrated into every stage of the drug discovery process. Computational drug discovery methods have been successfully used
Conclusions
Over the past few decades, the in silico identification of disease-associated drug targets and therapeutic drugs has become increasingly efficient and accurate. Recently, in silico drug discovery has accelerated due to rapid advancements in computational methods and accumulating publicly available biological data. Chemical biology is involved in the elucidation of the biological functions of targets, while CADD techniques make use of structural information of either the drug target
Author contributions
BS, SA, and DN: conceptualization. BS and JL: data curation. BS and CJ: methodology. DN: supervision. BS and DN: manuscript writing. All authors have read and agreed to the published version of the manuscript.
Declaration of competing interests
There are no conflicts of interest to declare.
Acknowledgments
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2019M3E5D4065682). This research was also supported by the Center for Women in Science, Engineering, and Technology grant funded by the Ministry of Science and ICT (MSIT) under the Program for Returners into R&D.
References (267)
- et al. New technologies in computer-aided drug design: toward target identification and new chemical entity discovery. Drug Discov. Today Technol. (2006)
- et al. A low-cost, high-quality new drug discovery process using patient-derived induced pluripotent stem cells. Drug Discov. Today (2015)
- et al. From 3D cell culture to organs-on-chips. Trends Cell Biol. (2011)
- et al. A review of high throughput technology for the screening of natural products. Biomed. Pharmacother. (2008)
- et al. Addressing toxicity risk when designing and selecting compounds in early drug discovery. Drug Discov. Today (2014)
- et al. Crystal structure at 1.9-Å resolution of human immunodeficiency virus (HIV) II protease complexed with L-735,524, an orally bioavailable inhibitor of the HIV proteases. J. Biol. Chem. (1994)
- The process of structure-based drug design. Chem. Biol. (2003)
- et al. Binding of the anticancer drug ZD1694 to E. coli thymidylate synthase: assessing specificity and affinity. Structure (1996)
- et al. Homology modeling in drug discovery: current trends and applications. Drug Discov. Today (2009)
- et al. Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol. (2007)
- New active leads for tuberculosis booster drugs by structure-based drug discovery. Org. Biomol. Chem.
- A cell-based ultra-high-throughput screening assay for identifying inhibitors of D-amino acid oxidase. J. Biomol. Screen.
- Computational approaches in target identification and drug discovery. Comput. Struct. Biotechnol. J.
- Target deconvolution from phenotype-based drug discovery by using chemical proteomics approaches. Biochim. Biophys. Acta Proteins Proteom.
- Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteom.
- Recent advances and method development for drug target identification. Trends Pharmacol. Sci.
- In silico target fishing: predicting biological targets from chemical structure. Drug Discov. Today Technol.
- Machine-learning approaches in drug discovery: methods and applications. Drug Discov. Today
- RNAi and siRNA in target validation. Drug Discov. Today
- Managing the drug discovery/development interface. Drug Discov. Today
- Role of the development scientist in compound lead selection and optimization. J. Pharm. Sci.
- Aplicación de métodos computacionales para el descubrimiento, diseño y optimización de fármacos contra el cáncer. Bol. Méd. Hosp. Infant. Méx.
- How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nat. Rev. Drug Discov.
- BMS-232632, a highly potent human immunodeficiency virus protease inhibitor that can be used in combination with other available antiretroviral agents. Antimicrob. Agents Chemother.
- Novel binding mode of highly potent HIV-proteinase inhibitors incorporating the (R)-hydroxyethylamine isostere. J. Med. Chem.
- ABT-538 is a potent inhibitor of human immunodeficiency virus protease and has high oral bioavailability in humans. Proc. Nat. Acad. Sci.
- Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov.
- Structure-based Drug Discovery
- Ligand-based Approaches to In Silico Pharmacology, Chemoinformatics and Computational Chemical Biology
- Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug Discov.
- Structure-based drug discovery using GPCR homology modeling: successful virtual screening for antagonists of the alpha1A adrenergic receptor. J. Med. Chem.
- Successful applications of computer aided drug discovery: moving drugs from concept to the clinic. Curr. Topics Med. Chem.
- QSAR in Drug Discovery, Drug Design: Structure- and Ligand-Based Approaches
- A multiple kernel learning algorithm for drug-target interaction prediction. BMC Bioinf.
- Developing enhanced blood–brain barrier permeability models: integrating external bio-assay data in QSAR modeling. Pharm. Res.
- vNN web server for ADMET predictions. Front. Pharmacol.
- Locally weighted learning methods for predicting dose-dependent toxicity with application to the human maximum recommended daily dose. Chem. Res. Toxicol.
- Machine Learning in Drug Discovery
- Genomic profiling of drug sensitivities via induced haploinsufficiency. Nat. Genet.
- Identifying the proteins to which small-molecule probes and drugs bind in cells. Proc. Nat. Acad. Sci.
- Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Brief. Bioinform.
- Computational-experimental approach to drug-target interaction mapping: a case study on kinase inhibitors. PLoS Comput. Biol.
- Text mining for drug discovery. Methods Mol. Biol.
- Computational/in silico methods in drug target and lead prediction. Brief. Bioinform.
- The Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (2016)
- Open Targets Platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Res.
- In Silico Target Prediction for Small Molecules, Systems Chemical Biology
- Ligand–protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins
- Recovering the true targets of specific ligands by virtual screening of the protein data bank. Proteins
- TarFisDock: a web server for identifying drug targets with docking approach. Nucleic Acids Res.