Database Note
Development of KiBank, a database supporting structure-based drug design

https://doi.org/10.1016/j.compbiolchem.2004.09.003

Abstract

KiBank is a database of inhibition constant (Ki) values with 3D structures of target proteins and chemicals. Ki values were collected from peer-reviewed literature found through PubMed searches. The 3D structure files of the target proteins were obtained from the Protein Data Bank (PDB), while the 2D structure files of the chemicals were collected together with the Ki values and then converted into 3D structures. In KiBank, the chemical and protein 3D structures, including hydrogen atoms, were optimized by energy minimization and stored in MDL MOL and PDB format, respectively.

KiBank is designed to support structure-based drug design. It provides structure files of proteins and chemicals ready for use in virtual screening through automated docking methods, while the Ki values can be used to test docking/scoring combinations and program parameter settings and to calibrate empirical scoring functions. Additionally, the chemical structures and corresponding Ki values in KiBank are useful for lead optimization based on quantitative structure–activity relationship (QSAR) techniques.

KiBank is updated on a daily basis and is freely available at http://kibank.iis.u-tokyo.ac.jp/. As of August 2004, KiBank contains 8000 Ki values, over 6000 chemicals and 166 proteins covering the subtypes of receptors and enzymes.

Introduction

Over the last decade, structure-based drug design (SBDD) has become a mature discipline of medicinal chemistry (Anderson, 2003, Böhm, 1996, Klebe, 2000, Nakata, 2002). The development of computer technologies for calculating molecular properties, advances in combinatorial chemistry, and abundant data on target proteins from human genome research have opened new opportunities and feasible approaches for drug discovery (Bailey and Brown, 2001, Kirkpatrick et al., 1999).

Estimation of the binding affinity of novel chemicals to target proteins is a critical procedure in computational approaches to drug design, including SBDD. The strategies that can be applied for this purpose fall into two major categories: a target-based approach and a ligand-based approach. Recently, some researchers have combined both of these approaches in an automated unbiased procedure (Dean et al., 2004, Loew et al., 1993, Sippl, 2002b). The target-based approach can be used if the 3D structure of the binding site is known, as is the case in SBDD. In practice, in silico screening of chemical databases is widely applied to find lead candidates for target proteins. Each of the reported methods has two steps, docking and scoring (Ewing et al., 2001, Goodsell et al., 1996, Jones et al., 1997, Rarey et al., 1996). Although several scoring methods for estimating binding affinity have been documented, it is not yet clear which docking/scoring combinations provide the best accuracy. Therefore, before screening an entire chemical database, it is necessary to test docking/scoring combinations and program parameter settings by a test screening of a reduced database that includes known ligands. Experimental binding affinity values are also needed for the calibration of most scoring functions (Bissantz et al., 2000; Schneider and Böhm, 2002). On the other hand, traditional quantitative structure–activity relationship (QSAR) and modern 3D QSAR techniques are widely used in the ligand-based approach when the target structure is unknown (Akamatsu, 2002, Loew et al., 1993, Yoo et al., 2000). In the past few years, QSAR techniques have also been used in combination with structure-based methods (Lozano et al., 2000, Sippl, 2002b, Vaz et al., 1998). QSAR techniques are based on experimental structure–activity relationships, and thus require a large amount of experimental data.
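As a hedged illustration of how experimental Ki values might be used to test a docking/scoring combination, the Python sketch below converts Ki to an approximate binding free energy via ΔG ≈ RT ln Ki (Ki in mol/L, T = 298.15 K assumed) and computes a Spearman rank correlation between docking scores and those energies. The function names, the temperature, and the four data points are illustrative assumptions, not values from KiBank or from any particular docking program.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ki_to_pki(ki_nM):
    """Convert a Ki given in nanomolar to pKi = -log10(Ki in mol/L)."""
    return -math.log10(ki_nM * 1e-9)


def ki_to_delta_g(ki_nM, temperature=298.15):
    """Approximate binding free energy in kcal/mol as RT ln(Ki), Ki in mol/L."""
    return R_KCAL * temperature * math.log(ki_nM * 1e-9)


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up illustration data (NOT taken from KiBank).
ki_nM = [1.0, 10.0, 100.0, 1000.0]   # hypothetical experimental Ki values
scores = [-10.2, -9.1, -9.5, -6.0]   # hypothetical docking scores, lower = better

delta_g = [ki_to_delta_g(k) for k in ki_nM]
rho = spearman(scores, delta_g)
print(f"Spearman rho between score and RT ln Ki: {rho:.2f}")
```

With this made-up data the rank correlation is about 0.8. In a real test, the scores would come from the docking program under evaluation and the Ki values from the database, and several docking/scoring combinations could be compared on the same ligand set.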

Generally, SBDD requires 3D structures of the target proteins, and experimental data are indispensable for accurately estimating the binding affinities of newly designed chemicals. Although a database containing such data would clearly facilitate this approach to drug discovery, to our knowledge no such database has existed until now. Therefore, we developed KiBank, which provides Ki values and 3D structure files of chemicals and proteins ready for use in SBDD.

Creating KiBank

KiBank is a PostgreSQL database consisting of three knowledge areas: binding affinity data, chemical data and protein data (Aizawa et al., 2004) (Fig. 1).
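The three knowledge areas could be laid out relationally along the following lines. This is only a sketch: the table and column names are hypothetical, the rows are placeholders rather than KiBank records, and an in-memory SQLite database stands in for PostgreSQL so that the example runs without a server.

```python
import sqlite3

# Hypothetical schema: one table per knowledge area.
SCHEMA = """
CREATE TABLE protein (
    protein_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    pdb_file   TEXT              -- path to the optimized PDB-format structure
);
CREATE TABLE chemical (
    chemical_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    mol_file    TEXT             -- path to the optimized MDL MOL structure
);
CREATE TABLE binding_affinity (
    affinity_id INTEGER PRIMARY KEY,
    protein_id  INTEGER NOT NULL REFERENCES protein(protein_id),
    chemical_id INTEGER NOT NULL REFERENCES chemical(chemical_id),
    ki_nM       REAL NOT NULL,   -- inhibition constant in nanomolar
    pubmed_id   TEXT             -- literature source of the Ki value
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO protein VALUES (1, 'androgen receptor', 'placeholder.pdb')"
    )
    conn.executemany(
        "INSERT INTO chemical VALUES (?, ?, ?)",
        [(1, "ligand_A", "ligand_A.mol"), (2, "ligand_B", "ligand_B.mol")],
    )
    # Placeholder Ki values, not real measurements.
    conn.executemany(
        "INSERT INTO binding_affinity VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 1, 5.0, None), (2, 1, 2, 250.0, None)],
    )
    conn.commit()
    return conn


def ligands_for_protein(conn, protein_name):
    """Return (chemical name, MOL file, Ki in nM) rows, tightest binder first."""
    return conn.execute(
        """
        SELECT c.name, c.mol_file, b.ki_nM
        FROM binding_affinity AS b
        JOIN chemical AS c ON c.chemical_id = b.chemical_id
        JOIN protein  AS p ON p.protein_id  = b.protein_id
        WHERE p.name = ?
        ORDER BY b.ki_nM
        """,
        (protein_name,),
    ).fetchall()


conn = build_demo_db()
rows = ligands_for_protein(conn, "androgen receptor")
for name, mol_file, ki in rows:
    print(name, mol_file, ki)
```

Keeping the affinity data in its own table, keyed to both a protein and a chemical, lets a single chemical carry Ki values against several targets, which is the kind of query a docking test set would need.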

For the binding affinity data, inhibition constant (Ki) values were collected from peer-reviewed literature found via PubMed, using the name of a target protein (e.g., androgen receptor) and “Ki” as search terms. Articles published from 1985 onward, with the majority from around the year 2000, were then selected for input into KiBank through

Results and discussion

As of August 2004, KiBank contains 166 proteins covering the subtypes of receptors and enzymes, over 6000 chemicals and 8000 Ki values.

A drug is effective when it binds more specifically and tightly to the target protein against natural ligands in a competitive fashion (McIlwain, 1986). Thus consideration of chemicals’ binding competitiveness is important for the computational approaches to drug design. Experimental binding affinities are reported as the inhibition constant (Ki), relative

Acknowledgements

This research was done under the “Frontier Simulation Software for Industrial Science (FSIS)” project supported by the IT program of the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT), and was partly supported by the Toxico-proteomics project fund from the Japanese Ministry of Health, Labour and Welfare.

References (43)

  • S.E. Yoo et al. The conformation and activity relationship of benzofuran type of angiotensin II receptor antagonists. Bioorg. Med. Chem. (2000)
  • M. Aizawa et al. KiBank: a database for computer-aided drug design based on protein–chemical interaction analysis. Yakugaku Zasshi (2004)
  • M. Akamatsu. Current state and perspectives of 3D-QSAR. Curr. Top. Med. Chem. (2002)
  • Alexander, S.P.H., Mathie, A., Peters, J.A., 2001. TiPS Nomenclature Supplement, 12th ed., Elsevier Current Trends,...
  • M. Andrec et al. Complete protein structure determination using backbone residual dipolar couplings and side chain rotamer prediction. J. Struct. Funct. Genom. (2002)
  • D.A. Benson et al. GenBank. Nucl. Acids Res. (2003)
  • H.M. Berman et al. The protein data bank. Nucl. Acids Res. (2000)
  • C. Bissantz et al. Protein-based virtual screening of chemical databases. 1. Evaluation of different docking/scoring combinations. J. Med. Chem. (2000)
  • B. Boeckmann et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. (2003)
  • X. Chen et al. TTD: therapeutic target database. Nucl. Acids Res. (2002)
  • A.J. Cuticchia. Future vision of the GDB human genome database. Human Mutat. (2000)