research-article

ToxinMI: improving peptide toxicity prediction by fusing multimodal information based on mutual information

Authors:
Lesong Wei, University of Tsukuba, Japan
Xiucai Ye, University of Tsukuba, Japan
Tetsuya Sakurai, University of Tsukuba, Japan

RACS '22: Proceedings of the Conference on Research in Adaptive and Convergent Systems, October 2022, Pages 77–82, https://doi.org/10.1145/3538641.3561492

Published: 20 October 2022

ABSTRACT

Accurately identifying peptide toxicity is a crucial step in computer-aided peptide-based drug screening, which could accelerate novel drug discovery and reduce resource consumption. Recently, deep learning has shown promising performance in bioinformatics. However, one challenge in developing a deep learning-based model for peptide toxicity prediction is how to represent peptides effectively. In this study, we propose ToxinMI, an end-to-end deep learning model that predicts peptide toxicity by learning features directly from the sequence alone. Specifically, ToxinMI captures the sequential and evolutionary features of a peptide simultaneously and applies the mutual information principle to learn a discriminative representation, discarding noisy information and retaining as much task-relevant information from these features as possible. The experimental results demonstrate that ToxinMI achieves superior predictive performance compared with state-of-the-art baselines.
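
As a rough illustration of the mutual information principle the abstract refers to, the sketch below estimates the empirical mutual information I(X;Y) between a discrete feature and a binary toxicity label. It is not the authors' implementation, which the abstract does not describe. The function name `mutual_information` and the toy arrays `labels`, `informative`, and `noisy` are hypothetical and exist only for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # joint counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)), written with raw counts
        ratio = count * n / (px[x] * py[y])
        mi += p_xy * math.log2(ratio)
    return mi


# Hypothetical toy data: 1 = toxic, 0 = non-toxic.
labels = [1, 1, 0, 0, 1, 1, 0, 0]
informative = [1, 1, 0, 0, 1, 1, 0, 0]  # feature that tracks the label exactly
noisy = [0, 1, 0, 1, 0, 1, 0, 1]        # feature statistically independent of the label

mi_informative = mutual_information(informative, labels)
mi_noisy = mutual_information(noisy, labels)
print(mi_informative, mi_noisy)
```

In this toy setting the informative feature shares one full bit with the label, while the noisy feature shares none. A representation learned under a mutual information objective is meant to keep the first kind of signal and discard the second.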

Index Terms

1. Applied computing
  1. Life and medical sciences
    1. Bioinformatics
2. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published in
RACS '22: Proceedings of the Conference on Research in Adaptive and Convergent Systems
October 2022, 208 pages
ISBN: 9781450393980
DOI: 10.1145/3538641
Conference Chair: Peng Li, The University of Aizu, Japan
Program Chairs: Junyoung Heo, Hansung University, Korea; Tomas Cerny, Baylor University
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
multi-modal information
mutual information
peptide toxicity prediction
Acceptance Rates
Overall Acceptance Rate: 393 of 1,581 submissions, 25%