Abstract:
Antimicrobial peptides (AMPs) are innate immune molecules that are active against a range of microbes. AMPs are generally classified into several categories according to their specific functions. Over the last decade, a number of AMP prediction tools have been designed and made freely available online, and they show potential to discriminate AMPs from non-AMPs. However, the relative quality of the predictions produced by these tools is difficult to quantify. A comprehensive benchmark dataset for training the prediction model is one of the key requirements for solving this problem. In addition, handling the multi-label character of new synthetic instances is important to both basic research and drug development. AMP prediction should therefore be treated as a two-level multi-label classification task, in which the first step identifies whether a query peptide is an AMP and the second step identifies which functional type(s) the peptide belongs to. To establish a practically useful prediction method, we construct a valid benchmark dataset to train the predictor and develop an algorithm to perform the prediction. In this paper, we propose a novel two-layer prediction model for identifying AMPs and their functional types, using the ADASYN oversampling technique to address the imbalanced multi-label classification problem. First, we construct a novel benchmark AMP dataset with seven different AMP functional types. Then, we encode AMPs with three different feature representations and use various feature extraction models to convert the variable-length coding matrices into features of equal dimension. Furthermore, we use a model-based feature selection method to filter effective and sparse features. Finally, we apply an ensemble classifier chain model to identify whether a query peptide is an AMP or a non-AMP. 
In the second layer prediction, we use ADASYN to oversample different functional types of AMPs, and build a multi-label multi-class predict...
Date of Conference: 16-19 December 2020
Date Added to IEEE Xplore: 13 January 2021