Elsevier

Knowledge-Based Systems

Volume 191, 5 March 2020, 105261

Predicting lncRNA–miRNA interactions based on logistic matrix factorization with neighborhood regularized

https://doi.org/10.1016/j.knosys.2019.105261

Abstract

Interactions between long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) play important roles as diagnostic biomarkers and therapeutic targets for various human diseases. However, experimental methods for finding the miRNAs associated with a particular lncRNA are costly and time-consuming, and only a few theoretical approaches are available for predicting potential lncRNA–miRNA associations. In this study, we establish a novel matrix factorization model for predicting lncRNA–miRNA interactions, named lncRNA–miRNA interactions prediction by logistic matrix factorization with neighborhood regularized (LMFNRLMI). The model uses only known positive samples to mine potential associations in data that lack negative samples. It achieves reliable performance in leave-one-out cross validation (AUC of 0.9319) and 5-fold cross validation (AUC of 0.9220), a significant improvement over other models in predicting potential lncRNA–miRNA associations. Furthermore, comparisons with several other network algorithms and tests based on different kinds of similarity confirm the superiority of LMFNRLMI. We hope that LMFNRLMI can be a useful tool for identifying potential lncRNA–miRNA associations in the future.

Introduction

With the rapid development of sequencing technology, a variety of non-coding RNA (ncRNA) data sets [1], [2], [3] have gradually been collected to address genomic and biological problems. Meanwhile, long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) have received widespread attention in recent years [4], [5]. LncRNAs are a class of ncRNAs longer than 200 nucleotides, which have been confirmed to play key roles in complex cellular processes such as epigenetic regulation of gene expression [6], chromatin modification [7] and cell differentiation [8]. In contrast, miRNAs are short regulatory RNAs of approximately 20 nucleotides with no protein-coding ability. MiRNAs are also widely involved in a variety of physiological and pathological processes such as development [9], cell proliferation [10], apoptosis and tumorigenesis [11].

A number of studies have reported the effects of miRNAs on lncRNA function [12], [13]. For example, lncRNA–miRNA regulatory networks in gastric cancer [5], colon cancer [9], prostate cancer [14], and vascular disease [15] have been studied. A model for the interaction between lncRNAs and miRNAs has been established and many algorithms can be applied.

A comprehensive and in-depth understanding of lncRNA–miRNA relationships and their role in pathophysiology has facilitated biomarker discovery and the development of new therapeutic approaches. However, biological experiments alone are too limited to uncover lncRNA–miRNA associations in large amounts of data. Many computational methods have been used successfully to assist biological experiments and are widely applied in bioinformatics, for example in disease-related relationship prediction [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], drug-target interaction prediction [26] and lncRNA–protein association prediction [27], [28], [29], [30], [31], [32]. Still, only a few models can be used to predict lncRNA–miRNA associations. Huang et al. [33] were the first to propose a graph-based method for large-scale prediction of lncRNA–miRNA interactions; it relies on expression profile similarity to predict potential lncRNA and miRNA associations, but ignores sequences, which are essential features of RNA. Using experimentally validated data and sequence similarity, we propose a network-based lncRNA–miRNA interaction prediction model called logistic matrix factorization with neighborhood regularized (LMFNRLMI). Computational models can make discovery more efficient by avoiding the complex procedures and high costs of experimentation. The algorithm uses known interaction information and the similarity of neighbors, and converts the latent factor vectors of the matrix factorization into a probability score through the logistic function. The idea of k-nearest neighbors is incorporated: for each sample, the k strongest similarities are considered and integrated into the regularization terms to improve prediction accuracy.
We evaluate our model with 5-fold cross validation and leave-one-out cross validation (LOOCV), obtaining AUC values of 0.9220 and 0.9319, respectively. We also compare LMFNRLMI with six other methods, and it performs best among them. In addition, we use the expression profile similarity and functional similarity of lncRNAs and miRNAs to verify the generalization ability of the model. The AUC values based on expression profile similarity and functional similarity are 0.8591 and 0.9131, respectively. These results show that LMFNRLMI is feasible and effective for lncRNA–miRNA prediction.
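
To make the evaluation protocol concrete, the following is a minimal sketch of how an AUC value and a k-fold split could be computed. It is not the authors' evaluation code: the function names `auc` and `k_fold_splits`, the random seed, and the toy scores are assumptions introduced for illustration, and the excerpt does not say exactly how the folds were drawn.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_splits(items, k, seed=0):
    """Shuffle the items and deal them into k disjoint folds (assumed protocol)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[f::k] for f in range(k)]

# Toy example (hypothetical scores, not results from the paper).
toy_auc = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])  # 3 of 4 pairs ranked correctly
folds = k_fold_splits(range(10), 5)
```

In a 5-fold setting, each fold of known interactions would be hidden in turn, the model trained on the rest, and the hidden pairs scored against the unknown pairs. LOOCV is the limiting case in which every fold holds a single known interaction.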

Section snippets

Logistic matrix factorization

Matrix factorization algorithms were developed for recommendation systems and have recently been applied to RNA and protein relationship prediction [31]. Here, the observation matrix is decomposed into two low-dimensional matrices U and V, each with r columns, whose rows are combined through the inner product, where r is the number of hidden factors. A probabilistic approach is used: the probability that an interaction occurs is assigned according to the inner product of the latent factor vectors between lncRNA and
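
As an illustration of the idea described in this snippet, the sketch below maps the inner product of two latent factor vectors to a probability with the logistic function. It assumes the standard logistic (sigmoid) link; the function name `interaction_probability` and the example vectors are hypothetical and do not come from the paper.

```python
import math

def interaction_probability(u_i, v_j):
    """Logistic function applied to the inner product of two latent vectors.

    u_i: latent factor vector of one lncRNA (length r)
    v_j: latent factor vector of one miRNA (length r)
    """
    score = sum(a * b for a, b in zip(u_i, v_j))
    # Numerically stable sigmoid: avoid exp() of a large positive number.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)

# Hypothetical latent vectors with r = 3 hidden factors.
u_lncrna = [0.5, -0.2, 0.1]
v_mirna = [0.4, 0.3, -0.6]
p = interaction_probability(u_lncrna, v_mirna)  # inner product is about 0.08
```

A larger inner product gives a probability closer to 1, and an inner product of zero gives exactly 0.5, so the latent vectors can be fitted by maximizing the likelihood of the known interactions.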

Regularized by neighborhood

Although the LMF algorithm can exploit latent vectors to predict potential relationships globally, it does not consider strong lncRNA–lncRNA and miRNA–miRNA associations. Therefore, we propose to use the adjacency matrices of lncRNAs and miRNAs, based on K-nearest neighbors, to further improve the accuracy of predicting lncRNA–miRNA interactions. For example, for lncRNA l_i, we use N(l_i) to denote the set of nearest neighbors of l_i, which contains k_1 lncRNAs. In the same way, we
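
The neighbor set N(l_i) can be sketched as follows. This is a minimal illustration under stated assumptions: the similarity matrix is hypothetical, an item is excluded from its own neighbor set, and the adjacency entry is taken to be the similarity value for neighbors and zero otherwise, which is one common choice and is not confirmed by this excerpt.

```python
def nearest_neighbors(similarity, i, k):
    """Return the indices of the k items most similar to item i, excluding i."""
    candidates = [j for j in range(len(similarity)) if j != i]
    candidates.sort(key=lambda j: similarity[i][j], reverse=True)
    return candidates[:k]

def neighborhood_matrix(similarity, k):
    """Keep the similarity to each item's k nearest neighbors, zero elsewhere."""
    n = len(similarity)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in nearest_neighbors(similarity, i, k):
            adjacency[i][j] = similarity[i][j]
    return adjacency

# Hypothetical lncRNA-lncRNA similarity matrix for four lncRNAs.
sim = [
    [1.0, 0.8, 0.1, 0.5],
    [0.8, 1.0, 0.3, 0.2],
    [0.1, 0.3, 1.0, 0.9],
    [0.5, 0.2, 0.9, 1.0],
]
neighbors_of_0 = nearest_neighbors(sim, 0, 2)  # indices 1 and 3
A = neighborhood_matrix(sim, 2)
```

Note that the resulting matrix need not be symmetric, because item j can be among the nearest neighbors of item i without the reverse being true. The same construction would apply to miRNAs with their own neighbor count.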

Data set

For our investigation, the known lncRNA–miRNA associations were downloaded from the lncRNASNP2 database (http://bioinfo.life.hust.edu.cn/lncRNASNP#!/) [37]. lncRNASNP2 is a powerful resource that integrates multiple databases and provides comprehensive information on SNPs and mutations in lncRNAs, including experimentally validated lncRNA–miRNA pairs, with the latest records reaching 10,597 interactions. 780 different types of lncRNA and 275 different types of miRNAs are included in the records of these
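
A list of known pairs such as the one described here is typically turned into a binary observation matrix before factorization. The sketch below shows one way to do this; the function name `build_interaction_matrix` and the identifiers `lnc_A`, `mir_1`, and so on are hypothetical placeholders, not records from lncRNASNP2.

```python
def build_interaction_matrix(pairs):
    """Build a binary lncRNA x miRNA matrix from (lncRNA, miRNA) pairs.

    Duplicate pairs collapse into a single entry equal to 1.
    """
    lncrnas = sorted({l for l, _ in pairs})
    mirnas = sorted({m for _, m in pairs})
    row = {name: i for i, name in enumerate(lncrnas)}
    col = {name: j for j, name in enumerate(mirnas)}
    matrix = [[0] * len(mirnas) for _ in lncrnas]
    for l, m in pairs:
        matrix[row[l]][col[m]] = 1
    return lncrnas, mirnas, matrix

# Hypothetical records, including one duplicate pair.
pairs = [
    ("lnc_A", "mir_1"),
    ("lnc_A", "mir_2"),
    ("lnc_B", "mir_2"),
    ("lnc_A", "mir_1"),
]
lncrnas, mirnas, Y = build_interaction_matrix(pairs)
```

Entries equal to 1 mark known interactions. Entries equal to 0 are unobserved pairs rather than confirmed negatives, which is why a model trained only on positive samples is needed.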

Conclusion

Since the mutual regulation of lncRNAs and miRNAs in biological mechanisms deeply affects human diseases, and experimental verification is costly and time-consuming, establishing a lncRNA–miRNA association prediction model is an important direction in current research. So far, information about lncRNA and miRNA interactions is still limited, but the number of experimental validations is gradually increasing. Relying on reliable experimental data, we build a model called

Acknowledgments

This work was supported by the Doctor Startup Foundation from Liaoning province, China under Grant No. 20170520217, Important Scientific and Technical Achievements Transformation Project, China under Grant No. Z17-5-078, Large-scale Equipment Shared Services Project, China under Grant No. F15165400 and Applied Basic Research Project, China under Grant No. F16205151.

References (41)

  • Res C. Correction: Long noncoding RNA HOTAIR regulates polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal cancers. Cancer Res. (2011)
  • Wang P. et al. The STAT3-binding long noncoding RNA lnc-DC controls human dendritic cell differentiation. Science (2014)
  • Ballantyne M.D. et al. LncRNA/microRNA interactions in the vasculature. Clin. Pharmacol. Ther. (2016)
  • Liu B. et al. Identification of real microRNA precursors with a pseudo structure status composition approach. PLoS One (2015)
  • Liu B. et al. iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. J. Biomol. Struct. Dyn. (2016)
  • Wang W. et al. Identification of miRNA, lncRNA and mRNA-associated ceRNA networks and potential biomarker for MELAS with mitochondrial DNA A3243G mutation. Sci. Rep. (2017)
  • He J.H. et al. Analyzing the lncRNA, miRNA, and mRNA regulatory network in prostate cancer with bioinformatics software. J. Comput. Biol. (2017)
  • Zhang C. MicroRNA-145 in vascular smooth muscle cell biology: A new therapeutic target for vascular disease. Cell Cycle (2009)
  • Chen X. et al. EGBMMDA: Extreme gradient boosting machine for miRNA-disease association prediction. Cell Death Dis. (2018)
  • Chen X. et al. MicroRNAs and complex diseases: from experimental results to computational models (2019)

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.105261.
