Predicting lncRNA–miRNA interactions based on logistic matrix factorization with neighborhood regularized☆
Introduction
With the rapid development of sequencing technology, a variety of non-coding RNA (ncRNA) data sets [1], [2], [3] have gradually been collected to address genomic and biological problems. Meanwhile, long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) have received widespread attention in recent years [4], [5]. LncRNAs are a class of ncRNAs longer than 200 nucleotides that have been confirmed to play key roles in complex cellular processes such as epigenetic regulation of gene expression [6], chromatin modification [7] and cell differentiation [8]. In contrast, miRNAs are short regulatory RNAs of approximately 20 nucleotides with no protein-coding ability. MiRNAs are also widely involved in a variety of physiological and pathological processes such as development [9], cell proliferation [10], apoptosis and tumorigenesis [11].
A number of studies have reported the effects of miRNAs on lncRNA function [12], [13]. For example, lncRNA–miRNA regulatory networks in gastric cancer [5], colon cancer [9], prostate cancer [14], and vascular disease [15] have been studied. A model for the interaction between lncRNAs and miRNAs has been established and many algorithms can be applied.
A comprehensive, in-depth understanding of lncRNA–miRNA relationships and their role in pathophysiology facilitates biomarker discovery and the development of therapeutic approaches. However, biological experiments alone are too limited to uncover lncRNA–miRNA associations in large amounts of data. Computational models can make discovery more efficient than the complex and costly experimental process. In fact, a number of computational methods have successfully assisted biological experiments and are widely used in bioinformatics, for example in disease-related relationship prediction [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], drug-target interaction prediction [26] and lncRNA–protein association prediction [27], [28], [29], [30], [31], [32]. Still, only a few models can predict lncRNA–miRNA associations. Recently, Huang et al. [33] proposed the first graph-based method for large-scale prediction of lncRNA–miRNA interactions. It predicts potential lncRNA–miRNA associations from expression profile similarity and ignores sequences, which are essential features of RNA. Using experimentally validated data and sequence similarity, we propose a network-based lncRNA–miRNA interaction prediction model called logistic matrix factorization with neighborhood regularized (LMFNRLMI). The algorithm uses the known interaction information and the similarity of neighbors to learn latent factor vectors by matrix factorization, and converts them into a probability score through the logistic function. It also incorporates the idea of k-nearest neighbors: for each sample, the k strongest similarities are integrated into the regularization terms to improve prediction accuracy.
We evaluate our model with 5-fold cross validation and leave-one-out cross validation (LOOCV), obtaining AUC values of 0.9220 and 0.9319, respectively. We also compare LMFNRLMI with six other methods, and it performs best among them. In addition, we use the expression profile similarity and functional similarity of lncRNAs and miRNAs to verify the generalization ability of the model. The AUC values based on expression profile similarity and functional similarity are 0.8591 and 0.9131, respectively. These results show that LMFNRLMI is feasible and effective for lncRNA–miRNA interaction prediction.
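The evaluation protocol mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `k_fold_indices`, the seed, and the toy labels and scores are assumptions made for illustration. AUC is computed as the fraction of positive/negative pairs in which the positive sample receives the higher score, and LOOCV corresponds to setting the number of folds equal to the number of items.

```python
import random

def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and split them into k nearly equal folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Toy data (hypothetical): 10 known pairs split into 5 folds.
folds = k_fold_indices(10, 5)

# Toy scores (hypothetical): three of the four positive/negative pairs are ranked correctly.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In each round, one fold would be hidden as the test set while the model is fitted on the remaining folds; how the hidden pairs are ranked against unknown pairs is a detail this sketch does not cover.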
Section snippets
Logistic matrix factorization
Matrix factorization algorithms were developed for recommendation systems and have recently been applied to RNA and protein relationship prediction [31]. Here, the observation matrix is decomposed into two low-dimensional matrices whose inner product is then computed; their shared dimension is the number of hidden factors. A probabilistic approach is used, that is, the probability that an interaction occurs is assigned according to the inner product of the latent factor vectors between lncRNA and
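As a minimal sketch of the idea in this snippet, the code below maps the inner product of two latent factor vectors to a probability with the logistic function. The names `U`, `V`, `sigmoid` and `probability_matrix` and the toy values are assumptions, since the original symbols are not recoverable from the snippet; any bias terms, weighting, or fitting procedure the paper may use are not shown.

```python
import math

def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def interaction_probability(u, v):
    """Probability score from the inner product of two latent factor vectors."""
    return sigmoid(sum(a * b for a, b in zip(u, v)))

def probability_matrix(U, V):
    """Score every (lncRNA, miRNA) pair; rows follow U, columns follow V."""
    return [[interaction_probability(u, v) for v in V] for u in U]

# Toy latent factors (hypothetical): 2 lncRNAs, 2 miRNAs, 2 hidden factors.
U = [[0.0, 0.0], [1.0, 2.0]]
V = [[1.0, 1.0], [0.5, -0.25]]
P = probability_matrix(U, V)
```

An all-zero latent vector yields a score of 0.5 for every partner, which is why informative factors have to be learned from the known interactions.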
Regularized by neighborhood
Although the LMF algorithm can exploit latent vectors to predict potential relationships globally, strong lncRNA–lncRNA and miRNA–miRNA associations are not considered. Therefore, we propose to use adjacency matrices of lncRNAs and miRNAs, built with K-nearest neighbors, to further improve the accuracy of predicting lncRNA–miRNA interactions. For example, for a given lncRNA, we define its nearest-neighbor set, which contains the lncRNAs most similar to it. In the same way, we
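The neighbor selection described here can be sketched as follows, under stated assumptions: `S` is a hypothetical square similarity matrix, `k` is the neighborhood size, and the function names are illustrative. Each row keeps only the similarities to its k most similar other items. How the resulting adjacency matrix enters the regularization terms is not recoverable from this snippet and is not shown.

```python
def knn_neighbors(S, k):
    """For each item i, return the indices of its k most similar other items."""
    neighbors = []
    for i, row in enumerate(S):
        others = [j for j in range(len(row)) if j != i]  # exclude the item itself
        others.sort(key=lambda j: row[j], reverse=True)  # most similar first
        neighbors.append(others[:k])
    return neighbors

def knn_adjacency(S, k):
    """Keep S[i][j] only when j is among the k nearest neighbors of i."""
    A = [[0.0] * len(S) for _ in S]
    for i, nbrs in enumerate(knn_neighbors(S, k)):
        for j in nbrs:
            A[i][j] = S[i][j]
    return A

# Toy similarity matrix (hypothetical) for four lncRNAs.
S = [
    [1.0, 0.9, 0.1, 0.4],
    [0.9, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.8],
    [0.4, 0.3, 0.8, 1.0],
]
A = knn_adjacency(S, k=2)
```

Note that the adjacency matrix built this way need not be symmetric: item j can be among the nearest neighbors of item i without the reverse being true.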
Data set
For our investigation, the known lncRNA–miRNA associations are downloaded from the lncRNASNP2 database (http://bioinfo.life.hust.edu.cn/lncRNASNP#!/) [37]. lncRNASNP2 is a powerful integrated database that provides comprehensive information on SNPs and mutations in lncRNAs, including experimentally validated lncRNA–miRNA pairs, with the latest records reaching 10,597 interactions. 780 different types of lncRNA and 275 different types of miRNAs are included in the records of these
Conclusion
Because the mutual regulation of lncRNAs and miRNAs deeply affects human diseases, and experimental verification is costly and time-consuming, establishing a lncRNA–miRNA association prediction model is an important direction in current research. So far, information about lncRNA and miRNA interactions is still limited, but the number of experimental validations is gradually increasing. Relying on reliable experimental data, we build a model called
Acknowledgments
This work was supported by the Doctor Startup Foundation from Liaoning province, China under Grant No. 20170520217, Important Scientific and Technical Achievements Transformation Project, China under Grant No. Z17-5-078, Large-scale Equipment Shared Services Project, China under Grant No. F15165400 and Applied Basic Research Project, China under Grant No. F16205151.
References (41)
- et al., Functional interactions among microRNAs and long noncoding RNAs, Semin. Cell Dev. Biol. (2014)
- et al., IRWNRLPI: Integrating random walk and neighborhood regularized logistic matrix factorization for lncRNA-protein interaction prediction, Front. Genet. (2018)
- et al., The bipartite network projection-recommended algorithm for predicting long non-coding RNA-protein interactions, Mol. Ther. Nucleic Acids (2018)
- et al., HLPI-ensemble: Prediction of human lncRNA-protein interactions based on ensemble strategy, RNA Biol. (2018)
- et al., LncRNADisease: a database for long-non-coding RNA-associated diseases, Nucleic Acids Res. (2013)
- et al., LNCipedia: a database for annotated human lncRNA transcript sequences and structures, Nucleic Acids Res. (2013)
- et al., LncRNASNP: a database of SNPs in lncRNAs and their potential functions in human and mouse, Nucleic Acids Res. (2015)
- et al., The interaction between MiR-141 and lncRNA-H19 in regulating cell proliferation and migration in gastric cancer, Cell. Physiol. Biochem. (2015)
- et al., Bioinformatics method to predict two regulation mechanism: TF-miRNA-mRNA and lncRNA-miRNA-mRNA in pancreatic cancer, Cell Biochem. Biophys. (2014)
- Epigenetic regulation of gene expression, J. Braz. Chem. Soc. (2012)
- Correction: Long noncoding RNA HOTAIR regulates polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal cancers, Cancer Res.
- The STAT3-binding long noncoding RNA lnc-DC controls human dendritic cell differentiation, Science
- LncRNA/microRNA interactions in the vasculature, Clin. Pharmacol. Ther.
- Identification of real microRNA precursors with a pseudo structure status composition approach, PLoS One
- IMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach, J. Biomol. Struct. Dyn.
- Identification of miRNA lncRNA and mRNA-associated ceRNA networks and potential biomarker for MELAS with mitochondrial DNA A3243G mutation, Sci. Rep.
- Analyzing the lncRNA, miRNA, and mRNA regulatory network in prostate cancer with bioinformatics software, J. Comput. Biol.
- MicroRNA-145 in vascular smooth muscle cell biology: A new therapeutic target for vascular disease, Cell Cycle
- EGBMMDA: Extreme gradient boosting machine for MiRNA-disease association prediction, Cell Death Dis.
- MicroRNAs and complex diseases: from experimental results to computational models
- ☆
No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.105261.