Prediction of novel pluripotent proteins involved in reprogramming of male Germline stem cells (GSCs) into multipotent adult Germline stem cells (maGSCs) by network analysis

https://doi.org/10.1016/j.compbiolchem.2018.08.001

Highlights

  • Various proteins, such as Pou5f1 and Prdm14, are involved in germline stem cell maintenance.

  • Five identified clusters were ranked according to their score.

  • Lama3, Lamc2, and Lama1 in cluster 4 are involved in the PI3K-Akt signaling pathway, which plays an important role in pluripotency.

Abstract

Germline stem cells (GSCs) are known to transmit genetic information from parents to offspring. These GSCs can undergo reprogramming to transform themselves into pluripotent stem cells called multipotent adult germline stem cells (maGSCs). The mechanism by which GSCs are reprogrammed into maGSCs remains elusive. To investigate novel factors that may govern reprogramming, RNA-seq data for both GSCs and maGSCs were retrieved and processed with the Tuxedo protocol on the Galaxy server. In total, 1558 differentially expressed genes were identified. The protein sequences of all 1558 differentially expressed genes were retrieved in FASTA format and submitted to the Pluripred web server to predict whether each protein was pluripotent. A total of 232 proteins were predicted to be pluripotent; to identify novel proteins among them, these were submitted to the STRING database to obtain an interaction map. The interaction map was imported into Cytoscape, where apps such as MCODE and Centiscape were used to identify clusters and to compute centrality measures for the nodes of the network. Five clusters were identified and ranked according to their score. Novel pluripotent proteins such as the cadherin-related cdh5 and cdh10 were predicted. Phox2b, Nrp2, Dll1, Shh, Gbx2, Nodal, Lefty1, Wnt7b, Pitx2, fgf4, Pou5f1, Nanog, Tet1, trim8, alx2, Dppa2, Prdm14, Sox11, and Esrrb were predicted to be involved in stem cell development. Dppa2, Sox11, Sox2, Bmp4, Shh, and Otp were predicted to be involved in positive regulation of stem cell proliferation. Pathway analysis further revealed that signaling pathways such as Wnt, Jak-Stat and PI3K may play an important role in the pluripotency of maGSCs. The novel pluripotency-related proteins predicted here can be investigated experimentally in future work.

Introduction

Germline stem cells (GSCs) give rise to germ cells, which have the distinctive property of passing genetic characteristics from generation to generation (Kim and Belmonte, 2011). GSCs acquire the potential to undergo unipotent differentiation into spermatozoa in the adult male gonad (Kim and Belmonte, 2011; Liu et al., 2016). Experimental evidence indicates that the cellular state of GSCs is likely to be similar to that of embryonic stem cells (ESCs). Hence, it was suggested that GSCs can undergo spontaneous reprogramming, under different culture conditions, to acquire pluripotency by transforming themselves into pluripotent stem cells (PSCs) termed multipotent adult germline stem cells (maGSCs) (Kanatsu-Shinohara et al., 2004; Ko et al., 2009). At present, PSC researchers use stem cells of embryonic origin, called ESCs, which are derived from the inner cell mass of the blastocyst (Kim and Belmonte, 2011; Kanatsu-Shinohara et al., 2004; Seandel et al., 2007). The use of ESCs raises ethical issues and safety concerns. GSCs are a promising substitute for ESCs as a source of PSCs: because they are autologous, their use does not provoke immune rejection, and it is also free of ethical complications (Kim and Belmonte, 2011). Additional advantages are that GSCs have higher genome integrity than induced pluripotent stem cells (iPSCs) (Ko et al., 2009) and better gene expression profiles than iPSCs (Kim and Belmonte, 2011). PSCs can also be derived from other mammalian germ cells, such as primordial germ cells (PGCs) and spermatogonial stem/progenitor cells (SSCs, SPCs) from adult mice and humans (Geijsen and Leanne Jones, 2008; Kerr et al., 2006).
Experimental evidence suggests that during in vitro expansion, unipotent mouse GSCs of spermatogonial origin undergo spontaneous reprogramming and convert into maGSCs with a near-pluripotent state (Kanatsu-Shinohara et al., 2004; Seandel et al., 2007). Although it has been established that maGSCs acquire pluripotency, the underlying mechanism is still unknown. It can be postulated, however, that certain genetic and epigenetic changes are responsible (Liu et al., 2016).

It can be suggested that GSCs possess a latent ESC-like gene profile and lose their commitment to developing into spermatozoa while converting into maGSCs. During germline specification in the embryo, certain ESC-specific transcription factors are transcriptionally activated, and their expression is preserved in GSCs or SSCs, which in turn represses the function of somatic genes in the embryo (Pesce et al., 1998; Arnold et al., 2011; Zheng et al., 2009). For example, Pou5f1 (also known as Oct4), together with Sox2 and Nanog, forms the core set of transcription factors that regulates the pluripotency machinery in ESCs; this set is required for stem cell renewal and controls the expression of many differentiation genes (Boyer et al., 2005; Chew et al., 2006). It was observed that these core transcription factors regulate the pluripotency of maGSCs along with other novel pluripotency factors. Another possible reason for the pluripotency of maGSCs is the modification of histones associated with the transcription mechanism (Wang et al., 2007; Surani et al., 2007; Brownell et al., 1996).

In recent years, computational techniques such as bioinformatics, machine learning methods and network-based approaches have been used to understand the relationships within the vast amounts of genomic data available. To identify the proteins involved in maintaining pluripotency in maGSCs, we used computational methods to investigate this elusive mechanism. In this study, RNA-seq data of GSCs and maGSCs were retrieved, and it was observed that maGSCs are distinguished from GSCs by repression of certain spermatogenesis regulators, which blocks their conversion to spermatozoa and in turn activates certain somatic cell lineage genes, establishing the pluripotent nature of maGSCs. Based on these observations, we predicted certain novel transcription factors responsible for the pluripotent state of maGSCs. The up-regulated genes required for conversion of GSCs to maGSCs were identified using the Galaxy server, and the protein sequences of all up-regulated genes were submitted to Pluripred to predict which of them are pluripotent. The interconnections between the pluripotent proteins were studied in STRING. However, it was very difficult to single out the novel pluripotency factors among all the protein interactions in the entire network. Hence, we further employed Cytoscape for network analysis of our proteins. Cluster detection with MCODE and centrality measures from Centiscape refined the analysis, thereby predicting certain novel transcription factors responsible for establishing the pluripotent state in maGSCs.
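The idea of ranking nodes by centrality can be illustrated with a minimal Python sketch. It is not the Centiscape implementation and does not use the STRING network from this study: the node names (P1 to P5), the edges, and the function names are hypothetical placeholders. The sketch computes two common measures on an undirected, unweighted network: degree (number of interaction partners) and betweenness (number of shortest paths passing through a node, here via Brandes' algorithm). Nodes with high betweenness are often described as bottlenecks in network biology.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Return the number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected, unweighted graph
    (Brandes' algorithm: one breadth-first search per source node)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                                # nodes in order of discovery
        preds = {node: [] for node in graph}      # predecessors on shortest paths
        sigma = {node: 0 for node in graph}       # number of shortest paths
        dist = {node: -1 for node in graph}       # -1 marks "not yet visited"
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                   # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {node: 0.0 for node in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {node: value / 2.0 for node, value in score.items()}


# Hypothetical toy network; these are NOT interactions from the paper.
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P3", "P5")]
toy_graph = build_graph(toy_edges)
degree = degree_centrality(toy_graph)
betweenness = betweenness_centrality(toy_graph)

for node in sorted(toy_graph, key=lambda n: -betweenness[n]):
    print(node, degree[node], betweenness[node])
```

In this toy network P3 has both the highest degree and the highest betweenness, so it would be the first candidate under either measure; in real interaction networks the two rankings can differ, which is one reason to inspect several centrality measures together.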

Section snippets

RNA-seq data analysis

The RNA-seq data files of GSCs and maGSCs with SRA ID SRP070581 (Liu et al., 2016) were retrieved in FASTQ format. A total of 10 samples were identified, comprising 4 samples of GSCs and 6 samples of maGSCs; these were used to identify the differentially expressed genes and were submitted to the Galaxy server (Afgan et al., 2016). The obtained reads were subjected to quality control to calculate the read counts, means and quartiles of the particular base within the sequence and

Identification of up-regulated genes and Pluripred analysis

RNA-seq data analysis was carried out on the Galaxy web server (Afgan et al., 2016) using the Tuxedo protocol (Trapnell et al., 2012). The aligned reads are reported as a mapping percentage table in the supplementary data (Table 1). A total of 1558 up-regulated genes with a fold change greater than or equal to 2 were identified from the analysis of 4 replicates. The Pluripred web server (Margelevičius and Venclovas, 2005) was used to predict the number of pluripotent proteins. The total number of

Conclusion

In conclusion, through a combination of network biology and machine learning, this study identified 1558 differentially expressed genes between GSCs and maGSCs, among which 232 proteins were predicted to be related to pluripotency. Five clusters of pluripotent proteins were identified, with Sox2, Dnmt3, Cdh10 and Agrn as seed nodes; the fifth cluster did not have a seed node. Bmp4, Dnmt3b, Casp3, agrn and Dbx1 were the bottleneck nodes in these clusters, respectively. The cdh5, cdh10 were

Acknowledgement

Department of Biotechnology, Ministry of Science and Technology.

References (38)

  • M. Azim Surani et al.

    Genetic and epigenetic regulators of pluripotency

    Cell

    (2007)
  • Gang G. Wang et al.

    Chromatin remodeling and cancer, Part II: ATP-dependent chromatin remodeling

    Trends Mol. Med.

    (2007)
  • Liangtang Wu et al.

    Histone demethylases KDM4A and KDM4C regulate differentiation of embryonic stem cells to endothelial cells

    Stem Cell Rep.

    (2015)
  • Yukari Yamaguchi et al.

    Nanog positively regulates Zfp57 expression in mouse embryonic stem cells

    Biochem. Biophys. Res. Commun.

    (2014)
  • Masashi Yamaji et al.

    PRDM14 ensures naive pluripotency through dual regulation of signaling and epigenetic pathways in mouse embryonic stem cells

    Cell Stem Cell

    (2013)
  • Enis Afgan et al.

    The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update

    Nucleic Acids Res.

    (2016)
  • J.L. Chew et al.

    The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells

    Nat. Genet.

    (2006)
  • Peter J.A. Cock et al.

    The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants

    Nucleic Acids Res.

    (2009)
  • Niels Geijsen et al.

    Seminal discoveries in regenerative medicine: contributions of the male germ line to understanding pluripotency

    Hum. Mol. Genet.

    (2008)