Abstract
Reference genes (RGs) are constitutive genes required for the maintenance of basic cellular functions. Several high-throughput technologies are used to identify such genes, including RNA sequencing (RNA-seq), which measures gene expression levels in a specific tissue or an isolated cell. In this paper, we present a new approach based on a Generative Adversarial Network (GAN) and a Support Vector Machine (SVM) to identify in-silico candidates for reference genes. The proposed method has two main steps. First, the GAN is used to augment the small number of reference genes found in the public RNA-seq dataset of Escherichia coli. Second, a one-class SVM based on novelty detection is evaluated using real reference genes together with the synthetic ones generated by the GAN in the first step. The results show that augmenting the dataset with the proposed GAN architecture improves the classifier score by 19%, giving the proposed method a recall of 85% on the test data. The main contribution of the proposed methodology is to reduce the number of candidate reference genes to be tested in the laboratory by up to 80%.
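The two-step idea described in the abstract (augment a small set of known reference genes, then fit a one-class SVM for novelty detection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the expression matrices are randomly generated toy data, `augment_placeholder` is a hypothetical stand-in that jitters real profiles and is not a GAN, and the kernel, `gamma`, and `nu` settings are arbitrary choices rather than values from the paper.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical toy data: rows are genes, columns are expression values
# across 6 RNA-seq samples. Real inputs would come from the E. coli dataset.
n_conditions = 6
known_rg = rng.normal(loc=8.0, scale=0.3, size=(20, n_conditions))      # stable genes
other_genes = rng.normal(loc=8.0, scale=2.5, size=(200, n_conditions))  # unlabeled genes


def augment_placeholder(real, n_new, rng):
    """Stand-in for the GAN generator (assumption, NOT a GAN).

    Resamples real profiles and adds small Gaussian noise, only to mark
    where synthetic reference-gene profiles would enter the pipeline.
    """
    idx = rng.integers(0, len(real), size=n_new)
    return real[idx] + rng.normal(scale=0.1, size=(n_new, real.shape[1]))


# Step 1: hold out some real reference genes, augment the rest.
train_rg, test_rg = known_rg[:14], known_rg[14:]
synthetic_rg = augment_placeholder(train_rg, 100, rng)
train = np.vstack([train_rg, synthetic_rg])

# Step 2: fit a one-class SVM on reference-gene profiles only.
scaler = StandardScaler().fit(train)
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
model.fit(scaler.transform(train))

# predict() returns +1 for inliers (reference-like) and -1 for outliers.
test_pred = model.predict(scaler.transform(test_rg))
recall = float(np.mean(test_pred == 1))  # fraction of held-out RGs recovered

# Genes flagged as inliers form the shortlist for laboratory validation.
candidate_mask = model.predict(scaler.transform(other_genes)) == 1
reduction = 1.0 - float(candidate_mask.mean())

print(f"recall on held-out reference genes: {recall:.2f}")
print(f"fraction of genes excluded from lab testing: {reduction:.2f}")
```

Recall is the natural metric here because only positive examples (known reference genes) are labeled. The share of unlabeled genes rejected by the model corresponds to the reduction in laboratory candidates that the abstract reports.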
Acknowledgments
This study was financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES), under the Program PROCAD-AMAZÔNIA, process no. 88881.357580/2019-01.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Rueda, E.J., Ramos, R., Franco, E.F., Belo, O., Morais, J. (2020). One-Class SVM to Identify Candidates to Reference Genes Based on the Augment of RNA-seq Data with Generative Adversarial Networks. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2020. ICCSA 2020. Lecture Notes in Computer Science, vol 12249. Springer, Cham. https://doi.org/10.1007/978-3-030-58799-4_51
DOI: https://doi.org/10.1007/978-3-030-58799-4_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58798-7
Online ISBN: 978-3-030-58799-4
eBook Packages: Computer Science, Computer Science (R0)