Abstract
Effective methods for biological sequence analysis are in great demand because improved sequencing techniques generate ever larger amounts of sequence data. In this study, we select statistically significant features (motifs) from the Position Specific Motif Matrices of promoters and reconstruct the matrices using only the chosen motifs. The reconstructed matrices are binarized using triangular fuzzy membership values, and weights are assigned to each binarized matrix to obtain texture features. A histogram of the texture values is plotted for each promoter, and the histogram difference is computed for every pair of promoters. This difference measures the underlying dissimilarity between the promoters being compared, and the values for all promoter pairs are collected in a dissimilarity matrix. In our experiments, the combination of feature reduction and fuzzy binarization appears useful for differentiating promoters.
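The steps from binarization to the dissimilarity matrix can be illustrated with a minimal Python sketch. It is not the implementation used in the paper, and several details are assumptions made only for illustration: the input matrices are taken to be already feature-reduced with values scaled to [0, 1], a cell is set to 1 when its triangular membership reaches a cut of 0.5, the weights are LBP-like powers of two over each 2x2 window, and the histogram difference is the sum of absolute bin differences. All function names and parameters (`a`, `b`, `c`, `cut`, `WEIGHTS`) are hypothetical.

```python
from itertools import combinations


def triangular_membership(x, a, b, c):
    """Triangular fuzzy membership with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_binarize(matrix, a, b, c, cut=0.5):
    """Assumption: a cell becomes 1 when its membership value reaches `cut`."""
    return [[1 if triangular_membership(v, a, b, c) >= cut else 0 for v in row]
            for row in matrix]


# Assumption: power-of-two weights over each 2x2 window, so that every
# window maps to one texture value in the range 0..15.
WEIGHTS = ((1, 2), (4, 8))


def texture_values(binary):
    """Weighted sum of each overlapping 2x2 window of a binary matrix."""
    rows, cols = len(binary), len(binary[0])
    return [
        sum(WEIGHTS[di][dj] * binary[i + di][j + dj]
            for di in range(2) for dj in range(2))
        for i in range(rows - 1)
        for j in range(cols - 1)
    ]


def histogram(values, bins=16):
    """Normalized histogram of texture values (one bin per possible value)."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    total = len(values)
    return [n / total for n in counts] if total else [0.0] * bins


def histogram_difference(h1, h2):
    """Assumption: difference is the sum of absolute bin-wise differences."""
    return sum(abs(p - q) for p, q in zip(h1, h2))


def dissimilarity_matrix(matrices, a, b, c):
    """Symmetric matrix of histogram differences over all promoter pairs."""
    hists = [histogram(texture_values(fuzzy_binarize(m, a, b, c)))
             for m in matrices]
    n = len(hists)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = histogram_difference(hists[i], hists[j])
    return d


# Toy 2x2 matrices standing in for two reconstructed promoter matrices.
promoter_1 = [[0.1, 0.5], [0.3, 0.9]]
promoter_2 = [[0.5, 0.5], [0.5, 0.5]]
result = dissimilarity_matrix([promoter_1, promoter_2], 0.0, 0.5, 1.0)
```

With normalized histograms, this particular difference ranges from 0 for identical texture distributions to 2 for distributions with no bins in common. Other membership cuts, weighting schemes, or histogram distances could be substituted without changing the overall flow.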
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kouser, K., Rangarajan, L. (2017). Feature Reduced Weighted Fuzzy Binarization for Histogram Comparison of Promoter Sequences. In: Santosh, K., Hangarge, M., Bevilacqua, V., Negi, A. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2016. Communications in Computer and Information Science, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-4859-3_16
DOI: https://doi.org/10.1007/978-981-10-4859-3_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-4858-6
Online ISBN: 978-981-10-4859-3
eBook Packages: Computer Science, Computer Science (R0)