Abstract
The classification of the hundreds of known papillomaviruses (PVs) remains a major issue in virology, disease diagnosis, and therapy. Since 2003, PVs have been classified into three levels of hierarchical clusters according to their similarity and their position in the phylogenetic tree, using the DNA sequence of the L1 gene. As the number of sequenced genomes grows, the boundaries of the clusters at the different levels may overlap and the topology of the associated tree may change, preventing a unique and coherent classification. Here, we studied the classification of 560 currently available human PVs (HPVs) with respect to the criteria established 10 years ago as well as novel ones. The results show that the current taxonomic identification meets the monophyletic criteria for the L1 gene, but violates the established sequence similarity boundaries used to classify PVs. Finally, we argue that replacing L1 gene similarity with whole-genome similarity would reduce the overlap between the different clusters and provide a better classification.
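To make the notion of overlapping cluster boundaries concrete, the following is a minimal sketch and not the authors' method. It assumes the sequences are already aligned to equal length, uses plain pairwise identity as the similarity measure, and reports an overlap when some pair from different clusters is at least as similar as the least similar pair within a cluster. The function names, the toy sequences, and the cluster labels are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def similarity_ranges(seqs: dict, labels: dict):
    """Split all pairwise identities into within-cluster and between-cluster lists."""
    within, between = [], []
    for i, j in combinations(sorted(seqs), 2):
        s = identity(seqs[i], seqs[j])
        if labels[i] == labels[j]:
            within.append(s)
        else:
            between.append(s)
    return within, between


def clusters_overlap(seqs: dict, labels: dict) -> bool:
    """True if some between-cluster pair is at least as similar as
    the least similar within-cluster pair."""
    within, between = similarity_ranges(seqs, labels)
    if not within or not between:
        return False
    return max(between) >= min(within)


# Hypothetical toy data: four made-up aligned fragments in two clusters.
toy_seqs = {
    "A1": "ACGTACGTAC",
    "A2": "ACGTACGTAT",
    "B1": "TTGTACCTAC",
    "B2": "TTGTACCTAT",
}
toy_labels = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

if __name__ == "__main__":
    print(clusters_overlap(toy_seqs, toy_labels))
```

The same check could be run once with L1 sequences and once with whole-genome alignments to compare how often the boundaries overlap under each similarity measure, which is the comparison the abstract argues for.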
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Daigle, B., Makarenkov, V., Diallo, A.B. (2015). Effect of Hundreds Sequenced Genomes on the Classification of Human Papillomaviruses. In: Lausen, B., Krolak-Schwerdt, S., Böhmer, M. (eds) Data Science, Learning by Latent Structures, and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44983-7_27
DOI: https://doi.org/10.1007/978-3-662-44983-7_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44982-0
Online ISBN: 978-3-662-44983-7
eBook Packages: Mathematics and Statistics (R0)