Abstract
Hidden Markov models (HMMs) were developed and implemented to discriminate among 2 age classes (chick and adult), 11 call types, and 51 individual birds (speakers), using cross-validation on the recordings in the 3314 database for chick (19–25 days of age) and adult (60 days–7 years of age) vocalizations of Zebra Finches (Taeniopygia guttata). Using both spectral features [Mel-Frequency Cepstral Coefficients (MFCCs)] and temporal features [delta (velocity) and delta-delta (acceleration) coefficients], the HMMs achieved the following accuracies on the three tasks: (1) 96.68% for age recognition; (2) 94.62% (chicks) and 79.30% (adults) for call-type classification; and (3) for speaker identification, from 55.32% (12 speakers, chicks) and 16.78% (33 speakers, adults) up to 100.00% (2 speakers, chicks) and 100.00% (3 speakers, adults). Based on these results, the HMMs could be extended to other animals for automatic recognition, classification, and identification tasks.
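To make the temporal features concrete, the following is a minimal sketch of how delta (velocity) and delta-delta (acceleration) coefficients can be derived from a matrix of static MFCCs. It assumes the regression formula commonly used in speech toolkits, with a window of two frames on each side and edge frames replicated; the abstract does not state the window size or the exact formula used, so the function names `delta` and `add_temporal_features` and the parameter `window` are illustrative assumptions rather than the author's configuration.

```python
import numpy as np


def delta(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Regression-based delta coefficients for a (frames, coeffs) array.

    Assumed formula (not taken from the article):
        d_t = sum_{k=1..window} k * (c_{t+k} - c_{t-k}) / (2 * sum_{k} k^2)
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    num_frames = features.shape[0]
    # Replicate the first and last frames so edge frames have neighbours.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = np.zeros_like(features, dtype=float)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + num_frames]
        behind = padded[window - k : window - k + num_frames]
        out += k * (ahead - behind)
    return out / denom


def add_temporal_features(static: np.ndarray) -> np.ndarray:
    """Stack static, velocity, and acceleration coefficients per frame."""
    velocity = delta(static)
    acceleration = delta(velocity)
    return np.hstack([static, velocity, acceleration])


if __name__ == "__main__":
    # Hypothetical input: 100 frames of 13 static MFCCs.
    rng = np.random.default_rng(0)
    mfcc = rng.normal(size=(100, 13))
    print(add_temporal_features(mfcc).shape)
```

Under these assumptions, each frame's feature vector triples in length (for example, 13 static coefficients become 39), and it is this stacked vector that an HMM would model per frame.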
Data availability
N/A.
Funding
N/A.
Author information
Contributions
The author was the sole contributor to the research work.
Ethics declarations
Conflict of interest
The author declares that he has no competing interests.
Ethical approval
The author maintained the highest level of integrity in the research work.
About this article
Cite this article
Trawicki, M.B. Automatic age recognition, call-type classification, and speaker identification of Zebra Finches (Taeniopygia guttata) using hidden Markov models (HMMs). Int J Speech Technol 26, 641–650 (2023). https://doi.org/10.1007/s10772-023-10041-0
DOI: https://doi.org/10.1007/s10772-023-10041-0