DOI: 10.1145/3607947.3607955
research-article

Audio Examination – Mental Health Diagnosis in Healthcare through Audio Analytics

Published: 28 September 2023

ABSTRACT

An ever-increasing amount of audio content is being created and shared online. With so much data available, making sense of it all and extracting useful information can be challenging. Audio analytics tools can help organizations make sense of the large volumes of audio data at their disposal and gain valuable insights into customer behavior, market trends, and business operations. This paper therefore presents an Audio Analytics Tool for the healthcare industry to recognize patients’ mental health.

    • Published in

      ACM Other conferences
      IC3-2023: Proceedings of the 2023 Fifteenth International Conference on Contemporary Computing
      August 2023
      783 pages
      ISBN:9798400700224
      DOI:10.1145/3607947

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited