Automatic speech recognition using interlaced derivative pattern for cloud based healthcare system

Abstract

Cloud computing brings several advantages, such as flexibility, scalability, and ubiquity, in terms of data acquisition, data storage, and data transmission. These capabilities can greatly benefit remote healthcare, among other applications. This paper proposes a cloud-based framework for speech-enabled healthcare. In the proposed framework, a patient, or any person seeking medical assistance, can submit a request by speech commands. The commands are managed and processed in the cloud server, and any doctor with proper authentication can receive the request. By analyzing the request, the doctor can assist the patient. This paper also proposes a new feature extraction technique, the interlaced derivative pattern (IDP), for the automatic speech recognition (ASR) system deployed in the cloud server. IDP exploits relative Mel-filter bank coefficients along different neighborhood directions of the speech signal. Experimental results show that the proposed IDP-based ASR system performs reasonably well even when the speech is transmitted via smartphones.
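
As a rough illustration of the feature extraction idea, the sketch below computes an IDP-style descriptor over a log Mel-filter bank matrix: first-order derivatives along four neighborhood directions are binarized against the corresponding derivative at an interlaced neighbor and pooled into a histogram. The paper's exact IDP formulation is not given in this abstract, so the direction set, the sign-flip coding, and the 16-bin pooling below are assumptions for illustration only.

import numpy as np

def idp_features(logmel, radius=1):
    # Illustrative IDP-style descriptor over a log Mel-filter bank matrix of
    # shape (frames, mel_bands). For each interior coefficient, a first-order
    # derivative is taken along four neighborhood directions (time, diagonal,
    # band, anti-diagonal); each direction contributes one bit recording
    # whether the derivative sign flips at the interlaced neighbor in that
    # direction. The 4-bit codes are pooled into a normalized 16-bin histogram.
    T, M = logmel.shape
    r = radius
    dirs = [(r, 0), (r, r), (0, r), (-r, r)]  # (delta_time, delta_band)
    center = logmel[r:T - r, r:M - r]
    codes = np.zeros(center.shape, dtype=np.int32)
    for bit, (dt, dm) in enumerate(dirs):
        forward = logmel[r + dt:T - r + dt, r + dm:M - r + dm]    # I(t+dt, m+dm)
        backward = logmel[r - dt:T - r - dt, r - dm:M - r - dm]   # I(t-dt, m-dm)
        d_center = center - forward        # derivative at the center point
        d_neighbor = backward - center     # derivative at the interlaced neighbor
        codes |= ((d_center * d_neighbor) < 0).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=16).astype(float)
    return hist / max(hist.sum(), 1.0)

# Stand-in log-Mel matrix (200 frames x 26 Mel bands); a real front end would
# obtain it from framing, FFT, and a Mel filter bank applied to the speech.
print(idp_features(np.random.randn(200, 26)))

In an ASR pipeline, such histograms (or per-frame variants) could complement conventional features before acoustic modeling; the utterance-level pooling here is purely a placeholder, not the paper's reported configuration.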

Acknowledgments

This work was supported by the Research Center, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia under the Project RC121230.

Author information

Corresponding author

Correspondence to Ghulam Muhammad.

About this article

Cite this article

Muhammad, G. Automatic speech recognition using interlaced derivative pattern for cloud based healthcare system. Cluster Comput 18, 795–802 (2015). https://doi.org/10.1007/s10586-015-0439-7
