Abstract
Stroke is a challenging disease to diagnose in an emergency room (ER) setting. While an MRI scan is very useful in detecting ischemic stroke, it is usually not available in the ER because of space constraints and high cost. Clinical tests such as the Cincinnati Pre-hospital Stroke Scale (CPSS) and the Face Arm Speech Test (FAST) are helpful tools used by neurologists, but a neurologist may not be immediately available to conduct them. We emulate CPSS and FAST and propose a novel multimodal deep learning framework for computer-aided assessment of stroke presence, based on facial motion weakness and speech impairment, in patients with suspected stroke who show facial paralysis and speech disorders in an acute setting. Experiments on our video dataset, collected from actual ER patients performing specific speech tests, show that the proposed approach achieves diagnostic performance comparable to that of ER doctors, attaining a 93.12% sensitivity rate while maintaining 79.27% accuracy. Each assessment can be completed in less than four minutes, which demonstrates the high clinical value of the framework. In addition, when deployed on a smartphone, the framework will enable self-assessment by at-risk patients at the time when stroke-like symptoms emerge.
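To make the two reported metrics concrete, the following is a minimal sketch of how sensitivity and accuracy are derived from a binary confusion matrix. The function names and the patient counts are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual stroke cases that are flagged as stroke (recall)."""
    if tp + fn == 0:
        raise ValueError("no positive cases to evaluate")
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all cases, stroke and non-stroke, that are classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no cases to evaluate")
    return (tp + tn) / total


if __name__ == "__main__":
    # Hypothetical counts (assumption, not the paper's data):
    # 50 stroke patients, 50 non-stroke patients.
    tp, fn = 45, 5    # stroke cases flagged / missed
    tn, fp = 30, 20   # non-stroke cases cleared / falsely flagged
    print(f"sensitivity = {sensitivity(tp, fn):.2%}")
    print(f"accuracy    = {accuracy(tp, tn, fp, fn):.2%}")
```

In this illustrative setting, sensitivity is higher than accuracy because most errors are false positives rather than missed strokes, which is the same qualitative pattern as the figures reported above.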
M. Yu, T. Cai, X. Huang, and J.Z. Wang are supported by Penn State University. S.T.C. Wong and K. Wong are supported by the T.T. and W.F. Chao Foundation and the John S. Dunn Research Foundation.
M. Yu and T. Cai made equal contributions.
Notes
- 1. This study received IRB approval: Houston Methodist IRB protocol No. Pro00020577, Penn State IRB site No. SITE00000562.
- 2. Code is available at https://github.com/0CTA0/MICCAI20_MMDL_PUBLIC.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, M. et al. (2020). Toward Rapid Stroke Diagnosis with Multimodal Deep Learning. In: Martel, A.L., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. MICCAI 2020. Lecture Notes in Computer Science(), vol 12263. Springer, Cham. https://doi.org/10.1007/978-3-030-59716-0_59
DOI: https://doi.org/10.1007/978-3-030-59716-0_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59715-3
Online ISBN: 978-3-030-59716-0
eBook Packages: Computer Science, Computer Science (R0)