
Differential Learning from Sparse and Noisy Labels for Robust Detection of Clinical Landmarks in Echo Cine Series

  • Conference paper
  • First Online:
Simplifying Medical Ultrasound (ASMUS 2022)

Abstract

Quantifying the dimensions of the Left Ventricle (LV) chamber of the heart on echocardiography (echo) cine series is an essential step in assessing LV function. Training deep neural networks to automate such measurements is challenging because the gold-standard clinical labels are noisy due to inherent observer variability. Moreover, labels are available for at most two time instances in the cine series: end-diastole (ED) and end-systole (ES). In this paper, we first present a multi-head U-Net-based model that leverages all available annotations of LV Internal Diameter (LVID), Interventricular Septum (IVS), and Left Ventricular Posterior Wall (LVPW) in a multi-task landmark detection setting. The first head detects the inner landmarks of the LV, i.e., LVID (supervised on ED and ES frames), and the second head detects the outer landmarks of the LV (supervised only on the ED frame). Second, we propose differential learning to further improve the model's performance using an auxiliary task that is semantically related to the LVID measurement but more robust to inherent observer variability in the labels. This is done by comparing the estimated Ejection Fraction (EF) of a pair of input cines based on their learned representations, integrating the multi-head model as a Siamese network coupled with an EF comparator network. We evaluate the proposed model on two independent datasets: (1) a large cart-based dataset of 28,577 echo cines from 23,755 patients, on which we demonstrate state-of-the-art performance compared to prior work; and (2) 51 echo cines from 23 heart-failure patients acquired with a point-of-care ultrasound (POCUS) system, a population conventionally considered clinically more challenging to image. Our approach can be extended to tasks in which the labels are not sufficiently reliable for direct regression, yet their pairwise comparison is feasible as an auxiliary task.
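As a rough illustration of the two components described in the abstract, the sketch below shows (i) a multi-head U-Net backbone with separate heatmap heads for the inner (LVID) and outer (IVS/LVPW) landmarks, and (ii) a Siamese wrapper whose auxiliary head compares the EF of two inputs. This is a minimal sketch, not the authors' implementation: it operates on single frames rather than full cine series, and the layer sizes, function names (multi_head_unet, ef_comparator), and comparator design are illustrative assumptions. TensorFlow/Keras is used here only as a common choice; the paper's actual framework and architecture details are not reproduced.

```python
# Minimal sketch (not the authors' code) of the two ideas in the abstract:
# (1) a multi-head U-Net with separate heatmap heads for inner (LVID) and
#     outer (IVS/LVPW) landmarks, and (2) a Siamese wrapper whose auxiliary
#     head compares the Ejection Fraction (EF) of two inputs.
# Layer sizes, head shapes, and the comparator design are assumptions.
from tensorflow.keras import layers, Model


def conv_block(x, filters):
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)


def multi_head_unet(input_shape=(256, 256, 1)):
    inp = layers.Input(shape=input_shape)
    # Encoder
    e1 = conv_block(inp, 32)
    e2 = conv_block(layers.MaxPooling2D()(e1), 64)
    b = conv_block(layers.MaxPooling2D()(e2), 128)  # bottleneck features
    # Decoder with skip connections
    d2 = conv_block(layers.Concatenate()([layers.UpSampling2D()(b), e2]), 64)
    d1 = conv_block(layers.Concatenate()([layers.UpSampling2D()(d2), e1]), 32)
    # Head 1: heatmaps for the two inner (LVID) landmarks, supervised on ED and ES frames
    inner = layers.Conv2D(2, 1, activation="sigmoid", name="inner_landmarks")(d1)
    # Head 2: heatmaps for the two outer (IVS/LVPW) landmarks, supervised on the ED frame only
    outer = layers.Conv2D(2, 1, activation="sigmoid", name="outer_landmarks")(d1)
    return Model(inp, [inner, outer, b], name="landmark_unet")


def ef_comparator(backbone):
    # Siamese wrapper: both inputs share the same backbone; a small MLP head
    # predicts which of the pair has the higher EF, a pairwise target that is
    # more tolerant of label noise than regressing EF directly.
    cine_a = layers.Input(shape=backbone.input_shape[1:])
    cine_b = layers.Input(shape=backbone.input_shape[1:])
    feat_a = layers.GlobalAveragePooling2D()(backbone(cine_a)[2])
    feat_b = layers.GlobalAveragePooling2D()(backbone(cine_b)[2])
    diff = layers.Subtract()([feat_a, feat_b])
    hidden = layers.Dense(64, activation="relu")(diff)
    prob = layers.Dense(1, activation="sigmoid", name="ef_a_greater_than_b")(hidden)
    return Model([cine_a, cine_b], prob, name="ef_comparator")
```

A training step under this setup would apply a heatmap regression loss on the landmark heads, masked to the frames that actually carry labels, and a binary cross-entropy loss on the EF comparison head.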

P. Abolmaesumi and T. Tsang—Joint senior authors.

M. Mahdavi, H. Vaseli and C. Luong—Joint first authors.



Author information


Corresponding author

Correspondence to Mobina Mahdavi.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Mahdavi, M. et al. (2022). Differential Learning from Sparse and Noisy Labels for Robust Detection of Clinical Landmarks in Echo Cine Series. In: Aylward, S., Noble, J.A., Hu, Y., Lee, SL., Baum, Z., Min, Z. (eds) Simplifying Medical Ultrasound. ASMUS 2022. Lecture Notes in Computer Science, vol 13565. Springer, Cham. https://doi.org/10.1007/978-3-031-16902-1_5


  • DOI: https://doi.org/10.1007/978-3-031-16902-1_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16901-4

  • Online ISBN: 978-3-031-16902-1

  • eBook Packages: Computer Science; Computer Science (R0)
