ABSTRACT
The significance of chest X-ray imaging in diagnosing chest diseases is well established in both clinical and research domains. Automating X-ray report generation can address several challenges of manual diagnosis: it speeds up report generation, assists radiologists, and reduces their tedious workload. The key challenge of this automation, however, is to capture abnormal findings accurately while producing a fluent, natural report. In this paper, we introduce EfficienTransNet, an automatic chest X-ray report generation approach based on CNN-Transformers. EfficienTransNet prioritizes clinical accuracy and demonstrates improved text generation metrics. Our model incorporates clinical history or indications to enhance report generation and align with radiologists' workflow, an aspect that recent research has mostly overlooked. On two publicly available X-ray report generation datasets, MIMIC-CXR and IU X-ray, our model yields promising results on natural language evaluation and clinical accuracy metrics. Qualitative results, demonstrated with Grad-CAM, provide disease location information to aid radiologists' understanding. Our proposed model emphasizes radiologists' workflow, enhancing the explainability and transparency of the report generation process and its trustworthiness for radiologists.
Index Terms
- EfficienTransNet: An Automated Chest X-ray Report Generation Paradigm