HealAI: A Healthcare LLM for Effective Medical Documentation

Authors:

Sagar Goyal,

Eti Rastogi,

Sree Prasanna Rajagopal,

Jeff Ward

WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining

Pages 1167 - 1168

https://doi.org/10.1145/3616855.3635739

Published: 04 March 2024

Abstract

Since the advent of LLMs like GPT-4, many industries have been trying to harness their power. Healthcare is a particularly challenging domain for this because of its high accuracy requirements. Prompt engineering is a common technique for designing instructions that guide model responses; however, generic models may not have been trained to execute these specific tasks accurately. We present our journey of developing a cost-effective medical LLM that surpasses GPT-4 in medical note-writing tasks. We touch upon our trials with medical prompt engineering, GPT-4's limitations, and training an optimized LLM for specific medical tasks. We showcase multiple comparisons of model sizes, training data, and pipeline designs that enabled us to outperform GPT-4 with smaller models while maintaining precision, reducing biases, preventing hallucinations, and enhancing note-writing style.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published In

WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining

March 2024

1246 pages

ISBN:9798400703713

DOI:10.1145/3616855

General Chairs: Luz Angélica Caudillo Mata (MDA Geointelligence), Silvio Lattanzi (Google Research), Andrés Muñoz Medina (Google Research)

Program Chairs: Leman Akoglu (CMU), Aristides Gionis (KTH), Sergei Vassilvitskii (Google Research)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Abstract

Conference

WSDM '24

WSDM '24: The 17th ACM International Conference on Web Search and Data Mining

March 4 - 8, 2024

Merida, Mexico

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%
