JISARA

Journal of Information Systems Applied Research and Analytics

Volume 18

V18 N1 Pages 17-31

Apr 2025


Clinical Text Summarization using NLP Pretrained Language Models: A Case Study of MIMIC-IV-Notes


Oluwatomisin Arokodare
Georgia Southern University
Atlanta, GA USA

Hayden Wimmer
Georgia Southern University
Atlanta, GA USA

Jie Du
Grand Valley State University
Allendale, MI USA

Abstract: As the amount of data available in the health sector continues to grow in the era of information overload, it is more crucial than ever to communicate essential information concisely. The vast amount of textual data in electronic health records can overwhelm healthcare professionals, reducing the time they can dedicate to patient care. A key challenge is creating comprehensive medical history summaries during patient admissions, which must integrate various documents including the history of present illness, discharge condition and medications, and discharge instructions. Addressing this challenge is urgent, as effective summarization of health records can improve patient outcomes, enhance clinical decision-making, and facilitate access to knowledge. This study examines the use of large language models trained to produce concise summaries through machine learning and natural language processing algorithms. These models offer a promising avenue for summarizing patients' primary health concerns from daily progress notes, thereby streamlining information in hospital settings and aiding diagnostic processes. We used pre-trained transformer models, namely BART, T5, and Pegasus, to summarize patient medical histories. We evaluated the performance of these models with the BLEU, ROUGE, and BERTScore metrics on de-identified clinical notes from MIMIC-IV. Our experimental results show that BART and Pegasus performed best among the three large language models. Combining the three models produced the most efficient summaries of clinical notes, in that each generated summary was shorter than the original medical history text for the corresponding case.
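To make the evaluation idea concrete, the sketch below computes a simplified unigram-overlap score in the spirit of ROUGE-1, plus a summary-to-source length ratio. It is an illustration only and not the authors' evaluation code: the function names `rouge1` and `compression_ratio`, the lowercase whitespace tokenization, and the sample sentences are all assumptions, and the sample text is invented rather than taken from MIMIC-IV.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between a reference and a candidate summary.

    Assumption: tokens are lowercase whitespace-separated words (no stemming).
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def compression_ratio(source: str, summary: str) -> float:
    """Word count of the summary divided by word count of the source (below 1 means shorter)."""
    return len(summary.split()) / max(len(source.split()), 1)


if __name__ == "__main__":
    # Invented example text, not real clinical data.
    reference = "the patient has fever"
    candidate = "patient has fever"
    print(rouge1(reference, candidate))
    print(compression_ratio(reference, candidate))
```

A ratio below 1 from `compression_ratio` corresponds to the abstract's criterion that a generated summary be shorter than the original text. Published ROUGE implementations typically add stemming and further variants such as ROUGE-2 and ROUGE-L, so scores from this sketch are not directly comparable to reported results.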



Recommended Citation: Arokodare, O., Wimmer, H., Du, J. (2025). Clinical Text Summarization using NLP Pretrained Language Models: A Case Study of MIMIC-IV-Notes. Journal of Information Systems Applied Research and Analytics, 18(1), pp. 17-31. https://doi.org/10.62273/NAKA3054