Comparative Analysis of Large Language Models in the Interpretation of Gynecologic Pathology Reports

Authors

  • Aslı Karakaşlı, Erol Olçok Training and Research Hospital, Department of Obstetrics and Gynecology, Çorum, Türkiye

DOI:

https://doi.org/10.65495/eurjimr.2026.14

Keywords:

Large Language Models, Patient Education, AI Empathy, Gynecologic Pathology

Abstract

This letter evaluates the performance of three large language models in explaining gynecologic pathology reports to patients. Using synthetic cases ranging from benign to malignant diagnoses, the author compares readability, emotional tone, and medical jargon density, highlighting clinically relevant differences in patient-centered communication styles.

Published

05.01.2026

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. All content in this article is original, created by the author. No third-party material (figures, tables, or text excerpts) requiring permission was utilized in this manuscript.

Section

Letter to the Editor

How to Cite

Karakaşlı A. Comparative Analysis of Large Language Models in the Interpretation of Gynecologic Pathology Reports. Eur J Innov Med Res. 2026;1(1):31-33. doi:10.65495/eurjimr.2026.14