Using Large Language Models to Generate Educational Materials on Childhood Glaucoma - 21/08/24

DOI: 10.1016/j.ajo.2024.04.004

Qais Dihan ¹,², Muhammad Z. Chauhan ², Taher K. Eleiwa ³, Amr K. Hassan ⁴, Ahmed B. Sallam ²,⁵, Albert S. Khouri ⁶, Ta C. Chang ⁷, Abdelrahman M. Elhusseiny ²,⁸,⁎
¹ Chicago Medical School (Q.D.), Rosalind Franklin University of Medicine and Science, North Chicago, Illinois, USA
² Department of Ophthalmology (Q.D., M.Z.C., A.B.S., A.M.E.), Harvey and Bernice Jones Eye Institute, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
³ Department of Ophthalmology (T.K.E.), Benha Faculty of Medicine, Benha University, Benha, Egypt
⁴ Department of Ophthalmology (A.K.H.), Faculty of Medicine, South Valley University, Qena, Egypt
⁵ Department of Ophthalmology (A.B.S.), Faculty of Medicine, Ain Shams University, Cairo, Egypt
⁶ Institute of Ophthalmology & Visual Science (A.S.K.), Rutgers New Jersey Medical School, Newark, New Jersey, USA
⁷ Department of Ophthalmology (T.C.C.), Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, Florida, USA
⁸ Department of Ophthalmology (A.M.E.), Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA

⁎ Inquiries to Abdelrahman M. Elhusseiny, Harvey and Bernice Jones Eye Institute, The University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA.

Abstract

PURPOSE

To evaluate the quality, readability, and accuracy of large language model (LLM)–generated patient education materials (PEMs) on childhood glaucoma, and the ability of LLMs to improve the readability of existing online information.

DESIGN

Cross-sectional comparative study.

METHODS

We evaluated the responses of ChatGPT-3.5, ChatGPT-4, and Bard to 3 separate prompts requesting that they write PEMs on “childhood glaucoma.” Prompt A required that PEMs be “easily understandable by the average American.” Prompt B required that PEMs be written “at a 6th-grade level using Simple Measure of Gobbledygook (SMOG) readability formula.” We then compared the responses on quality (DISCERN questionnaire, Patient Education Materials Assessment Tool [PEMAT]), readability (SMOG, Flesch–Kincaid Grade Level [FKGL]), and accuracy (Likert Misinformation scale). To assess whether LLMs could improve the readability of existing online information, Prompt C requested that each LLM rewrite 20 resources from a Google search for the keyword “childhood glaucoma” to the American Medical Association–recommended “6th-grade level.” Rewrites were compared on key metrics such as readability, number of complex words (≥3 syllables), and sentence count.
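As an illustration of the two readability metrics named above, the following is a minimal Python sketch of the standard published SMOG and FKGL formulas. It is not the tool the authors used, which the abstract does not specify. The helper names (`count_syllables`, `split_sentences`, `split_words`) are hypothetical, and the vowel-group syllable counter is a rough heuristic assumed here for brevity; dedicated readability software typically relies on more careful syllable counting, so scores may differ somewhat.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (heuristic, an assumption here)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a likely silent trailing "e" (e.g. "pressure"), but not "-le"/"-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def split_sentences(text: str) -> list[str]:
    """Split on runs of terminal punctuation; drop empty fragments."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text: str) -> list[str]:
    """Extract alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[A-Za-z']+", text)


def complex_word_count(text: str) -> int:
    """Count words with 3 or more syllables (the 'complex words' metric)."""
    return sum(1 for w in split_words(text) if count_syllables(w) >= 3)


def smog_grade(text: str) -> float:
    """SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("text contains no sentences")
    polysyllables = complex_word_count(text)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


def fkgl(text: str) -> float:
    """FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences = split_sentences(text)
    words = split_words(text)
    if not sentences or not words:
        raise ValueError("text contains no sentences or words")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    sample = "Glaucoma can raise the pressure in a child's eye. See an eye doctor soon."
    print(f"SMOG: {smog_grade(sample):.1f}  FKGL: {fkgl(sample):.1f}")
    print(f"Complex words: {complex_word_count(sample)}  "
          f"Sentences: {len(split_sentences(sample))}")
```

Both formulas reward shorter sentences and fewer polysyllabic words, which is why complex-word count and sentence count are natural companion metrics when comparing rewrites. SMOG was originally specified for samples of 30 sentences, so scores on very short texts should be read with caution.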

RESULTS

All 3 LLMs generated PEMs that were of high quality, understandability, and accuracy (DISCERN ≥4, ≥70% PEMAT understandability, Misinformation score = 1). Prompt B responses were more readable than Prompt A responses for all 3 LLMs (P ≤ .001). ChatGPT-4 generated the most readable PEMs compared to ChatGPT-3.5 and Bard (P ≤ .001). Although Prompt C responses showed a consistent reduction in mean SMOG and FKGL scores, only ChatGPT-4 achieved the specified 6th-grade reading level (4.8 ± 0.8 and 3.7 ± 1.9, respectively).

CONCLUSIONS

LLMs can serve as strong supplemental tools in generating high-quality, accurate, and novel PEMs, and improving the readability of existing PEMs on childhood glaucoma.


Outline

METHODS

ASSESSING THE READABILITY OF A TEXT

CHOOSING LARGE LANGUAGE MODELS FOR COMPARISON

CREATING PATIENT EDUCATION HANDOUTS WITH ARTIFICIAL INTELLIGENCE

SOURCE SELECTION OF EXISTING PATIENT-TARGETED EDUCATIONAL MATERIALS

USING LARGE LANGUAGE MODELS TO TRANSFORM EXISTING HEALTH EDUCATION MATERIALS

ASSESSING THE QUALITY, UNDERSTANDABILITY, ACTIONABILITY, AND ACCURACY OF LARGE LANGUAGE MODEL–GENERATED PATIENT EDUCATION HANDOUTS

STATISTICAL ANALYSIS AND TOTAL HANDOUTS GENERATED

RESULTS

DISCUSSION

CRediT authorship contribution statement

Supplemental Material available at AJO.com.

Vol 265

P. 28-38 - September 2024
