Creating a standardized tool for the evaluation and comparison of artificial intelligence–based computer-aided detection programs in colonoscopy: a modified Delphi approach - 14/06/25
, Yuichi Mori, MD, PhD 2, 3, 4, Masashi Misawa, MD, PhD 2, James E. East, MD 5, 6, Cesare Hassan, MD, PhD 7, 8, Alessandro Repici, MD 7, 8, Michael F. Byrne, MD 9, 10, Daniel von Renteln, MD 11, David G. Hewett, MBBS, PhD, MSc 12, Pu Wang, MD 13, Yutaka Saito, MD, PhD 14, Carolina Ogawa Matsubayashi, MD 15, 16, Omer F. Ahmad, MBBS 17, Prateek Sharma, MBBS 18, Seth A. Gross, MD 19, Neil Sengupta, MD 20, Nabil Mansour, MD 21, Andrea Cherubini, PhD 22, Nhan Ngo Dinh 22, Xiao Xiao, PhD 23, Peter Mountney, PhD 24, 25, Juana González-Bueno Puyal, PhD 24, 25, Greg Little, MBA 25, Shawn LaRocco, MBA 25, Sailesh Conjeti, PhD 25, Hannes Seibt, MS 26, Dror Zur, PhD 27, Hitoshi Shimada, BEE 28, Tyler M. Berzin, MD ∗, 29, Jeremy R. Glissen Brown, MD, MSc ∗, 30

Abstract

Background and Aims
Multiple computer-aided detection (CADe) software programs have now achieved regulatory approval in the United States, Europe, and Asia and are being used in routine clinical practice to support colorectal cancer screening. There is uncertainty about how different CADe algorithms perform, and no objective methodology exists for comparing them. We aimed to identify priority scoring metrics for CADe evaluation and comparison.
Methods
A modified Delphi approach was used. Twenty-five global leaders in CADe in colonoscopy, including endoscopists, researchers, and industry representatives, participated in an online survey over the course of 8 months. Participants generated 121 scoring criteria, 54 of which were deemed within the study scope and distributed for review and asynchronous e-mail–based open comment. Participants then scored criteria in order of priority on a 5-point Likert scale during ranking round 1. The top 11 highest-priority criteria were redistributed, with another opportunity for open comment, followed by a final round of priority scoring to identify the final 6 criteria.
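The ranking mechanics described above can be sketched in a few lines. This is an illustrative toy only: the criterion names and ratings below are hypothetical, and the abstract does not describe the actual survey tooling.

```python
# Hypothetical sketch of the Delphi ranking step: each participant rates
# each criterion on a 5-point Likert scale (1 = lowest priority,
# 5 = highest), and criteria are ranked by mean priority score.

def mean_priority(ratings):
    """Average of 5-point Likert ratings for one criterion."""
    return sum(ratings) / len(ratings)

# Illustrative ratings, not the study's data.
survey = {
    "sensitivity": [5, 4, 4, 5, 3],
    "latency": [4, 3, 4, 4, 3],
}

# Criteria ordered from highest to lowest mean priority score.
ranked = sorted(survey, key=lambda c: mean_priority(survey[c]), reverse=True)
```

In the study itself this ordering was used twice: once over all 54 in-scope criteria, and again over the top 11 to select the final 6.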
Results
Mean priority scores for the 54 criteria ranged from 2.25 to 4.38 after the first ranking round. The top 11 criteria after round 1 of ranking yielded mean priority scores ranging from 3.04 to 4.16. The final 6 highest priority criteria, including a tie for first-place ranking, were (1, tied) sensitivity (average, 4.16) and (1, tied) separate and independent validation of the CADe algorithm (average, 4.16); (3) adenoma detection rate (average, 4.08); (4) false-positive rate (average, 4.00); (5) latency (average, 3.84); and (6) adenoma miss rate (average, 3.68).
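For readers unfamiliar with the top-ranked metrics, the conventional definitions can be expressed as simple ratios. These are standard textbook formulations, not the operational definitions fixed by the consensus process (e.g., per-polyp vs. per-frame counting), which the abstract does not specify.

```python
# Illustrative standard definitions of four of the six consensus metrics.
# Latency (detection delay) and independent validation are procedural
# criteria and are not expressible as simple ratios.

def sensitivity(true_pos, false_neg):
    """Fraction of true polyps that the CADe system detects."""
    return true_pos / (true_pos + false_neg)

def false_positive_rate(false_pos, true_neg):
    """Fraction of non-polyp events that the system flags as polyps."""
    return false_pos / (false_pos + true_neg)

def adenoma_detection_rate(procedures_with_adenoma, total_procedures):
    """ADR: share of colonoscopies in which >= 1 adenoma is detected."""
    return procedures_with_adenoma / total_procedures

def adenoma_miss_rate(missed_adenomas, total_adenomas):
    """AMR: share of adenomas present but not detected."""
    return missed_adenomas / total_adenomas
```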
Conclusions
This is the first reported international consensus statement of priority scoring metrics for CADe in colonoscopy. These scoring criteria should inform CADe software development and refinement. Future research should validate these metrics on a benchmark video dataset to develop a validated scoring instrument.
Abbreviations: ADR, adenoma detection rate; AI, artificial intelligence; AMR, adenoma miss rate; CADe, computer-aided detection; CRC, colorectal cancer; dBox; GTBox; IoU; LIS; NLIS; SSL
DIVERSITY, EQUITY, AND INCLUSION: One or more of the authors of this paper self-identifies as an under-represented gender minority in science. One or more of the authors of this paper self-identifies as an under-represented ethnic minority in science. The author list of this paper includes contributors from the location where the research was conducted who participated in the data collection, design, analysis, and/or interpretation of the work.
Vol 102 - N° 1, P. 109 - July 2025
