Reference standard for the evaluation of automatic segmentation algorithms: Quantification of inter observer variability of manual delineation of prostate contour on MRI - 31/01/24

DOI: 10.1016/j.diii.2023.08.001

Sébastien Molière ^a,^b,^c,^1,^⁎, Dimitri Hamzaoui ^d,^1, Benjamin Granger ^e, Sarah Montagne ^f,^g,^h, Alexandre Allera ^g, Malek Ezziane ^g, Anna Luzurier ^g, Raphaelle Quint ^g, Mehdi Kalai ^g, Nicholas Ayache ^a, Hervé Delingette ^a, Raphaële Renard-Penna ^f,^g,^h
^a Department of Radiology, Hôpitaux Universitaires de Strasbourg, Hôpital de Hautepierre, 67200, Strasbourg, France
^b Breast and Thyroid Imaging Unit, Institut de Cancérologie Strasbourg Europe, 67200, Strasbourg, France
^c IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400, Illkirch, France
^d Inria, Epione Team, Sophia Antipolis, Université Côte d'Azur, 06902, Nice, France
^e Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, IPLESP, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, 75013, Paris, France
^f Department of Radiology, Hôpital Tenon, Assistance Publique-Hôpitaux de Paris, 75020, Paris, France
^g Department of Radiology, Hôpital Pitié-Salpêtrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France
^h GRC N° 5, Oncotype-Uro, Sorbonne Université, 75020, Paris, France

^⁎ Corresponding author.
^1 These authors contributed equally to this work.

Highlights

•	The number of readers affects the consistency and conformity of prostate segmentation on MRI.
•	Inter-reader consistency shows a tipping point at three readers, and the consensus segmentation volume shows a tipping point at the same number as readers are added.
•	Prostate segmentations exhibit maximum conformity to a reference with three readers.
•	Three readers may be an optimal number of raters to consider for references for artificial intelligence applications for prostate segmentation.

The full text of this article is available as a PDF.

Abstract

Purpose

The purpose of this study was to investigate the relationship between the number of readers and inter-reader variability in manual prostate contour segmentation on magnetic resonance imaging (MRI) examinations, and to determine the optimal number of readers required to establish a reliable reference standard.

Materials and methods

Seven radiologists with varying levels of experience independently performed manual segmentation of the prostate contour (whole-gland [WG] and transition zone [TZ]) on 40 prostate MRI examinations obtained in 40 patients. Inter-reader variability in prostate contour delineations was estimated using standard metrics (Dice similarity coefficient [DSC], Hausdorff distance and volume-based metrics). The impact of the number of readers (from two to seven) on segmentation variability was assessed using pairwise metrics (consistency) and metrics computed with respect to a reference segmentation (conformity), the reference being obtained with either majority voting or the simultaneous truth and performance level estimation (STAPLE) algorithm.
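
The two families of metrics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy one-dimensional masks, the strict-majority tie rule, and the choice to score each reader against a consensus built from all readers (including that reader) are assumptions made for illustration. STAPLE is not reproduced here; only majority voting is shown.

```python
from itertools import combinations

import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks are treated as identical
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def majority_vote(masks):
    """Consensus mask: a voxel is kept if more than half of the readers marked it.

    Assumption: ties (possible with an even number of readers) are excluded.
    """
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])
    return stack.sum(axis=0) * 2 > stack.shape[0]


def consistency(masks):
    """Mean DSC over all pairs of readers (pairwise metric)."""
    return float(np.mean([dice(a, b) for a, b in combinations(masks, 2)]))


def conformity(masks):
    """Mean DSC of each reader against the majority-vote reference."""
    reference = majority_vote(masks)
    return float(np.mean([dice(m, reference) for m in masks]))


# Toy 1-D "masks" from three hypothetical readers (not study data).
readers = [
    np.array([1, 1, 0, 0], dtype=bool),
    np.array([1, 0, 0, 0], dtype=bool),
    np.array([1, 1, 1, 0], dtype=bool),
]
reference = majority_vote(readers)
print("consensus:", reference.astype(int))
print("consistency:", round(consistency(readers), 3))
print("conformity:", round(conformity(readers), 3))
```

The same functions work unchanged on 3D arrays, since every operation is element-wise or a global sum.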

Results

The average segmentation DSC for two readers in pairwise comparison was 0.919 for WG and 0.876 for TZ. Variability decreased as the number of readers increased: the interquartile ranges of the DSC were 0.076 (WG) / 0.021 (TZ) for configurations with two readers, 0.005 (WG) / 0.012 (TZ) for configurations with three readers, and 0.002 (WG) / 0.0037 (TZ) for configurations with six readers. The interquartile range decreased slightly faster between two and three readers than between three and six readers. When using consensus methods, variability often reached its minimum with three readers: with STAPLE, DSC was 0.96 (range: 0.945–0.971) for WG and 0.94 (range: 0.912–0.957) for TZ, and the interquartile range was minimal for configurations with three readers.
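
One way to read "interquartile range for configurations with k readers" is sketched below in Python. The abstract does not spell out how configurations were built, so this sketch assumes that a configuration is any subset of k readers among the seven, that each configuration is scored by its mean pairwise DSC, and that the interquartile range is then taken across configurations. The function name `iqr_by_reader_count` and the synthetic DSC matrix are hypothetical, and the numbers it produces are unrelated to the study's results.

```python
from itertools import combinations

import numpy as np


def iqr_by_reader_count(pairwise_dsc):
    """IQR of the mean pairwise DSC across all reader subsets of each size.

    pairwise_dsc: symmetric (n_readers x n_readers) matrix of DSC values.
    Returns a dict mapping subset size k to the IQR over all k-reader subsets.
    """
    n = pairwise_dsc.shape[0]
    result = {}
    for k in range(2, n + 1):
        scores = []
        for subset in combinations(range(n), k):
            # Mean DSC over every pair of readers inside this configuration.
            pair_scores = [pairwise_dsc[i, j] for i, j in combinations(subset, 2)]
            scores.append(np.mean(pair_scores))
        q1, q3 = np.percentile(scores, [25, 75])
        result[k] = float(q3 - q1)
    return result


# Synthetic pairwise DSC matrix for seven hypothetical readers (not study data).
rng = np.random.default_rng(0)
n_readers = 7
upper = np.triu(rng.uniform(0.85, 0.95, size=(n_readers, n_readers)), k=1)
pairwise_dsc = upper + upper.T
np.fill_diagonal(pairwise_dsc, 1.0)

iqr_per_k = iqr_by_reader_count(pairwise_dsc)
for k, iqr in iqr_per_k.items():
    print(f"{k} readers: IQR of mean pairwise DSC = {iqr:.4f}")
```

With seven readers there is a single seven-reader configuration, so its interquartile range is zero by construction; this may be why the abstract reports values only up to six readers.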

Conclusion

The number of readers affects inter-reader variability, in terms of both inter-reader consistency and conformity to a reference. Depending on the metric, variability is either minimal with three readers or shows a tipping point at three readers, for both pairwise metrics and metrics computed with respect to a reference. Accordingly, three readers may represent an optimal number to determine references for artificial intelligence applications.

Keywords: Artificial intelligence, Inter-reader variability, Magnetic resonance imaging, Prostate, Segmentation

Abbreviations: 3D, AI, ASSD, DSC, HD, HD95, IQR, MRI, PCa, PSA, STAPLE, TZ, WG

Outline

Variability assessment and statistical analyses

Results

Impact of the number of readers on overall segmentation variability

Evolution of segmentation volumes according to the number of readers

Discussion

Author contributions

Informed consent

CRediT authorship contribution statement

Vol 105 - N° 2

P. 65-73 - February 2024
