A benchmark of text embedding models for semantic harmonization of Alzheimer's disease cohorts - 01/12/25

DOI: 10.1016/j.tjpad.2025.100420

Tim Adams ^a,^1, Yasamin Salimi ^a,^1, Mehmet Can Ay ^a, Diego Valderrama ^a, Marc Jacobs ^a, Holger Fröhlich ^a,^b,^c,^⁎

^1 Equal contribution.

^a Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, 53757, Germany

^b Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany

^c Institute for Digital Medicine, University Hospital Bonn, Bonn, Germany

^⁎ Corresponding author at: Schloß Birlinghoven, Sankt Augustin, 53757, Germany.

Abstract

Background

Harmonizing diverse healthcare datasets is challenging because naming conventions are inconsistent across sources. Manual harmonization is time- and resource-intensive, which limits scalability for multi-cohort Alzheimer's disease research. Large language models, and text-embedding models in particular, offer a promising solution, but their rapid development calls for continuous, domain-specific benchmarking, especially since established general-purpose benchmarks lack clinical data harmonization use cases.

Objectives

To evaluate how different text-embedding models perform for the harmonization of clinical variables.

Design and setting

We created a novel benchmark to assess how well embeddings from different language models can be used to harmonize cohort study metadata against an in-house Common Data Model that includes cohort-to-cohort mappings for a wide range of Alzheimer’s disease cohorts. We evaluated five state-of-the-art text-embedding models on seven datasets in the context of Alzheimer’s disease.
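The matching task described above can be illustrated with a minimal sketch. It is not the ADHTEB implementation: the `toy_embed` function is a bag-of-words stand-in for a real text-embedding model, and the variable names, descriptions, and gold mappings are hypothetical examples invented for illustration. The sketch ranks Common Data Model entries by cosine similarity to each cohort variable description and reports top-k accuracy against the known mappings.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_targets(description: str, targets: dict[str, str]) -> list[str]:
    """Return target variable names ordered from most to least similar."""
    query = toy_embed(description)
    scores = {name: cosine(query, toy_embed(desc)) for name, desc in targets.items()}
    return sorted(scores, key=scores.get, reverse=True)


def top_k_accuracy(cohort: dict[str, str], gold: dict[str, str],
                   targets: dict[str, str], k: int = 1) -> float:
    """Fraction of cohort variables whose gold target appears in the top k ranks."""
    hits = sum(gold[var] in rank_targets(desc, targets)[:k]
               for var, desc in cohort.items())
    return hits / len(cohort)


# Hypothetical common data model entries (name -> description).
cdm = {
    "age": "age of the participant at baseline visit",
    "mmse": "mini mental state examination total score",
    "hippocampus_volume": "volume of the hippocampus from MRI",
}

# Hypothetical cohort data dictionary and its manually curated mapping.
cohort = {
    "PTAGE": "participant age at baseline",
    "MMSCORE": "total score of the mini mental state examination",
    "HIPPO_VOL": "hippocampus volume measured on MRI",
}
gold = {"PTAGE": "age", "MMSCORE": "mmse", "HIPPO_VOL": "hippocampus_volume"}

if __name__ == "__main__":
    print(top_k_accuracy(cohort, gold, cdm, k=1))
```

Swapping `toy_embed` for a call to an actual embedding model would leave the ranking and scoring logic unchanged, which is what makes this kind of setup usable for comparing models on the same set of mappings.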

Participants

No patient data were utilized for any of the analyses, as the evaluation was based on semantic harmonization of cohort metadata only.

Measurements

The analyses included text descriptions of variables from four modalities: clinical, lifestyle, demographic, and imaging.

Results

Our benchmark favored different models than general-purpose benchmarks do. This suggests that models fine-tuned for generic tasks may not translate well to real-world data harmonization, particularly in Alzheimer’s disease. We propose guidelines for formatting metadata to facilitate manual or model-assisted data harmonization. We introduce an open-source library (ADHTEB) and an interactive leaderboard (adhteb.scai.fraunhofer.de) to aid future model benchmarking.

Conclusions

Our findings highlight the importance of domain-specific benchmarks for clinical data harmonization in the field of Alzheimer’s disease and motivate standards for naming conventions that may support semi-automated mapping applications in the future.

Keywords: Harmonization, Alzheimer’s disease, Text-embeddings, Large language models

Outline

Large language model-based variable embeddings

Results

Discussion

Limitations and future work

Declaration of generative AI and AI-assisted technologies in the writing process

CRediT authorship contribution statement

Vol. 13, No. 1

Article 100420, January 2026
