P54 - Harnessing the versatility of large language models in oncology research: from data extraction to clinical information retrieval - 12/05/25
Abstract |
Background and objective(s) |
Large language models (LLMs) are emerging as powerful tools for healthcare data processing, yet their application in clinical oncology remains underexplored. This study evaluated the versatility of an LLM across different oncology-related tasks to assess its potential for clinical data extraction and structuring, while identifying optimal use cases and improvement strategies. Our objectives were to assess the performance of LLMs in data structuring and retrieval and in identifying patient eligibility. At Centre Antoine Lacassagne (CAL), Nice, we implemented Mixtral; we present four specific use cases and identify potential areas for improvement.
Material and Methods |
All scripts were written in Python and consisted of two steps. The first step selected the eligible documents in which to look for the information. In the second step, those documents were passed to Mixtral 8×7B, with prompts designed for the specific information to be retrieved or task to be performed. Project 1 (P1): text from external breast biopsy reports was extracted from PDF files using optical character recognition (OCR). Multiple prompts and loops were then organized to automatically structure data from the texts, using rules that accepted a result only when Mixtral extracted a value in the expected format (e.g., 0, 1+, 2+, 3+ for HER2 status). Project 2 (P2): patients with pulmonary disease (tumor or metastasis) and radiotherapy reports were selected using ICD-10 codes, and Mixtral was then used to identify those who underwent lung-specific CyberKnife radiotherapy. Project 3 (P3): patients' consultation reports were selected first, and Mixtral was then used to assign ICD-10 codes for metastases. Project 4 (P4): on breast cancer reports already structured by CAL's natural language processing (NLP) method (RUBY) to identify HER2-positive patients, Mixtral was used to capture HER2 statuses that had not been structured. P1, P2 and P3 were compared with manually structured data as the gold standard. P4 was evaluated based on the number of data items retrieved.
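The two-step design and the output rule described for P1 can be illustrated with a minimal Python sketch. This is not the authors' code: the keyword filter, the prompt wording, the retry count, and the names `select_documents`, `extract_her2` and `ask_llm` are all hypothetical. `ask_llm` stands for any function that sends a prompt to the model and returns its text answer; no particular Mixtral client API is assumed.

```python
from typing import Callable, Iterable, List, Optional

# Values accepted for HER2 status, as listed in the abstract.
ALLOWED_HER2 = {"0", "1+", "2+", "3+"}


def select_documents(documents: Iterable[str],
                     keywords: tuple = ("her2",)) -> List[str]:
    """Step 1 (assumed form): keep documents that mention a target keyword."""
    return [doc for doc in documents
            if any(key in doc.lower() for key in keywords)]


def extract_her2(text: str,
                 ask_llm: Callable[[str], str],
                 max_attempts: int = 3) -> Optional[str]:
    """Step 2: prompt the model and accept only an expected value.

    The prompt and the retry loop are illustrative assumptions. Any answer
    outside ALLOWED_HER2 is rejected and the model is asked again; None is
    returned if no attempt yields an accepted value.
    """
    prompt = (
        "Read the biopsy report below and answer with the HER2 score only, "
        "as one of: 0, 1+, 2+, 3+.\n\n" + text
    )
    for _ in range(max_attempts):
        answer = ask_llm(prompt).strip()
        if answer in ALLOWED_HER2:
            return answer
    return None
```

Returning `None` instead of a best guess keeps unvalidated model output out of the structured data, which is the purpose of the acceptance rule described above.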
Results |
The model's effectiveness varied across applications. In P1, structuring of external breast biopsy reports achieved precision scores ranging from 0.60 to 0.80. In P2, precision, recall and F1-score were all 0.77, with Mixtral struggling with abbreviations used for specific lung localizations (e.g., ml for middle lobe). In P3, metastatic disease detection (i.e., Metastasis: Yes or No) showed robust performance with a precision of 0.88, while metastatic site localization reached a precision of 0.83. The ICD-10 codes generated could not be trusted, so the evaluation was performed on the written localization generated by Mixtral. In P4, the RUBY method identified 582 HER2-positive patients, and Mixtral detected 33 additional, previously unstructured data items.
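For reference, the metrics reported above follow the standard definitions and can be computed against a manual gold standard as in the sketch below. It assumes binary labels per document (1 for positive, 0 for negative); the function name and the toy labels are illustrative and are not the study data.

```python
from typing import Sequence, Tuple


def precision_recall_f1(gold: Sequence[int],
                        predicted: Sequence[int]) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy example: one true positive, one false positive, one false negative.
print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Because F1 is the harmonic mean of precision and recall, the three values coincide whenever precision equals recall, which happens when the numbers of false positives and false negatives are equal.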
Conclusion |
This study demonstrates the versatility of LLMs in clinical oncology applications while highlighting the importance of careful use-case selection. The varying precision scores across tasks suggest that LLM effectiveness is context-dependent and also reflects the inherent limitations of Mixtral in highly specific areas. Future performance improvements may be achieved through enhanced prompt engineering and the implementation of retrieval-augmented generation (RAG), text-augmented generation (TAG), and knowledge-augmented generation (KAG) techniques.
Keywords: LLM, artificial intelligence, clinical research, oncology, Mixtral
Vol 73 - N° S2
Article 203085 - May 2025