Machine learning in forensic toxicology: Applications, experiences, and future directions - 03/03/25

DOI: 10.1016/j.toxac.2025.01.014
Michael Scholz
Institute of Forensic Medicine, University of Zurich, Zurich, Switzerland

Abstract

The aim is to give a basic overview of the principles of machine learning and its pitfalls, together with successful real-world examples. This should help improve machine learning literacy within the forensic toxicology community.

The demands on a forensic toxicologist are changing rapidly. In the past, it was sufficient to operate a GC-MS or LC-MS device, often with extremely user-unfriendly software, to obtain a result. Then the evaluation of a case could begin. However, as analytical instruments have become faster, more sensitive, more versatile, and more powerful, forensic toxicology has evolved in parallel. This development has been accompanied by a rapid increase in the volume of data. The trend is particularly evident in high-resolution mass spectrometry and non-targeted analysis, in which a large number of substances can be detected in complex biological samples. Forensic toxicologists are no longer interested only in prescription or illegal drugs, but in the totality of all small molecules in the human body (the so-called metabolome). Under certain circumstances, changes in the metabolome can provide clues to drug use, cause of death, and drunk or even drowsy driving. These huge amounts of data can no longer be analyzed manually.

Machine learning (ML), a subfield of artificial intelligence, has proven to be extremely powerful and promising in tackling large, complex, and high-dimensional data sets. ML can make predictions, find patterns, or classify data. The three types of machine learning are supervised, unsupervised, and reinforcement learning. ML has come to prominence over the last decade and comprises many different learning algorithms (e.g. linear regression, logistic regression, decision trees, random forests, support vector machines, naive Bayes, and others). Currently, these algorithms are finding their way into forensic toxicology. However, this transformative technology is not without its challenges. While the underlying principles of ML are easy to understand, there are many pitfalls to avoid in order to ensure that ML actually improves results in forensic toxicology. Many easy-to-make mistakes can cause an ML model to appear to perform well when in reality it does not.
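One way a model can appear to perform well without doing so is when it is scored on the same samples it was trained on. The following is a minimal sketch of that effect, not a method taken from the abstract. It assumes a 1-nearest-neighbour classifier, synthetic Gaussian "measurements", and labels assigned by coin flip, so there is nothing real to learn. The function names `nearest_neighbour_predict` and `accuracy` are hypothetical.

```python
import random


def nearest_neighbour_predict(train_x, train_y, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]


def accuracy(train_x, train_y, eval_x, eval_y):
    """Fraction of evaluation samples whose predicted label is correct."""
    hits = sum(
        nearest_neighbour_predict(train_x, train_y, x) == y
        for x, y in zip(eval_x, eval_y)
    )
    return hits / len(eval_y)


rng = random.Random(0)
n_samples, n_features = 400, 20

# Synthetic data (an assumption for illustration): random features and
# coin-flip labels, so no genuine relationship exists between them.
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [rng.randint(0, 1) for _ in range(n_samples)]

# Hold out half of the samples; the model never sees them during training.
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

train_acc = accuracy(X_train, y_train, X_train, y_train)
test_acc = accuracy(X_train, y_train, X_test, y_test)
print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on held-out data: {test_acc:.2f}")
```

On the training data, each sample is its own nearest neighbour, so the accuracy is perfect by construction. On the held-out data, the accuracy should fall close to chance level, which is the honest estimate for labels that carry no signal.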

The most common pitfalls are inadequate or non-representative training data, poor data quality, overfitting, and underfitting. It is of the utmost importance to split datasets, train algorithms, and validate results correctly. Another problem that severely affects machine learning algorithms is the curse of dimensionality: the efficiency and effectiveness of algorithms deteriorate as the dimensionality of the data increases, because the volume of the feature space grows exponentially with the number of features and the available samples become sparse. Consequently, the skilled forensic toxicologist must employ dimensionality reduction techniques. One option is to select the most relevant features from the original dataset while discarding irrelevant or redundant ones (feature selection). This reduces the dimensionality of the data, simplifying the model and improving its efficiency. Another option is to transform the original high-dimensional data into a lower-dimensional space by creating new features that capture the essential information (feature extraction). It also helps to scale the features to a similar range so that no feature dominates the others, especially in distance-based algorithms. To further ensure robustness in the model training process, missing data should be addressed appropriately through imputation or deletion.
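Several of these steps can be combined in a short preprocessing sketch. It is an illustration under stated assumptions, not the author's workflow. It assumes median imputation, standardization to zero mean and unit variance, and the simplest possible feature selection (dropping columns that are constant in the training set); feature extraction is left out. The data and the names `split`, `fit_preprocessor`, and `transform` are hypothetical. All statistics are learned from the training rows only and then reused on the test rows, so no information from the test set leaks into training.

```python
import random
import statistics


def split(rows, test_fraction, rng):
    """Shuffle the rows and split them into training and test sets."""
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    n_test = int(len(rows) * test_fraction)
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test


def fit_preprocessor(train, min_variance=1e-12):
    """Learn per-column median, mean and standard deviation from training rows.

    Assumes every column has at least one observed (non-None) training value.
    """
    params = []
    for j in range(len(train[0])):
        observed = [r[j] for r in train if r[j] is not None]
        median = statistics.median(observed)
        filled = [median if r[j] is None else r[j] for r in train]
        mean = statistics.fmean(filled)
        std = statistics.pstdev(filled)
        params.append((j, median, mean, std))
    # Feature selection in its simplest form: drop constant columns.
    return [p for p in params if p[3] ** 2 > min_variance]


def transform(rows, params):
    """Impute missing values with the training median, then standardize."""
    return [
        [((median if r[j] is None else r[j]) - mean) / std
         for j, median, mean, std in params]
        for r in rows
    ]


rng = random.Random(42)
rows = []
for _ in range(100):
    peak_area = rng.gauss(1e6, 2e5)   # hypothetical large-scale feature
    ratio = rng.gauss(0.5, 0.1)       # hypothetical small-scale feature
    if rng.random() < 0.1:
        ratio = None                  # simulate a missing measurement
    rows.append([peak_area, ratio, 1.0])  # third column carries no information

train_rows, test_rows = split(rows, test_fraction=0.2, rng=rng)
params = fit_preprocessor(train_rows)
train_scaled = transform(train_rows, params)
test_scaled = transform(test_rows, params)
print(f"kept {len(params)} of {len(rows[0])} features")
```

After the transformation, both remaining features have zero mean and unit variance on the training set, so the large peak areas no longer dominate a distance calculation. The test rows are transformed with the same training-set parameters, which is why their mean and variance will generally differ slightly from 0 and 1.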

Successful implementations of ML in forensic toxicology already exist. The combination of machine learning and (high-resolution) mass spectrometry offers a synergy that can be harnessed to optimize workflows by detecting sample adulteration, to improve the detection of difficult analyte groups (e.g. synthetic cannabinoid receptor agonists, SCRAs), and to optimize the processing of high-dimensional data sets. This approach can help with even the most complex problems in our field, such as detecting the effects of sleepiness on the metabolome and establishing biomarkers of sleepiness.

© 2025 Published by Elsevier Masson SAS.

Vol. 37, No. 1S
pp. S14-S15, March 2025