Proceedings chapter
Open access
English

Cluster analysis of low-dimensional medical concept representations from Electronic Health Records

Published in: Health Information Science. HIS 2022, p. 313-324
Presented at: 11th International Conference on Health Information Science (HIS 2022), Online, 28–30 October 2022
Publisher: Cham: Springer Nature
Series
  • Lecture Notes in Computer Science; 13705
Online publication date: 2022-10-25
Abstract

The study of existing links among different types of medical concepts can support research on optimal pathways for the treatment of human diseases. Here, we present a clustering analysis of learned medical concept representations generated from MIMIC-IV, an open dataset of de-identified digital health records. Patient trajectory information was extracted in chronological order to generate more than 500k sequence-like data structures, which were fed to a word2vec model to automatically learn concept representations. As a result, we obtained concept embeddings that describe diagnoses, procedures, and medications in a continuous low-dimensional space. A quantitative evaluation shows the significant power of the extracted embeddings in predicting exact labels of diagnoses, procedures, and medications for a given patient trajectory, achieving top-10 and top-30 accuracy over 47% and 66%, respectively, for all the dimensions evaluated. Moreover, clustering analyses of medical concepts after dimensionality reduction with t-SNE and UMAP show that similar diagnoses (and procedures) are grouped together, matching the categories of ICD-10 codes. However, the distribution by categories is not as evident if PCA or SVD is employed, indicating that the relationships among concepts are highly non-linear. This highlights the importance of non-linear models, such as those provided by deep learning, to capture the complex relationships among medical concepts.
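To make two steps of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows how per-patient records might be ordered chronologically into sequence-like "sentences" of concept codes, and how a top-k accuracy such as the reported top-10 and top-30 figures is computed from ranked predictions. The record layout, the `DX:`/`MED:`/`PROC:` token prefixes, the toy values, and the function names `build_trajectories` and `top_k_accuracy` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_trajectories(events):
    """Group (patient_id, timestamp, concept_code) records into one
    chronologically ordered list of codes per patient."""
    by_patient = defaultdict(list)
    for patient_id, timestamp, code in events:
        by_patient[patient_id].append((timestamp, code))
    # sorted() is stable, so codes sharing a timestamp keep their input order.
    return {
        pid: [code for _, code in sorted(records, key=lambda r: r[0])]
        for pid, records in by_patient.items()
    }


def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label appears among the first k
    entries of the corresponding ranked prediction list."""
    if not true_labels:
        raise ValueError("no cases to evaluate")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy records: (patient_id, ISO date, concept token).
# ISO-formatted dates sort correctly as plain strings.
events = [
    ("p1", "2020-01-03", "PROC:toy_procedure"),
    ("p1", "2020-01-01", "DX:I10"),
    ("p2", "2020-02-01", "DX:E11"),
    ("p1", "2020-01-02", "MED:metoprolol"),
]
trajectories = build_trajectories(events)

# Hypothetical ranked predictions for two cases and their true labels.
ranked = [["a", "b", "c"], ["x", "y", "z"]]
truth = ["b", "z"]
top2 = top_k_accuracy(ranked, truth, k=2)
top3 = top_k_accuracy(ranked, truth, k=3)
```

Each list in `trajectories` plays the role of one sentence for a word2vec-style model; any implementation that accepts tokenized sentences could consume it, and which implementation and hyperparameters the authors used is not stated in this record.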

Keywords
  • Electronic health records
  • Patient trajectory
  • Embeddings
  • Clustering
  • Representation learning
Citation (ISO format)
JAUME SANTERO, Fernando et al. Cluster analysis of low-dimensional medical concept representations from Electronic Health Records. In: Health Information Science. HIS 2022. Online. Cham : Springer Nature, 2022. p. 313–324. (Lecture Notes in Computer Science) doi: 10.1007/978-3-031-20627-6_29
Main files (1)
Proceedings chapter (Accepted version)
Access level: Public
Identifiers
ISBN: 978-3-031-20627-6
276 views
40 downloads

Technical information

Created: 28.09.2022 10:16:00
First validation: 28.09.2022 10:16:00
Last updated: 16.03.2023 08:53:40
Status change: 16.03.2023 08:53:39
Last indexed: 01.02.2024 09:08:51
All rights reserved by Archive ouverte UNIGE and the University of Geneva