Scientific article
Open access
English

A Multilabel Approach to Portuguese Clinical Named Entity Recognition

Published in: Journal of Health Informatics, vol. 12, Special Issue (Número Especial), XVII Congresso Brasileiro de Informática em Saúde (CBIS 2020), p. 366-372
Publication date: 2020-12
Abstract

Objectives: Clinical Named Entity Recognition is a critical Natural Language Processing task, as it can support biomedical research and healthcare systems. While most extracted clinical entities are based on single-label concepts, entities that belong to more than one semantic category simultaneously are very common in the clinical domain. This work proposes BERT-based models to support multilabel clinical named entity recognition in the Portuguese language. Methods: For the experiment, we applied the Label Powerset method to the multilabel corpus SemClinBr. Results: We compare our results with a Conditional Random Fields baseline, reaching +2.1 in precision, +11.2 in recall, and +7.4 in F1 with a clinical-biomedical BERT model (BioBERTpt). Conclusion: We achieved higher results for both exact and partial metrics, contributing to the multilabel semantic processing of clinical narratives in Portuguese.
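
The Label Powerset method named in the abstract turns a multilabel problem into a single-label one by treating every distinct combination of labels as its own class. The following is a minimal sketch of that transformation only, not the authors' implementation: the function names and the label names ("Disorder", "Finding") are hypothetical and are not taken from the paper or from the SemClinBr tag set.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def fit_label_powerset(label_sets: Iterable[Set[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to each distinct label combination,
    in order of first appearance."""
    mapping: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in mapping:
            mapping[key] = len(mapping)
    return mapping


def encode(label_sets: Iterable[Set[str]], mapping: Dict[FrozenSet[str], int]) -> List[int]:
    """Replace each label set with its single powerset class id."""
    return [mapping[frozenset(labels)] for labels in label_sets]


def decode(class_ids: Iterable[int], mapping: Dict[FrozenSet[str], int]) -> List[Set[str]]:
    """Recover the original label sets from powerset class ids."""
    inverse = {class_id: labels for labels, class_id in mapping.items()}
    return [set(inverse[class_id]) for class_id in class_ids]


# Hypothetical per-token annotations; an empty set means "no entity".
token_labels = [
    {"Disorder"},
    {"Disorder", "Finding"},
    set(),
    {"Disorder"},
    {"Finding", "Disorder"},
]

mapping = fit_label_powerset(token_labels)
class_ids = encode(token_labels, mapping)
print(class_ids)  # [0, 1, 2, 0, 1]
```

After this step, any single-label sequence tagger can be trained on `class_ids`, and `decode` maps its predictions back to label sets. One known trade-off of this design is that label combinations never seen during fitting cannot be predicted.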

Keywords
  • Clinical Named Entity Recognition
  • Label Powerset
  • BERT
Affiliation: Not a UNIGE publication
Citation (ISO format)
DE SOUZA, João Vitor Andrioli et al. A Multilabel Approach to Portuguese Clinical Named Entity Recognition. In: Journal of Health Informatics, 2020, vol. 12, p. 366–372.
Main files (1)
Article (Published version)
Access level: Public
Identifiers
  • PID : unige:159572
ISSN of the journal: 2175-4411

Technical information

Creation: 01/17/2022 2:33:00 PM
First validation: 01/17/2022 2:33:00 PM
Update time: 03/16/2023 2:52:01 AM
Status update: 03/16/2023 2:52:00 AM
Last indexation: 05/06/2024 10:23:13 AM
All rights reserved by Archive ouverte UNIGE and the University of Geneva