Scientific article
English

Information Retrieval in an Infodemic: The Case of COVID-19 Publications

Published in: JMIR. Journal of medical internet research, vol. 23, no. 9, e30161
Publication date: 2021-09-17
First online date: 2021-09-17
Abstract

Background: The COVID-19 global health crisis has led to an exponential surge in published scientific literature. In an attempt to tackle the pandemic, extremely large COVID-19-related corpora are being created, sometimes containing inaccurate information, at a scale that is beyond human analysis.

Objective: In the context of searching for scientific evidence in the deluge of COVID-19-related literature, we present an information retrieval methodology for effective identification of relevant sources to answer biomedical queries posed using natural language.

Methods: Our multistage retrieval methodology combines probabilistic weighting models and reranking algorithms based on deep neural architectures to boost the ranking of relevant documents. COVID-19 queries are compared with documents by similarity, and a series of postprocessing methods is applied to the initial ranking list to improve the match between the query and the biomedical information source and boost the position of relevant documents.
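
The multistage idea in the Methods paragraph can be illustrated with a minimal Python sketch: a first stage scores documents with an Okapi BM25-style weighting, and a second stage reorders the top candidates. This is an illustration under stated assumptions, not the authors' implementation. The function names, the parameter values (k1, b, first_stage_k), and the token-overlap reranker are all hypothetical; the overlap scorer merely stands in for a deep neural reranker, which the abstract does not specify.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for this sketch)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """First stage: score every document against the query with BM25.

    k1 and b are common default values, assumed here, not taken from the article.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def overlap_reranker(query, doc):
    """Hypothetical stand-in for a neural reranker: Jaccard token overlap."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def multistage_search(query, docs, first_stage_k=3, reranker=overlap_reranker):
    """Retrieve the top first_stage_k documents with BM25, then rerank them."""
    first = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: first[i], reverse=True)
    candidates = candidates[:first_stage_k]
    return sorted(candidates, key=lambda i: reranker(query, docs[i]), reverse=True)


docs = [
    "masks reduce coronavirus transmission",
    "vaccine trial results for influenza",
    "coronavirus transmission in hospitals and masks use among staff and patients",
    "stock market report",
]
ranking = multistage_search("coronavirus transmission masks", docs)
```

Here `ranking` holds document indices, best match first. In a real pipeline the second-stage scorer would be replaced by a trained neural model, and further postprocessing steps could be chained after it in the same way.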

Results: The methodology was evaluated in the context of the TREC-COVID challenge, achieving competitive results with the top-ranking teams participating in the competition. In particular, the combination of bag-of-words and deep neural language models significantly outperformed an Okapi Best Match 25-based baseline, retrieving, on average, 83% of relevant documents in the top 20.

Conclusions: These results indicate that multistage retrieval supported by deep learning could enhance identification of literature for COVID-19-related questions posed using natural language.

Keywords
  • COVID-19
  • Coronavirus
  • Deep learning
  • Infodemic
  • Infodemiology
  • Information retrieval
  • Literature
  • Multistage retrieval
  • Neural search
  • Online information
  • Algorithms
  • Humans
  • Information Storage and Retrieval
  • Language
  • SARS-CoV-2
Funding
  • Innosuisse - [41013.1 IP-ICT]
  • European Commission - Common Infrastructure for National Cohorts in Europe, Canada, and Africa [825775]
Citation (ISO format)
TEODORO, Douglas et al. Information Retrieval in an Infodemic: The Case of COVID-19 Publications. In: JMIR. Journal of medical internet research, 2021, vol. 23, n° 9, p. e30161. doi: 10.2196/30161
Main files (1)
Article (Published version)
Identifiers
Journal ISSN: 1438-8871

Technical information

Creation: 13/10/2021 14:42:00
First validation: 13/10/2021 14:42:00
Update time: 16/03/2023 03:20:58
Status update: 16/03/2023 03:20:57
Last indexation: 01/11/2024 01:28:06
All rights reserved by Archive ouverte UNIGE and the University of Geneva