Scientific article
Open access
English

Online health search via multi-dimensional information quality assessment based on deep language models

Published in: JMIR AI, vol. 3, e42630
Publication date: 2024-05-02
First online date: 2024-01-15
Abstract

Background: Widespread misinformation in web resources can have serious consequences for individuals seeking health advice. Despite this, information retrieval models often rank results using only query-document relevance.

Objective: We investigate a multidimensional information quality retrieval model based on deep learning to enhance the effectiveness of online health care information search results.

Methods: In this study, we simulated online health information search scenarios with a topic set of 32 different health-related inquiries and a corpus containing 1 billion web documents from the April 2019 snapshot of Common Crawl. Using state-of-the-art pretrained language models, we assessed the quality of the retrieved documents according to their usefulness, supportiveness, and credibility dimensions for a given search query on 6030 human-annotated query-document pairs. We evaluated this approach using transfer learning and more specific domain adaptation techniques.

Results: In the transfer learning setting, the usefulness model provided the largest distinction between help- and harm-compatible documents, with a difference of +5.6%, leading to a majority of helpful documents in the top 10 retrieved. The supportiveness model achieved the best harm compatibility (+2.4%), while the combination of usefulness, supportiveness, and credibility models achieved the largest distinction between help- and harm-compatibility on helpful topics (+16.9%). In the domain adaptation setting, the linear combination of different models showed robust performance, with help-harm compatibility above +4.4% for all dimensions and going as high as +6.8%.

Conclusions: These results suggest that integrating automatic ranking models created for specific information quality dimensions can increase the effectiveness of health-related information retrieval. Thus, our approach could be used to enhance searches made by individuals seeking online health information.
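
The linear combination of quality dimensions mentioned in the results can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ScoredDoc`, `combined_score`, and `rerank`, the weight values, and the assumption that each per-dimension model outputs a score where higher is better are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredDoc:
    """A retrieved document with one score per quality dimension (hypothetical layout)."""
    doc_id: str
    usefulness: float
    supportiveness: float
    credibility: float


def combined_score(doc: ScoredDoc, weights: dict) -> float:
    """Weighted sum of the three per-dimension scores."""
    return (
        weights["usefulness"] * doc.usefulness
        + weights["supportiveness"] * doc.supportiveness
        + weights["credibility"] * doc.credibility
    )


def rerank(docs: list, weights: dict, k: int = 10) -> list:
    """Return the top-k documents, ordered by descending combined score."""
    ranked = sorted(docs, key=lambda d: combined_score(d, weights), reverse=True)
    return ranked[:k]


# Illustrative weights only; in practice they would be tuned on annotated data.
example_weights = {"usefulness": 0.5, "supportiveness": 0.3, "credibility": 0.2}
```

Under these assumptions, a document that is moderately useful but strongly supportive and credible can outrank one that is highly useful but weak on the other two dimensions, which is the intended effect of ranking on more than relevance alone.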

Keywords
  • Deep learning
  • Health misinformation
  • Infodemic
  • Information retrieval
  • Language model
  • Transfer learning
Funding
  • Innosuisse - [101.466 IP-ICT]
  • Innosuisse - [55441.1 IP-ICT]
Citation (ISO format)
ZHANG, Boya et al. Online health search via multi-dimensional information quality assessment based on deep language models. In: JMIR AI, 2024, vol. 3, p. e42630. doi: 10.2196/42630
Identifiers
ISSN of the journal: 2817-1705

Technical information

Creation: 01/18/2024 3:14:21 PM
First validation: 06/21/2024 10:01:24 AM
Update time: 07/04/2024 2:14:47 PM
Status update: 07/04/2024 2:14:47 PM
Last indexation: 07/04/2024 2:14:52 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva