Scientific article
English

Learning from imbalanced data in surveillance of nosocomial infection

Published in: Artificial intelligence in medicine, vol. 37, no. 1, p. 7-18
Publication date: 2006
Abstract

OBJECTIVE: An important problem that arises in hospitals is the monitoring and detection of nosocomial, or hospital-acquired, infections (NIs). This paper describes a retrospective analysis of a prevalence survey of NIs conducted at the Geneva University Hospital. Our goal is to identify patients with one or more NIs on the basis of clinical and other data collected during the survey.

METHODS AND MATERIAL: Standard surveillance strategies are time-consuming and cannot be applied hospital-wide; alternative methods are required. When NI detection is viewed as a classification task, the main difficulty lies in the significant imbalance between positive, or infected (11%), and negative (89%) cases. To remedy class imbalance, we explore two distinct avenues: (1) a new re-sampling approach in which both over-sampling of rare positives and under-sampling of the non-infected majority rely on synthetic cases (prototypes) generated via class-specific sub-clustering, and (2) a support vector algorithm in which asymmetrical margins are tuned to improve recognition of rare positive cases.

RESULTS AND CONCLUSION: Experiments have shown both approaches to be effective for the NI detection problem. Our novel re-sampling strategies perform remarkably better than classical random re-sampling. However, they are outperformed by asymmetrical soft margin support vector machines, which attained a sensitivity rate of 92%, significantly better than the highest sensitivity (87%) obtained via prototype-based re-sampling.
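To make avenue (1) of the abstract more concrete, the following is a minimal sketch of prototype-based re-sampling, not the authors' implementation. The abstract only says that prototypes come from "class-specific sub-clustering"; using plain k-means for that step, keeping the real positives alongside their prototypes, and replacing the negatives entirely by prototypes are all assumptions made here for illustration. The function names `kmeans_prototypes` and `rebalance` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def _mean(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def _sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_prototypes(points: List[Point], k: int, iters: int = 20) -> List[List[float]]:
    """Cluster one class with Lloyd's k-means and return the k centroids.

    Assumption: k-means stands in for the 'sub-clustering' step, which the
    abstract does not specify. Initialisation is deterministic (k points
    taken at even intervals from the input order).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    step = len(points) / k
    centroids = [list(points[int(i * step)]) for i in range(k)]
    for _ in range(iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: _sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [_mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def rebalance(
    positives: List[Point],
    negatives: List[Point],
    n_pos_prototypes: int,
    n_neg_prototypes: int,
) -> Tuple[List[List[float]], List[int]]:
    """Return a re-sampled training set (X, y) with labels 1 = infected, 0 = not.

    Over-sampling (assumed form): real positives plus synthetic positive prototypes.
    Under-sampling (assumed form): negatives replaced by their prototypes.
    """
    new_pos = [list(p) for p in positives] + kmeans_prototypes(positives, n_pos_prototypes)
    new_neg = kmeans_prototypes(negatives, n_neg_prototypes)
    X = new_pos + new_neg
    y = [1] * len(new_pos) + [0] * len(new_neg)
    return X, y
```

Called as `rebalance(positives, negatives, n_pos_prototypes, n_neg_prototypes)`, the sketch returns a feature list and a label list that could be passed to any classifier. The prototype counts control the resulting class ratio and would need tuning on real data; the abstract does not report the values used in the study.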

Keywords
  • Algorithms
  • Artificial Intelligence
  • Cluster Analysis
  • Cross Infection/epidemiology
  • Cross-Sectional Studies
  • Hospitals, University
  • Humans
  • Infection Control
  • Models, Statistical
  • Population Surveillance/methods
  • ROC Curve
  • Retrospective Studies
  • Switzerland/epidemiology
Citation (ISO format)
COHEN, Gilles et al. Learning from imbalanced data in surveillance of nosocomial infection. In: Artificial intelligence in medicine, 2006, vol. 37, n° 1, p. 7–18. doi: 10.1016/j.artmed.2005.03.002
Main files (1)
Article (Published version)
Access level: Restricted
Identifiers
ISSN of the journal: 0933-3657

Technical information

Creation: 21.06.2010 10:23:35
First validation: 21.06.2010 10:23:35
Update time: 14.03.2023 15:43:05
Status update: 14.03.2023 15:43:05
Last indexation: 15.01.2024 20:09:02
All rights reserved by Archive ouverte UNIGE and the University of Geneva