Scientific article
Open access

Impact of feature harmonization on radiogenomics analysis: Prediction of EGFR and KRAS mutations from non-small cell lung cancer PET/CT images

Published in: Computers in biology and medicine, vol. 142, 105230
Publication date: 2022-03
First online date: 2022-01-11

Objective: To investigate the impact of harmonization on the performance of CT, PET, and fused PET/CT radiomic features for predicting the mutation status of the epidermal growth factor receptor (EGFR) and Kirsten rat sarcoma viral oncogene (KRAS) genes in non-small cell lung cancer (NSCLC) patients.

Methods: Radiomic features were extracted from tumors delineated on CT, PET, and wavelet-fused PET/CT images obtained from 136 histologically proven NSCLC patients. Univariate and multivariate predictive models were developed using radiomic features before and after ComBat harmonization to predict EGFR and KRAS mutation status. Multivariate models were built using minimum redundancy maximum relevance feature selection and a random forest classifier. Patient datasets were split 70%/30% for training and testing, respectively, and the procedure was repeated 10 times. The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity were used to assess model performance. The performance of the univariate and multivariate models before and after ComBat harmonization was compared using statistical analyses.
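The repeated 70/30 evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it scores a single feature directly (a stand-in for a univariate model), uses a stratified split (an assumption; the abstract does not say whether splits were stratified), and runs on synthetic data. The function names `auc`, `stratified_split`, and `repeated_holdout_auc` are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC via the Mann-Whitney identity: the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx) with roughly equal class proportions.
    Stratification is an assumption made for this sketch."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def repeated_holdout_auc(feature, labels, n_repeats=10, test_fraction=0.3, seed=0):
    """Repeat the train/test split n_repeats times and collect test AUCs.
    The only 'training' here is choosing the feature's direction on the
    training portion; a real model would be fitted at this step."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        train, test = stratified_split(labels, test_fraction, rng)
        train_auc = auc([feature[i] for i in train], [labels[i] for i in train])
        sign = 1.0 if train_auc >= 0.5 else -1.0
        aucs.append(auc([sign * feature[i] for i in test], [labels[i] for i in test]))
    return aucs


# Synthetic, purely illustrative data (not the study's data).
demo_rng = random.Random(42)
labels = [1] * 40 + [0] * 60
feature = [demo_rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]
aucs = repeated_holdout_auc(feature, labels)
print(f"mean test AUC over {len(aucs)} repeats: {sum(aucs) / len(aucs):.2f}")
```

Running the same loop on a feature before and after harmonization would give two sets of 10 AUCs that can then be compared statistically.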

Results: In univariate modeling, the performance of most features improved significantly for EGFR prediction after harmonization, whereas most features showed no significant difference in performance for KRAS prediction. The average AUCs of all multivariate predictive models for both EGFR and KRAS improved significantly (q-value < 0.05) following ComBat harmonization. The range of mean AUCs increased following harmonization from 0.87-0.90 to 0.92-0.94 for EGFR and from 0.85-0.90 to 0.91-0.94 for KRAS. The highest performance was achieved by the harmonized F_R0.66_W0.75 model, with AUCs of 0.94 and 0.93 for EGFR and KRAS, respectively.
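The q-values reported above are p-values adjusted for multiple comparisons. The abstract does not name the correction used, so the sketch below assumes the Benjamini-Hochberg false discovery rate procedure, a common choice; the function name `bh_qvalues` and the example p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order.
    Assumption: the source does not state which correction was applied."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks (keeps q monotone).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical p-values from four model comparisons.
example_q = bh_qvalues([0.01, 0.04, 0.03, 0.20])
significant = [q < 0.05 for q in example_q]
```

A comparison is then called significant when its q-value falls below the chosen threshold, 0.05 in the text above.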

Conclusion: In univariate modeling, ComBat harmonization generally benefited features more for EGFR than for KRAS status prediction, but its effect was feature-dependent, so no systematic effect was observed. In multivariate modeling, ComBat harmonization significantly improved the performance of all radiomics models for predicting EGFR and KRAS mutation status in lung cancer patients. Thus, by eliminating the batch effect in multi-centric radiomic feature sets, harmonization is a promising tool for developing robust and reproducible radiomics models from large and heterogeneous datasets.
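To illustrate what removing a batch effect from a feature means, the sketch below aligns each batch's mean and standard deviation to the pooled values. This is a deliberately simplified location-scale adjustment, not ComBat itself: ComBat additionally shrinks the per-batch estimates with empirical Bayes and can preserve covariates of interest. The function name `align_batches` and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def align_batches(values, batches):
    """Rescale one feature so every batch shares the pooled mean and SD.
    Simplified stand-in for harmonization; no empirical Bayes, no covariates."""
    pooled_mean = mean(values)
    pooled_sd = pstdev(values)
    out = list(values)
    for b in set(batches):
        idx = [i for i, x in enumerate(batches) if x == b]
        batch_vals = [values[i] for i in idx]
        b_mean = mean(batch_vals)
        b_sd = pstdev(batch_vals)
        # Guard against a constant batch, where the SD is zero.
        scale = pooled_sd / b_sd if b_sd > 0 else 1.0
        for i in idx:
            out[i] = (values[i] - b_mean) * scale + pooled_mean
    return out


# Toy feature measured at two hypothetical centers with an offset between them.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batches = ["A", "A", "A", "B", "B", "B"]
harmonized = align_batches(values, batches)
```

After the adjustment, the center-specific offset no longer separates the two batches, which is the property a downstream classifier relies on when data from several centers are pooled.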

  • Artificial intelligence
  • Computed tomography
  • Harmonization
  • Imaging genomics
  • Non-small cell lung cancer
  • Positron emission tomography
  • Carcinoma, Non-Small-Cell Lung / diagnostic imaging
  • Carcinoma, Non-Small-Cell Lung / genetics
  • ErbB Receptors / genetics
  • Humans
  • Lung Neoplasms / diagnostic imaging
  • Lung Neoplasms / genetics
  • Lung Neoplasms / pathology
  • Mutation / genetics
  • Positron Emission Tomography Computed Tomography / methods
  • Proto-Oncogene Proteins p21(ras) / genetics
Citation (ISO format)
SHIRI LORD, Isaac et al. Impact of feature harmonization on radiogenomics analysis: Prediction of EGFR and KRAS mutations from non-small cell lung cancer PET/CT images. In: Computers in biology and medicine, 2022, vol. 142, p. 105230. doi: 10.1016/j.compbiomed.2022.105230
ISSN of the journal: 0010-4825

Technical information

Creation: 06/16/2022 7:59:00 AM
First validation: 06/16/2022 7:59:00 AM
Update time: 03/16/2023 8:53:53 AM
Status update: 03/16/2023 8:53:51 AM
Last indexation: 05/06/2024 12:00:04 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva