UNIGE document Scientific Article

Use of support vector machines for disease risk prediction in genome-wide association studies: concerns and opportunities

Mittag, Florian
Büchel, Finja
Saad, Mohamad
Jahn, Andreas
Schulte, Claudia
Bochdanovits, Zoltan
Simón-Sánchez, Javier
Nalls, Mike A
Collaboration with: Pollak, Pierre
Published in Human mutation. 2012, vol. 33, no. 12, p. 1708-18
Abstract The success of genome-wide association studies (GWAS) in deciphering the genetic architecture of complex diseases has raised the expectation that individual disease risk can also be quantified from that architecture. So far, disease risk prediction based on top-validated single-nucleotide polymorphisms (SNPs) has shown little predictive value. Here, we applied a support vector machine (SVM) to Parkinson disease (PD) and type 1 diabetes (T1D) to show that, apart from the magnitude of the effect sizes of risk variants, the heritability of the disease also plays an important role in disease risk prediction. Furthermore, we performed a simulation study to show the role of uncommon (frequency 1-5%) as well as rare (frequency <1%) variants in the etiology of complex diseases. Using a cross-validation model, we achieved predictions with an area under the receiver operating characteristic curve (AUC) of ~0.88 for T1D, highlighting its strong heritable component (~90%). This is in contrast to PD, for which we were unable to achieve a satisfactory prediction (AUC ~0.56; heritability ~38%). Our simulations showed that simultaneous inclusion of uncommon and rare variants in GWAS would eventually lead to feasible disease risk prediction for complex diseases such as PD. The software used is available at http://www.ra.cs.uni-tuebingen.de/software/MACLEAPS/.
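To help read the AUC values quoted in the abstract, here is a minimal Python sketch of how an area under the ROC curve can be computed from classifier scores. It is not the authors' software or pipeline: the function name roc_auc and the toy labels and scores are hypothetical, chosen only for illustration. The AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, so 0.5 corresponds to chance and 1.0 to perfect separation.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of case/control pairs ranked correctly.

    labels: 1 for a case, 0 for a control.
    scores: classifier outputs, higher meaning "more likely a case".
    Tied pairs count as half a correct ranking.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical toy data (not taken from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

# 8 of the 9 case/control pairs are ranked correctly, so the AUC is 8/9.
auc = roc_auc(labels, scores)
```

In a cross-validation setting, a value like this would be computed on each held-out fold and then averaged; the pairwise loop is quadratic and is meant for clarity, not for large cohorts.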
Keywords Area Under Curve; Bipolar Disorder/diagnosis/genetics; Case-Control Studies; Computer Simulation; Diabetes Mellitus, Type 1/diagnosis/genetics; Diabetes Mellitus, Type 2/diagnosis/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Humans; Models, Genetic; Parkinson Disease/diagnosis/genetics; Polymorphism, Single Nucleotide; ROC Curve; Risk; Software; Support Vector Machines
PMID: 22777693
Full text
Article (Published version) (376 Kb) - document accessible for UNIGE members only
Research group Maladie de Parkinson (911)
(ISO format)
MITTAG, Florian et al. Use of support vector machines for disease risk prediction in genome-wide association studies: concerns and opportunities. In: Human mutation, 2012, vol. 33, n° 12, p. 1708-18. doi: 10.1002/humu.22161 https://archive-ouverte.unige.ch/unige:45200

Deposited on : 2015-01-14
