fr
Thèse
Accès libre
Anglais

Distance and kernel based learning over composite representations

Contributeurs/tricesWoznica, Adam
Date de soutenance2008-09-03
Résumé

The goal of this dissertation is to examine various aspects of the distance- and kernel-based learning paradigms applied in relational settings. We start by introducing a multi-relational representation formalism, at the core of which lie the concepts of tuples, sets and lists, which we implemented over the relational algebra language. By combining these data types we are able to model a variety of composite objects, such as trees and graphs. We proceed with the definition of various distance and kernel operators over the representation formalism. We are not constrained to a specific operator, instead, we are free to assign an operator, selected from a set of available operators, to a particular data type. The final operator over the composite objects is given as a recursive combination of operators assigned to the sub-structures which constitute the objects. We focus on mapping-based operators defined over sets. Next, we propose three new and flexible families of set kernels where the overall similarity is based on mappings between the elements of the two sets. These kernels differ from most of the existing set kernels which are based on averaging of the similarities of all the elements of the two sets. Finally, we propose a general framework for adaptively selecting representations of complex data and/or operators over representations. More precisely, our framework assumes a set of predefined representations and operators which are then combined in an optimal way. We focus only on the distance-based paradigm and we exploit previous work on metric learning over vectorial data. We use the optimal combination of different graph decompositions into substructures of specific types to define adaptive graph kernels which address the limitations of the existing kernels over these complex structures. We undertook extensive comparisons of our distance- and kernel-based relational system, which included: an empirical evaluation of various composite distances, with the focus on comparison of set distances based on mappings; an empirical evaluation of different complex kernels, with the focus on comparison of set kernels based on averaging and set kernels based on various mappings; an empirical evaluation of our adaptive framework for the tasks of combination of set distances, and combination of various graph decompositions into substructures of various types. The empirical evaluation of the system has shown that the proposed distance- and kernel-based paradigms are effective over a number of relational benchmark datasets. Additionally, in all of the examined relational problems we achieved state-of-the-art results which are better than the best results obtained using other relational systems.

eng
Mots-clés
  • Machine learning
  • Data mining
  • Complex structures
  • Graphs
  • Sets
  • Lists
  • Distance measures
  • Kernels
  • Support vector machines
  • SVM
  • Instance based learning
  • K nearest neighbor
  • KNN
  • Automatic classification of chemical molecules
  • Bioinformatics
Citation (format ISO)
WOZNICA, Adam. Distance and kernel based learning over composite representations. 2008. doi: 10.13097/archive-ouverte/unige:5483
Fichiers principaux (1)
Thesis
accessLevelPublic
Identifiants
718vues
1915téléchargements

Informations techniques

Création17.03.2010 16:31:00
Première validation17.03.2010 16:31:00
Heure de mise à jour14.03.2023 15:25:19
Changement de statut14.03.2023 15:25:19
Dernière indexation29.01.2024 18:55:26
All rights reserved by Archive ouverte UNIGE and the University of GenevaunigeBlack