UNIGE document Doctoral Thesis
previous document  unige:5483  next document
add to browser collection

Distance and kernel based learning over composite representations

Defense Thèse de doctorat : Univ. Genève, 2008 - Sc. 4013 - 2008/09/03
Abstract The goal of this dissertation is to examine various aspects of the distance- and kernel-based learning paradigms applied in relational settings. We start by introducing a multi-relational representation formalism, at the core of which lie the concepts of tuples, sets and lists, which we implemented over the relational algebra language. By combining these data types we are able to model a variety of composite objects, such as trees and graphs. We proceed with the definition of various distance and kernel operators over the representation formalism. We are not constrained to a specific operator, instead, we are free to assign an operator, selected from a set of available operators, to a particular data type. The final operator over the composite objects is given as a recursive combination of operators assigned to the sub-structures which constitute the objects. We focus on mapping-based operators defined over sets. Next, we propose three new and flexible families of set kernels where the overall similarity is based on mappings between the elements of the two sets. These kernels differ from most of the existing set kernels which are based on averaging of the similarities of all the elements of the two sets. Finally, we propose a general framework for adaptively selecting representations of complex data and/or operators over representations. More precisely, our framework assumes a set of predefined representations and operators which are then combined in an optimal way. We focus only on the distance-based paradigm and we exploit previous work on metric learning over vectorial data. We use the optimal combination of different graph decompositions into substructures of specific types to define adaptive graph kernels which address the limitations of the existing kernels over these complex structures. We undertook extensive comparisons of our distance- and kernel-based relational system, which included: an empirical evaluation of various composite distances, with the focus on comparison of set distances based on mappings; an empirical evaluation of different complex kernels, with the focus on comparison of set kernels based on averaging and set kernels based on various mappings; an empirical evaluation of our adaptive framework for the tasks of combination of set distances, and combination of various graph decompositions into substructures of various types. The empirical evaluation of the system has shown that the proposed distance- and kernel-based paradigms are effective over a number of relational benchmark datasets. Additionally, in all of the examined relational problems we achieved state-of-the-art results which are better than the best results obtained using other relational systems.
Keywords Machine learningData miningComplex structuresGraphsSetsListsDistance measuresKernelsSupport vector machinesSVMInstance based learningK nearest neighborKNNAutomatic classification of chemical moleculesBioinformatics
URN: urn:nbn:ch:unige-54832
Full text
Thesis (2.9 MB) - public document Free access
(ISO format)
WOZNICA, Adam. Distance and kernel based learning over composite representations. Université de Genève. Thèse, 2008. https://archive-ouverte.unige.ch/unige:5483

265 hits



Deposited on : 2010-03-18

Export document
Format :
Citation style :