Archive ouverte UNIGE | latest documents for author 'Ke Sun'
https://archive-ouverte.unige.ch/
Latest objects deposited in the Archive ouverte UNIGE for author 'Ke Sun'.

On Hölder Projective Divergences
https://archive-ouverte.unige.ch/unige:103033
We describe a framework for building distances by measuring the tightness of inequalities, and introduce the notions of proper statistical divergences and improper pseudo-divergences. We then consider the ordinary and reverse Hölder inequalities and present two novel classes of Hölder divergences and pseudo-divergences, both of which encapsulate the Cauchy–Schwarz divergence as a special case. We report closed-form formulas for these statistical dissimilarities when the distributions belong to the same exponential family whose natural parameter space is a cone (e.g., multivariate Gaussians) or affine (e.g., categorical distributions). These new classes of Hölder distances are invariant to rescaling and therefore do not require distributions to be normalized. Finally, we show how to compute statistical Hölder centroids with respect to these divergences and carry out toy center-based clustering experiments on a set of Gaussian distributions, demonstrating empirically that symmetrized Hölder divergences outperform the symmetric Cauchy–Schwarz divergence.
Tue, 20 Mar 2018 11:04:07 +0100

Information geometry and data manifold representations
https://archive-ouverte.unige.ch/unige:80017
Information geometry studies measurements of intrinsic information based on the mathematical discipline of differential geometry. This dissertation builds several new connections between information geometry and machine learning. A Riemannian geometry of a representation manifold R is studied, where each point is a pairwise (dis-)similarity matrix corresponding to a set of objects.
A Riemannian metric of R is derived in closed form based on the Fisher information metric; a distinctive feature of this metric is its emphasis on local information. Information-geometric measurements for manifold learning are then defined. Statistical mixture learning of the data manifold is viewed as a multi-body problem on a statistical manifold M, where each body is a mixture component, i.e., a point on M. Principles based on geometric compactness lead to effective regularization for both kernel density estimation and parametric mixture learning. New insights are given on the connections between information geometry, mixture learning, and minimum description length.
Mon, 25 Jan 2016 14:34:38 +0100

An Information Geometry of Statistical Manifold Learning
https://archive-ouverte.unige.ch/unige:73194
Manifold learning seeks low-dimensional representations of high-dimensional data. The main approaches explore the geometry of an input data space and an output embedding space. We develop a manifold learning theory in a hypothesis space consisting of models, where a model is a specific instance of a collection of points, e.g., the input data collectively or the output embedding collectively. The semi-Riemannian metric of this hypothesis space is uniquely derived in closed form based on the information geometry of probability distributions. In this view, manifold learning is interpreted as a trajectory of intermediate models. The volume of a continuous region reveals an amount of information, and it can be measured to define model complexity and embedding quality. This provides a deep, unified perspective on manifold learning theory.
Mon, 15 Jun 2015 13:48:42 +0200

Information geometry and minimum description length networks
https://archive-ouverte.unige.ch/unige:73193
We study parametric unsupervised mixture learning.
We measure the loss of intrinsic information from the observations to complex mixture models, and then to simple mixture models. We present a geometric picture in which all these representations are regarded as free points in the space of probability distributions. Based on minimum description length, we derive a simple geometric principle for learning all these models together. We present a new learning machine with supporting theory, algorithms, and simulations.
Mon, 15 Jun 2015 13:47:55 +0200

Sparsity on Statistical Simplexes and Diversity in Social Ranking
https://archive-ouverte.unige.ch/unige:73192
Sparsity in R^m has been widely explored in machine learning. We study sparsity on a statistical simplex consisting of all categorical distributions. This setting differs from R^m because such a simplex is a Riemannian manifold, a curved space. A learner with sparsity constraints is likely to fall onto the simplex's low-dimensional boundaries. We present a novel analysis of the statistical simplex as a manifold with boundary. The main contribution is an explicit view of the learning dynamics between high-dimensional models in the interior of the simplex and low-dimensional models on its boundaries. We prove the differentiability of the cost function, derive the natural gradient with respect to the Riemannian structure, and establish convexity around the singular regions. We uncover an interesting relationship with L1 regularization. We apply the proposed technique to social network analysis: given a directed graph, the task is to rank a subset of influencer nodes, where sparsity means that the top-ranked nodes should present diversity in the sense of minimizing influence overlap. We present a ranking algorithm based on the natural gradient that scales to graph datasets with millions of nodes.
On large real-world networks, the top-ranked nodes are the most informative among those selected by several commonly used techniques.
Mon, 15 Jun 2015 13:46:39 +0200

Information Geometric Density Estimation
https://archive-ouverte.unige.ch/unige:73191
We investigate kernel density estimation in which the kernel function varies from point to point. Density estimation in the input space amounts to finding a set of coordinates on a statistical manifold. This novel perspective combines efforts from information geometry and machine learning to spawn a family of density estimators. We present example models with simulations and discuss the principle and theory of such density estimation.
Mon, 15 Jun 2015 13:45:15 +0200
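The last abstract concerns kernel density estimation with a kernel that varies from point to point. As a minimal sketch of that general idea (not the paper's actual estimator): a sample-point adaptive Gaussian KDE in one dimension, where each data point gets its own bandwidth set by the distance to its k-th nearest neighbour. The function name and the bandwidth rule are illustrative assumptions.

```python
import numpy as np

def adaptive_kde(data, query, k=3):
    """Sample-point adaptive Gaussian KDE in 1-D.

    Each data point x_i carries its own bandwidth h_i, chosen here as
    the distance to its k-th nearest neighbour, so the kernel shape
    varies from point to point (wide in sparse regions, narrow in
    dense ones)."""
    data = np.asarray(data, dtype=float)
    # Per-point bandwidths: sorted pairwise distances; column 0 is the
    # zero self-distance, so column k is the k-th nearest neighbour.
    dists = np.sort(np.abs(data[:, None] - data[None, :]), axis=1)
    h = dists[:, k]
    # Density at each query point: average of per-point normalized
    # Gaussian kernels, each with its own bandwidth h_i.
    q = np.asarray(query, dtype=float)[:, None]
    kernels = np.exp(-0.5 * ((q - data[None, :]) / h[None, :]) ** 2)
    kernels /= h[None, :] * np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1)
```

Because the estimate is an average of normalized kernels, it remains a proper density (non-negative, unit total mass) while adapting its smoothness to the local sample geometry.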