Improving the capabilities of Variational Autoencoder Models by exploring their latent space

Lavda, Frantzeska

doi:10.13097/archive-ouverte/unige:178589

A fundamental goal in the development of machine learning is to build systems that can mimic the capabilities of humans and animals. This drives the field towards the creation of algorithms and architectures capable of mastering a vast array of tasks, from basic pattern recognition to complex decision-making under uncertain conditions and drug design. Recent advances in deep learning have shown that neural networks can make significant progress by using large amounts of data and computing power. Over the last decade, deep generative models have achieved results in fields such as computer vision and natural language processing that we could not have imagined two decades ago. These achievements highlight the potential of deep learning and underscore the ongoing effort to narrow the gap between human intelligence and machine intelligence. This thesis delves into the advancement of deep generative modeling, focusing on Variational Autoencoders (VAEs), to tackle significant challenges such as out-of-distribution (OOD) generation, catastrophic forgetting, and the learning of multi-modal probabilistic structures. Inspired by the human cognitive ability to learn from minimal observations and adapt to new environments, our work seeks to instill similar capabilities in machine learning models. Through three main contributions, we address limitations of current generative modeling approaches and propose solutions to improve their performance. First, we explore the ability of VAEs to perform OOD conditional generation. Conditional generation is already a challenging task, because the model may ignore the conditioning information; our research addresses a more complex task.
Just as the human brain can understand and produce new combinations of familiar elements, we develop a novel framework capable of generating data with combinations of desired property values that are not included in the training data. Our method, which leverages conditional VAEs with a back-translation mechanism, can handle a diverse range of input–attribute pairs that may not be present in the training data, thus enhancing its capability to handle OOD data. Moreover, the back-translation procedure preserves the content of the input data while manipulating their attribute values, enabling style transfer. Second, we examine another challenging task for machine learning, namely continual classification learning. We tackle this challenge by introducing a joint generative model approach that naturally combines a generative model with a classifier in the latent space, and we rely on the joint generative model to replicate the data distribution of previously seen tasks together with the corresponding labels. Finally, motivated by humans’ ability to understand and process complex, heterogeneous information, we study a limitation of VAEs: their inability to generate samples from the individual modalities of data originating from mixture distributions. To address this, we propose a two-level hierarchical latent variable model that introduces both continuous and categorical latent variables, thereby offering a richer representation of the data. By integrating a more flexible variational posterior and an informative conditional prior that mirrors the same structure, our method substantially improves the model’s capacity for capturing and generating complex probabilistic structures.

Archive ouverte UNIGE
