Authors:
Babak Solhjoo and Emanuele Rodolà
Affiliation:
Department of Computer Science, Sapienza University of Rome, Via Salaria 113, Rome, Italy
Keyword(s):
Variational Auto Encoders, Artificial Intelligence, Orthogonal Latent Code, Disentanglement, Hybrid, Inter-Domain Signal Transfer, Image, Voice.
Abstract:
Auto Encoders are deep neural networks that map inputs expressed in a high-dimensional space to latent codes in a lower-dimensional space. These latent codes are learned by training the network to reproduce its inputs at the output while a low-dimensional latent space limits how much information can flow through the network (Bank et al., 2020). Variational Auto Encoders pursue a similar objective but generate a distribution over latent codes instead of deterministic latent codes (Cosmo et al., 2020). This work focuses on generating semi-orthogonal variational latent codes for inputs of different source types, such as voice, image, and text, that represent the same objects. Its novelty lies in aiming for unified variational latent codes across different manifestations of the same objects in the physical world by means of orthogonal latent codes. To achieve this objective, a specific loss function is introduced that generates semi-orthogonal, variational latent codes for different objects. These orthogonal codes are then exploited to map different manifestations of the same object to each other and to convert a manifestation from one domain to another.