Advances in Deep Learning: Towards Robustness, Multimodality, and Zero-Shot Abilities

Open access
Author
Date: 2024
Type: Doctoral Thesis
ETH Bibliography: yes
Abstract
This thesis addresses challenges in three critical aspects of deep learning within the area of Artificial Intelligence (AI): robustness, multimodality, and zero-shot abilities. Throughout our studies, we show that despite remarkable progress on various downstream applications, deep learning models remain vulnerable to adversarial attacks, which can pose safety issues when models are deployed in real-world applications. We also demonstrate that self-supervised learning techniques can improve the robustness of deep learning models without touching the labels of the input data.
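The abstract does not spell out which attacks are meant; as a minimal illustration of the vulnerability in question, here is a sketch of the classic fast gradient sign method (FGSM, Goodfellow et al., 2015) in PyTorch, one standard way adversarial perturbations are generated. The function name and epsilon value are illustrative assumptions, not the thesis's code.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """One-step adversarial perturbation: move each input pixel by
    +/- eps in the direction that increases the classification loss.
    (Illustrative sketch; eps and names are assumptions.)"""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Signed-gradient step, then clip back to the valid image range.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```

A model that misclassifies such minimally perturbed inputs, while a human sees no change, is what the robustness chapters aim to harden.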
Next, we depart from robustness to explore multimodal learning. We leverage a Transformer-based encoder-decoder model to generate commonsense predictions from multimodal inputs of images and text. This is achieved by our novel pretraining task, which injects commonsense knowledge from an external knowledge graph into the model. We also study 3D reconstruction with Transformer-based models.
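The abstract does not detail the pretraining task, so the following is only a hedged sketch of one plausible formulation: assume knowledge-graph triples are verbalized into token sequences, and a Transformer decoder is trained with teacher forcing to generate them conditioned on the fused image-text encoding. All names, shapes, and the example triple are illustrative assumptions, not the thesis's actual method.

```python
import torch
import torch.nn as nn

VOCAB, D = 30_000, 512  # illustrative vocabulary and model sizes
embed = nn.Embedding(VOCAB, D)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=6,
)
to_vocab = nn.Linear(D, VOCAB)

def kg_injection_loss(memory, triple_ids):
    """Teacher-forced cross-entropy on a verbalized knowledge-graph
    triple, conditioned on the multimodal encoder output `memory`
    of shape (batch, src_len, D). `triple_ids` holds the tokenized
    triple, e.g. "<s> person desires comfort </s>" (assumed format),
    with shape (batch, tgt_len)."""
    tgt_in = embed(triple_ids[:, :-1])  # shifted-right decoder input
    T = tgt_in.size(1)
    # Causal mask so each position only attends to earlier tokens.
    causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
    logits = to_vocab(decoder(tgt_in, memory, tgt_mask=causal))
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), triple_ids[:, 1:].reshape(-1)
    )
```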
Finally, we shift to studying the zero-shot learning abilities of pretrained language models. We propose an alternative to prompting for zero-shot learning with language models: we simply initialize class centers from anchor sentences and perform unsupervised clustering on the sentence embeddings.
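A minimal sketch of that idea, assuming embeddings from any off-the-shelf sentence encoder: embed one hand-written anchor sentence per class, use those embeddings as the initial k-means centers, and cluster the unlabeled sentence embeddings. Function and variable names are illustrative, not the thesis's code.

```python
import numpy as np
from sklearn.cluster import KMeans

def anchor_clustering(sentence_embs, anchor_embs):
    """Zero-shot classification without prompting: k-means over
    sentence embeddings, with one cluster center per class
    initialized from that class's anchor-sentence embedding."""
    anchors = np.asarray(anchor_embs)
    km = KMeans(
        n_clusters=len(anchors),
        init=anchors,  # start each class center at its anchor
        n_init=1,      # keep the anchor initialization, no random restarts
    )
    # Cluster index i corresponds to the class of anchor i.
    return km.fit_predict(np.asarray(sentence_embs))
```

Seeding the centers with anchors is what ties each cluster to a named class, so no labeled data or prompt engineering is needed to read off predictions.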
We believe the concepts and methods discussed in this thesis can offer valuable insights into the robustness, multimodality, and zero-shot abilities of deep learning models for future research, particularly in the era of Large Language Models.
Permanent link: https://doi.org/10.3929/ethz-b-000667765
Publication status: published
External links: Search print copy at ETH Library
Publisher: ETH Zurich
Subject: Natural Language Processing (NLP); Machine Learning; Deep Learning; large language models (LLMs); multimodal learning; Robustness of Deep Learning
Organisational unit: 03604 - Wattenhofer, Roger / Wattenhofer, Roger