PhD Thesis Colloquium

Name of the Candidate: Kalla Jayateja
Research Supervisor: Soma Biswas
Date and Time: October 28, 2024, Monday, 11:00 AM - 1:00 PM IST
Venue: C-241, First Floor, Multimedia Classroom (MMCR), EE

Title: Class Incremental Learning Across Diverse Data Paradigms

Abstract: In recent years, deep learning has achieved remarkable success across various domains, largely due to its ability to learn from vast amounts of data. However, traditional deep learning models struggle when new classes are introduced over time: they must either be retrained from scratch or suffer catastrophic forgetting of previously learned information. This limitation underscores the need for class incremental learning (CIL), a continual learning paradigm that enables models to adapt incrementally to new classes without losing prior knowledge. CIL is crucial in real-world applications, such as autonomous driving and healthcare diagnostics, where new data emerges continuously. Traditional CIL approaches often rely on idealized assumptions of balanced, fully labeled, and abundant datasets, which rarely hold in practice. In reality, CIL models must operate in dynamic environments characterized by class imbalance, limited supervision, and data scarcity. This thesis tackles these issues by proposing novel methods tailored to diverse CIL scenarios, emphasizing flexibility and robustness. We now describe the CIL scenarios studied in this thesis.

Firstly, we introduce the Generalized Semi-Supervised Class Incremental Learning (GSS-CIL) protocol, designed for scenarios with limited labeled data and abundant unlabeled data. In semi-supervised learning, the quality of pseudo-labels plays a critical role. To address this challenge within the CIL framework, we propose the Expert Suggested Pseudo-Labelling Network (ESPN), which utilizes an expert model to generate high-quality pseudo-labels from the unlabeled data at each incremental step, ensuring a more robust learning process.
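
To make the role of pseudo-label quality concrete, here is a minimal PyTorch sketch of confidence-thresholded pseudo-labelling on an unlabeled batch. The expert model interface, threshold value, and function name are illustrative assumptions for exposition, not the actual ESPN implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch only: generic confidence-thresholded pseudo-labelling,
# not the actual ESPN architecture from the thesis.
@torch.no_grad()
def pseudo_label(expert_model, unlabeled_batch, threshold=0.95):
    """Keep only the unlabeled samples the expert model is confident about."""
    logits = expert_model(unlabeled_batch)   # (batch, num_classes)
    probs = F.softmax(logits, dim=1)
    conf, labels = probs.max(dim=1)          # per-sample confidence and class
    mask = conf >= threshold                 # discard low-confidence samples
    return unlabeled_batch[mask], labels[mask]
```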

In many practical applications, the number of samples per class varies significantly, leading to long-tailed distributions in which a few classes are well represented while most others are under-represented. This inherent imbalance in real-world data motivates the study of long-tailed learning within CIL. We address this problem through a two-stage framework called Global Variance-Driven Classifier Alignment (GVAlign): the first stage learns robust feature representations using a Mixup loss, and the second stage aligns the classifiers by leveraging the global variance together with class prototypes, yielding robust classifiers even for under-represented classes. GVAlign can be seamlessly integrated into existing CIL approaches to handle long-tailed data distributions effectively.
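
A minimal sketch of the second-stage idea follows, under the assumption that classifier alignment amounts to re-fitting the classifier on pseudo-features drawn around each class prototype with a shared global variance, so tail classes receive as many training samples as head classes. All names and hyperparameters here are illustrative, not the published GVAlign code.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch: classifier alignment via pseudo-features sampled
# from N(prototype, global_var); hypothetical, not the official GVAlign code.
def align_classifier(classifier, prototypes, global_var, n_samples=256, steps=100):
    num_classes, feat_dim = prototypes.shape
    std = global_var.sqrt()                       # one shared std for all classes
    opt = torch.optim.SGD(classifier.parameters(), lr=0.01)
    for _ in range(steps):
        labels = torch.randint(num_classes, (n_samples,))
        # sample class-balanced pseudo-features around the class prototypes
        feats = prototypes[labels] + torch.randn(n_samples, feat_dim) * std
        loss = F.cross_entropy(classifier(feats), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return classifier
```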

In the next part, we address the Few-Shot Class Incremental Learning (FSCIL) scenario, where only a handful of examples are available per class. We tackle the two key challenges of FSCIL, namely overfitting and catastrophic forgetting, through the proposed Self-Supervised Stochastic Classifier (S3C). To learn robust feature representations in this limited-data regime and prevent overfitting, we leverage self-supervised objectives; specifically, we train the feature extractor on a rotation-prediction task. We observe that a network trained in this self-supervised manner also suffers less catastrophic forgetting in the incremental stages. We further propose replacing conventional deterministic classifiers with stochastic classifiers, whose weights are sampled from a learnable distribution. This helps the model generalize better to new classes and mitigates overfitting, thereby improving FSCIL performance.
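
The stochastic-classifier idea can be sketched as follows: each class weight vector is drawn from a learnable Gaussian at training time instead of being a fixed point estimate. This is a generic rendering under that assumption (including the cosine-similarity logits common in FSCIL classifiers), not the exact S3C module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticClassifier(nn.Module):
    """Illustrative stochastic classifier: class weights are sampled from a
    learnable Gaussian N(mu, sigma^2) on every training forward pass."""
    def __init__(self, feat_dim, num_classes, scale=16.0):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.01)
        self.log_sigma = nn.Parameter(torch.full((num_classes, feat_dim), -4.0))
        self.scale = scale

    def forward(self, features):
        if self.training:   # reparameterized sampling during training
            weights = self.mu + self.log_sigma.exp() * torch.randn_like(self.mu)
        else:               # mean weights at test time
            weights = self.mu
        # cosine-similarity logits between features and class weights
        return self.scale * F.normalize(features, dim=1) @ F.normalize(weights, dim=1).T
```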

In addition to addressing these specific CIL scenarios, this thesis also develops generalized methods that remain effective across a variety of CIL scenarios and levels of data supervision. Given the diversity inherent in incremental learning, a single method may not suffice for all scenarios. We demonstrate that a straightforward self-supervision strategy can significantly enhance performance across multiple CIL tasks, enabling our models to remain adaptable without task-specific modifications. Being modular, this approach can be seamlessly integrated with new techniques as they emerge.
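
As one concrete instance of such a strategy, a rotation-prediction auxiliary loss can be attached to almost any backbone. The sketch below is a generic version of this well-known objective; the head and function names are assumptions, and the thesis models may combine it with other losses.

```python
import torch
import torch.nn.functional as F

# Generic rotation-prediction auxiliary loss (illustrative sketch).
def rotation_ssl_loss(backbone, rot_head, images):
    """Rotate each image by 0/90/180/270 degrees and train a small
    head to predict the rotation angle from backbone features."""
    rotated, targets = [], []
    for k in range(4):   # k quarter-turns
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        targets.append(torch.full((images.size(0),), k, dtype=torch.long))
    rotated, targets = torch.cat(rotated), torch.cat(targets)
    logits = rot_head(backbone(rotated))   # 4-way rotation classifier
    return F.cross_entropy(logits, targets)
```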

In the final part of this thesis, we propose a unified approach to address CIL across varying levels of supervision, from few-shot to high-shot settings. By harnessing the rich representational capabilities of large-scale pre-trained models, our method effectively handles the challenges posed by differing levels of supervision, ensuring robust performance in both low-shot and high-shot CIL scenarios.
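
One common way to exploit a pre-trained backbone in CIL, shown here purely as an illustrative baseline (the thesis method may differ substantially), is to freeze the backbone and grow a prototype classifier incrementally, so no old-class data needs revisiting:

```python
import torch
import torch.nn.functional as F

class PrototypeCIL:
    """Illustrative baseline: prototype classifier over a frozen
    pre-trained backbone; each new class adds one mean-feature prototype."""
    def __init__(self, backbone):
        self.backbone = backbone.eval()   # frozen feature extractor
        self.prototypes = {}              # class id -> mean feature

    @torch.no_grad()
    def add_class(self, class_id, images):
        feats = F.normalize(self.backbone(images), dim=1)
        self.prototypes[class_id] = feats.mean(dim=0)

    @torch.no_grad()
    def predict(self, images):
        feats = F.normalize(self.backbone(images), dim=1)
        ids = list(self.prototypes)
        protos = F.normalize(torch.stack([self.prototypes[c] for c in ids]), dim=1)
        return torch.tensor(ids)[(feats @ protos.T).argmax(dim=1)]
```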
