Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery

Abstract: Generalized Continual Category Discovery (GCCD) tackles learning from sequentially arriving, partially labeled datasets while uncovering new categories. Traditional methods rely on feature distillation to prevent forgetting old knowledge. However, this strategy restricts the model's ability to adapt and effectively distinguish new categories. To address this, we introduce a novel technique that integrates a learnable projector with feature distillation, enhancing model adaptability without sacrificing past knowledge. The resulting distribution shift of previously learned categories is mitigated by an auxiliary category adaptation network. We demonstrate that while each component offers modest benefits individually, their combination, dubbed CAMP (Category Adaptation Meets Projected distillation), significantly improves the balance between learning new information and retaining old knowledge. CAMP exhibits superior performance across several GCCD and Class Incremental Learning scenarios. The code is available on GitHub.

 

Type of Publication: Conference paper

Type of Conference: The 18th European Conference on Computer Vision (ECCV), Milano, 2024

Authors: Rypeść, Grzegorz; Marczak, Daniel; Cygert, Sebastian; Trzciński, Tomasz; Twardowski, Bartłomiej

AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale

Abstract: Active Visual Exploration (AVE) is a task that involves dynamically selecting observations (glimpses), which is critical to facilitate comprehension and navigation within an environment. While modern AVE methods have demonstrated impressive performance, they are constrained to fixed-scale glimpses from rigid grids. In contrast, existing mobile platforms equipped with optical zoom capabilities can capture glimpses of arbitrary positions and scales. To address this gap between software and hardware capabilities, we introduce AdaGlimpse. It uses Soft Actor-Critic, a reinforcement learning algorithm tailored for exploration tasks, to select glimpses of arbitrary position and scale. This approach enables our model to rapidly establish a general awareness of the environment before zooming in for detailed analysis. Experimental results demonstrate that AdaGlimpse surpasses previous methods across various visual tasks while maintaining greater applicability in realistic AVE scenarios.

 

Type of Publication: publication

Type of Conference: The 18th European Conference on Computer Vision (ECCV), Milano, 2024

Authors: Annapureddy, Ravinithesh; Fornaroli, Alessandro; Gatica-Perez, Daniel

Generative AI Literacy: Twelve Defining Competencies

Abstract: This paper introduces a competency-based model for generative artificial intelligence (AI) literacy covering essential skills and knowledge areas necessary to interact with generative AI. The competencies range from foundational AI literacy to prompt engineering and programming skills, including ethical and legal considerations. These twelve competencies offer a framework for individuals, policymakers, government officials, and educators looking to navigate and take advantage of the potential of generative AI responsibly. Embedding these competencies into educational programs and professional training initiatives can equip individuals to become responsible and informed users and creators of generative AI. The competencies follow a logical progression and serve as a roadmap for individuals seeking to get familiar with generative AI and for researchers and policymakers to develop assessments, educational programs, guidelines, and regulations.

 

Type of Publication: publication

Authors: Annapureddy, Ravinithesh; Fornaroli, Alessandro; Gatica-Perez, Daniel

Trading Volume Maximization with Online Learning

Abstract: We explore brokerage between traders in an online learning framework. At any round t, two traders meet to exchange an asset, provided the exchange is mutually beneficial. The broker proposes a trading price, and each trader tries to sell their asset or buy the asset from the other party, depending on whether the price is higher or lower than their private valuations. A trade happens if one trader is willing to sell and the other is willing to buy at the proposed price.

Previous work provided guidance to a broker aiming at enhancing traders’ total earnings by maximizing the gain from trade, defined as the sum of the traders’ net utilities after each interaction. In contrast, we investigate how the broker should behave to maximize the trading volume, i.e., the total number of trades.
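The trading protocol and the volume objective described above are simple enough to state in a few lines. The sketch below is illustrative only: the uniform valuations, the fixed posted price, and all function names are our assumptions, not the paper's.

```python
import random

def trade_occurs(price, valuation_a, valuation_b):
    # A trade happens iff the posted price lies between the two private
    # valuations: one trader is then willing to sell and the other to buy.
    return min(valuation_a, valuation_b) <= price <= max(valuation_a, valuation_b)

def trading_volume(prices, valuations):
    # Total number of trades over a sequence of rounds: the objective the
    # broker maximizes here, as opposed to the sum of net utilities.
    return sum(
        trade_occurs(p, va, vb) for p, (va, vb) in zip(prices, valuations)
    )

# Toy simulation: i.i.d. Uniform(0, 1) valuations and a fixed posted price.
random.seed(0)
rounds = [(random.random(), random.random()) for _ in range(1000)]
volume = trading_volume([0.5] * len(rounds), rounds)
```

With uniform valuations, the price 0.5 triggers a trade in roughly half of the rounds; the learning problem is to approach the best posted price without knowing the valuation distribution.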

We model the traders’ valuations as an i.i.d. process with an unknown distribution. If the traders’ valuations are revealed after each interaction (full feedback) and the cumulative distribution function (cdf) of the traders’ valuations is continuous, we provide an algorithm achieving logarithmic regret and show its optimality up to constant factors.

If only their willingness to sell or buy at the proposed price is revealed after each interaction (2-bit feedback), we provide an algorithm achieving polylogarithmic regret when the cdf of the traders’ valuations is Lipschitz, and show that this rate is near-optimal.

We complement our results by analyzing the implications of dropping the regularity assumptions on the unknown traders’ valuations cdf. If we drop the continuous cdf assumption, the regret rate degrades to Θ(√T) in the full-feedback case, where T is the time horizon. If we drop the Lipschitz cdf assumption, learning becomes impossible in the 2-bit feedback case.

 

Type of Publication: publication

Authors: Tommaso Cesari; Roberto Colomboni

A Contextual Online Learning Theory of Brokerage

Abstract: We study the role of contextual information in the online learning problem of brokerage between traders. At each round, two traders arrive with secret valuations about an asset they wish to trade. The broker suggests a trading price based on contextual data about the asset. Then, the traders decide to buy or sell depending on whether their valuations are higher or lower than the brokerage price. We assume the market value of traded assets is an unknown linear function of a d-dimensional vector representing the contextual information available to the broker. Additionally, we model traders’ valuations as independent bounded zero-mean perturbations of the asset’s market value, allowing for potentially different unknown distributions across traders and time steps. Consistent with the existing online learning literature, we evaluate the performance of a learning algorithm by its regret with respect to the gain from trade. If the noise distributions admit densities bounded by some constant L, then, for any time horizon T:

  • If the agents’ valuations are revealed after each interaction, we provide an algorithm achieving O(Ld ln T) regret, and show a corresponding matching lower bound of Ω(Ld ln T).
  • If only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving O(√(LdT ln T)) regret, and show that this rate is optimal (up to logarithmic factors), via a lower bound of Ω(√(LdT)).
To complete the picture, we show that if the bounded density assumption is lifted, then the problem becomes unlearnable, even with full feedback.
 

Type of Publication: publication

Authors: François Bachoc; Tommaso Cesari; Roberto Colomboni

Fair Online Bilateral Trade

Abstract: In online bilateral trade, a platform posts prices to incoming pairs of buyers and sellers that have private valuations for a certain good. If the price is lower than the buyers’ valuation and higher than the sellers’ valuation, then a trade takes place. Previous work focused on the platform perspective, with the goal of setting prices maximizing the gain from trade (the sum of sellers’ and buyers’ utilities). Gain from trade is, however, potentially unfair to traders, as they may receive highly uneven shares of the total utility. In this work we enforce fairness by rewarding the platform with the fair gain from trade, defined as the minimum between sellers’ and buyers’ utilities. After showing that any no-regret learning algorithm designed to maximize the sum of the utilities may fail badly with fair gain from trade, we present our main contribution: a complete characterization of the regret regimes for fair gain from trade when, after each interaction, the platform only learns whether each trader accepted the current price. Specifically, we prove the following regret bounds: Θ(ln T) in the deterministic setting, Ω(T) in the stochastic setting, and Θ̃(T^(2/3)) in the stochastic setting when sellers’ and buyers’ valuations are independent of each other. We conclude by providing tight regret bounds when, after each interaction, the platform is allowed to observe the true traders’ valuations.
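The two objectives contrasted above can be written down directly. A minimal sketch, assuming the usual bilateral-trade utilities (the seller earns price minus valuation and the buyer earns valuation minus price when a trade occurs); the function names are ours:

```python
def gain_from_trade(price, seller_val, buyer_val):
    # Sum of the two traders' utilities when a trade occurs (else 0).
    # Note it equals buyer_val - seller_val for ANY price that trades.
    if seller_val <= price <= buyer_val:
        return (price - seller_val) + (buyer_val - price)
    return 0.0

def fair_gain_from_trade(price, seller_val, buyer_val):
    # Minimum of the two utilities: only prices that split the surplus
    # evenly score well, which is the fairness notion used in this work.
    if seller_val <= price <= buyer_val:
        return min(price - seller_val, buyer_val - price)
    return 0.0
```

Since every trading price yields the same gain from trade, a platform maximizing the sum of utilities has no incentive to split the surplus evenly; the fair objective penalizes exactly those lopsided prices.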

 

Type of Publication: publication

Authors: François Bachoc; Nicolò Cesa-Bianchi; Tommaso Cesari; Roberto Colomboni

A deep cut into Split Federated Self-supervised Learning

Abstract: Collaborative self-supervised learning has recently become feasible in highly distributed environments by dividing the network layers between client devices and a central server. However, state-of-the-art methods, such as MocoSFL, are optimized for network division at the initial layers, which decreases the protection of the client data and increases communication overhead. In this paper, we demonstrate that splitting depth is crucial for maintaining privacy and communication efficiency in distributed training. We also show that MocoSFL suffers from a catastrophic quality deterioration when the communication overhead is minimal. As a remedy, we introduce Momentum-Aligned contrastive Split Federated Learning (MonAcoSFL), which aligns online and momentum client models during the training procedure. Consequently, we achieve state-of-the-art accuracy while significantly reducing the communication overhead, making MonAcoSFL more practical in real-world scenarios.
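The momentum alignment at the heart of the method can be illustrated with the standard MoCo-style exponential moving average. The synchronization step below, where each client's momentum weights are re-initialized from the averaged online weights so the two stay aligned, is our illustrative simplification, not MonAcoSFL's exact procedure:

```python
def ema_update(online, momentum, tau=0.99):
    # MoCo-style exponential moving average: the momentum weights trail
    # the online weights, providing a slowly evolving target encoder.
    return [tau * m + (1.0 - tau) * o for o, m in zip(online, momentum)]

def synchronize_clients(clients):
    # Illustrative alignment step: average the online weights across
    # clients (federated averaging), then reset every client's online
    # AND momentum weights to that average, so the momentum encoders do
    # not drift apart from the shared online model after a sync.
    n = len(clients)
    dim = len(clients[0]["online"])
    avg = [sum(c["online"][i] for c in clients) / n for i in range(dim)]
    for c in clients:
        c["online"] = list(avg)
        c["momentum"] = list(avg)
    return clients
```

Without the alignment step, each momentum encoder tracks its own client's pre-sync weights, so synchronizing only the online models leaves the contrastive targets inconsistent across clients.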

 

Type of Publication: publication

Type of Conference: International Conference on Learning Representations (ICLR), Vienna, Austria, 7-11.05.2024

Authors: Marcin Przewięźlikowski; Marcin Osial; Bartosz Zieliński; Marek Śmieja

Divide and not forget: Ensemble of selectively trained experts in Continual Learning

Abstract: Class-incremental learning is becoming more popular as it helps models widen their applicability while not forgetting what they already know. A trend in this area is to use a mixture-of-experts technique, where different models work together to solve the task. However, the experts are usually trained all at once using whole task data, which makes them all prone to forgetting and increases the computational burden. To address this limitation, we introduce a novel approach named SEED. SEED selects only one expert, the most suitable for a considered task, and uses data from this task to fine-tune only this expert. For this purpose, each expert represents each class with a Gaussian distribution, and the optimal expert is selected based on the similarity of those distributions. Consequently, SEED increases diversity and heterogeneity within the experts while maintaining the high stability of this ensemble method. The extensive experiments demonstrate that SEED achieves state-of-the-art performance in exemplar-free settings across various scenarios, showing the potential of expert diversification through data in continual learning.
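The expert-selection step can be sketched in one dimension: each candidate expert embeds the new task's data, each class is fitted with a Gaussian, and the expert whose class Gaussians are most separated is chosen. The 1-D setting, the KL-based score, and all names below are illustrative simplifications, not SEED's exact criterion:

```python
import math

def gaussian_kl(mu1, var1, mu2, var2):
    # KL divergence between two 1-D Gaussians N(mu1, var1) || N(mu2, var2).
    return (
        math.log(math.sqrt(var2 / var1))
        + (var1 + (mu1 - mu2) ** 2) / (2.0 * var2)
        - 0.5
    )

def fit_gaussian(xs):
    # Represent a class by the mean and variance of its embeddings.
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, max(var, 1e-6)  # floor the variance for numerical safety

def select_expert(experts, task_data):
    # Pick the expert whose embedding space separates the new task's
    # classes the most (largest total pairwise divergence between the
    # per-class Gaussians); only that expert is then fine-tuned.
    best, best_score = None, -float("inf")
    for name, embed in experts.items():
        gaussians = [fit_gaussian([embed(x) for x in xs]) for xs in task_data.values()]
        score = sum(
            gaussian_kl(*gaussians[i], *gaussians[j])
            for i in range(len(gaussians))
            for j in range(len(gaussians))
            if i != j
        )
        if score > best_score:
            best, best_score = name, score
    return best
```

An expert that collapses all inputs to the same embedding scores near zero, while one that already separates the new classes scores high, so the latter is selected and fine-tuned, keeping the other experts untouched and stable.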

Type of Publication: publication

Type of Conference: International Conference on Learning Representations (ICLR), Vienna, Austria, 7-11.05.2024

Authors: Grzegorz Rypeść; Sebastian Cygert; Valeriya Khan; Tomasz Trzciński; Bartosz Zieliński; Bartłomiej Twardowski

Resurrecting Old Classes with New Data for Exemplar-Free Continual Learning

Abstract: Continual learning methods are known to suffer from catastrophic forgetting, a phenomenon that is particularly hard to counter for methods that do not store exemplars of previous tasks. Therefore, to reduce potential drift in the feature extractor, existing exemplar-free methods are typically evaluated in settings where the first task is significantly larger than subsequent tasks. Their performance drops drastically in more challenging settings starting with a smaller first task. To address this problem of feature drift estimation for exemplar-free methods, we propose to adversarially perturb the current samples such that their embeddings are close to the old class prototypes in the old model embedding space. We then estimate the drift in the embedding space from the old to the new model using the perturbed images and compensate the prototypes accordingly. We exploit the fact that adversarial samples are transferable from the old to the new feature space in a continual learning setting. The generation of these images is simple and computationally cheap. We demonstrate in our experiments that the proposed approach better tracks the movement of prototypes in embedding space and outperforms existing methods on several standard continual learning benchmarks as well as on fine-grained datasets.
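The drift-compensation idea above can be illustrated with scalar features: perturb inputs until their old-model embeddings reach the old prototype, then read off how the new model moves those same inputs. The finite-difference optimization and linear feature maps below are a toy stand-in for the paper's adversarial image perturbation:

```python
def perturb_towards(f_old, x, proto, steps=200, lr=0.05, eps=1e-4):
    # Gradient descent on (f_old(x) - proto)^2 via finite differences,
    # moving the input until its OLD-model embedding lands on the prototype.
    for _ in range(steps):
        grad = (
            (f_old(x + eps) - proto) ** 2 - (f_old(x - eps) - proto) ** 2
        ) / (2.0 * eps)
        x = x - lr * grad
    return x

def compensate_prototype(f_old, f_new, samples, proto):
    # The average movement of the perturbed embeddings between the old and
    # new models estimates the prototype's drift, which is then added back.
    adv = [perturb_towards(f_old, x, proto) for x in samples]
    drift = sum(f_new(x) - f_old(x) for x in adv) / len(adv)
    return proto + drift
```

In this toy example the new model is a pure shift of the old one, so the estimated drift recovers the shift exactly; in practice the drift is measured in a high-dimensional feature space and is only approximately constant near each prototype.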

Type of Publication: publication

Type of Conference: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 17-21.06.2024

Authors: Octavian Pascu; Adriana Stan; Dan Oneata; Elisabeta Oneata; Horia Cucu

Towards generalisable and calibrated audio deepfake detection with self-supervised representations

Abstract: Generalisation, the ability of a model to perform well on unseen data, is crucial for building reliable deepfake detectors. However, recent studies have shown that current audio deepfake models fall short of this desideratum. In this work we investigate the potential of pretrained self-supervised representations in building general and calibrated audio deepfake detection models. We show that large frozen representations coupled with a simple logistic regression classifier are extremely effective in achieving strong generalisation capabilities: compared to the RawNet2 model, this approach reduces the equal error rate from 30.9% to 8.8% on a benchmark of eight deepfake datasets, while learning fewer than 2k parameters. Moreover, the proposed method produces considerably more reliable predictions than previous approaches, making it more suitable for realistic use.

Type of Publication: Conference paper

Type of Conference: Interspeech 2024

Authors: Octavian Pascu; Adriana Stan; Dan Oneata; Elisabeta Oneata; Horia Cucu