Publications about the project
Cloud Learning-by-Education Node Community (C-LENC) framework
The Learning-by-Education Node Community (LENC) paradigm laid the foundation for examining the interaction rules between collaborative nodes performing knowledge exchange, based on evaluating their knowledge. As a research prototype, Out-Of-Distribution detectors were employed for knowledge (self-)assessment and knowledge distillation was used for model updates, with all nodes operating on the same computer. This paper presents an extension of the LENC paradigm, the novel Cloud Learning-by-Education Node Community (C-LENC) framework, which can be used to perform a plethora of collaborative distributed machine learning workflows between the nodes of the C-LENC network operating in the cloud. It leverages a succinct communication protocol exposing a small number of commands that C-LENC nodes can use to exchange information. We demonstrate the use of C-LENC on four distributed machine learning workflows: learning a new task, federated learning, multi-teacher knowledge distillation, and distributed inference. We also present the internal architecture of a LENC node, as well as the procedure for utilizing Out-Of-Distribution detectors to query the C-LENC network and find suitable C-LENC nodes for any task. Finally, we discuss the limitations of the C-LENC framework and potential directions for its further improvement.
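Since the protocol itself is only summarized above, the following Python sketch illustrates the flavor of such a command-based node interface; the command names, message format, and OOD-based node lookup are hypothetical assumptions for illustration, not the framework's actual API.

```python
# Hypothetical sketch of a succinct node-to-node command protocol, loosely
# following the C-LENC description above. Command names, message fields, and
# the OOD-based routing are illustrative assumptions, not the paper's API.
from dataclasses import dataclass, field
from enum import Enum, auto


class Command(Enum):
    QUERY_KNOWLEDGE = auto()   # ask peers whether they can handle a data distribution
    REQUEST_TEACHING = auto()  # request knowledge distillation from a suitable node
    SEND_UPDATE = auto()       # share a model update (e.g., for federated aggregation)
    RUN_INFERENCE = auto()     # delegate inference on a sample to a remote node


@dataclass
class Message:
    sender_id: str
    command: Command
    payload: dict = field(default_factory=dict)


def find_suitable_nodes(sample_embedding, nodes, ood_threshold=0.5):
    """Return ids of nodes whose OOD detector marks the sample as in-distribution."""
    suitable = []
    for node in nodes:
        # node.ood_score is assumed to return a score in [0, 1], low = in-distribution
        if node.ood_score(sample_embedding) < ood_threshold:
            suitable.append(node.node_id)
    return suitable
```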
Proto-SVDD: Decentralized Federated Object Detection with Prototype-Based Communication
Federated Learning (FL) methods typically require Deep Neural Network (DNN) weight transfer from FL clients to an FL aggregator (master) for centralized DNN aggregation. However, this may not be possible under strict privacy or network constraints. In this paper, we present Proto-SVDD, a fully decentralized federated learning framework that enables collaborative DNN model training without neural parameter sharing. Instead, Proto-SVDD employs a lightweight class-wise prototype learning mechanism based on Support Vector Data Description (SVDD), which each FL client DNN trains on its own local, private data. Collaboration between FL clients involves exchanging and aggregating only their SVDD class prototypes in a fully decentralized topology; neither DNN weights nor training data are exchanged between FL nodes. Proto-SVDD has been evaluated on the object detection task under various DNN client configurations, demonstrating competitive object detection accuracy with a significantly lower communication cost compared with state-of-the-art prototype-based FL methods. Experimental results show that Proto-SVDD enables efficient federated object detection in settings with limited resources and privacy constraints.
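As a rough illustration of what prototype-only communication can look like, the sketch below derives a simplified SVDD-style class prototype (hypersphere center and radius) from a client's local features and aggregates prototypes received from peers. The real Proto-SVDD trains prototypes with an SVDD objective; this is an assumption-laden simplification that only conveys what is exchanged.

```python
# Minimal sketch: only (center, radius) prototypes and sample counts are
# exchanged between clients, never DNN weights or raw data.
import numpy as np


def local_prototype(features):
    """features: (N, D) array of one client's embeddings for a single class."""
    center = features.mean(axis=0)
    radius = np.quantile(np.linalg.norm(features - center, axis=1), 0.95)
    return center, radius


def aggregate_prototypes(prototypes, counts):
    """Decentralized aggregation: weighted average of peers' (center, radius) pairs."""
    weights = np.asarray(counts, dtype=float)
    weights /= weights.sum()
    centers = np.stack([c for c, _ in prototypes])
    radii = np.array([r for _, r in prototypes])
    return weights @ centers, float(weights @ radii)
```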
Software for Dataset-wide XAI: From Local Explanations to Global Insights with Zennit, CoRelAy, and ViRelAy
Deep Neural Networks (DNNs) are known to be strong predictors, but their prediction strategies can rarely be understood. With recent advances in Explainable Artificial Intelligence (XAI), approaches are available to explore the reasoning behind the predictions of these complex models. Among post-hoc attribution methods, Layer-wise Relevance Propagation (LRP) shows high performance. For deeper quantitative analysis, manual approaches exist, but without the right tools they are unnecessarily labor-intensive. In this software paper, we introduce three software packages targeted at scientists to explore model reasoning using attribution approaches and beyond: (1) Zennit, a highly customizable and intuitive attribution framework implementing LRP and related approaches in PyTorch; (2) CoRelAy, a framework to easily and quickly construct quantitative analysis pipelines for dataset-wide analyses of explanations; and (3) ViRelAy, a web application to interactively explore data, attributions, and analysis results. With this, we provide a standardized implementation solution for XAI and contribute towards more reproducibility in our field.
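For orientation, here is a minimal usage sketch following Zennit's documented composite/attributor pattern, using an untrained VGG16 and random input; exact class names and defaults may vary between Zennit versions.

```python
# Minimal Zennit sketch: wrap a model with an LRP composite and compute
# relevance for one target class. Untrained model and random data serve
# only as placeholders.
import torch
from torchvision.models import vgg16

from zennit.attribution import Gradient
from zennit.composites import EpsilonPlusFlat

model = vgg16().eval()
data = torch.randn(1, 3, 224, 224, requires_grad=True)
target = torch.eye(1000)[[0]]  # one-hot vector of the class to explain

composite = EpsilonPlusFlat()
with Gradient(model=model, composite=composite) as attributor:
    output, relevance = attributor(data, target)  # relevance has the input's shape
```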
See What I Mean? CUE: A Cognitive Model of Understanding Explanations
As machine learning systems increasingly inform critical decisions, the need for human-understandable explanations grows. Current evaluations of Explainable AI (XAI) often prioritize technical fidelity over cognitive accessibility, which critically affects users, in particular those with visual impairments. We propose CUE, a model for Cognitive Understanding of Explanations, linking explanation properties to cognitive sub-processes: legibility (perception), readability (comprehension), and interpretability (interpretation). In a study (N=455) testing heatmaps with varying colormaps (BWR, Cividis, Coolwarm), we found comparable task performance but lower confidence/effort ratings for visually impaired users. Contrary to expectations, these gaps were not mitigated, and were sometimes worsened, by accessibility-focused colormaps such as Cividis. These results challenge assumptions about perceptual optimization and support the need for adaptive XAI interfaces. They also validate CUE by demonstrating that altering explanation legibility affects understandability. We contribute: (1) a formalized cognitive model of explanation understanding, (2) an integrated definition of human-centered explanation properties, and (3) empirical evidence motivating accessible, user-tailored XAI.
From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance
Transformer-based CLIP models are widely used for text-image probing and feature extraction, making it relevant to understand the internal mechanisms behind their predictions. While recent works show that Sparse Autoencoders (SAEs) yield interpretable latent components, they focus on what these encode and miss how they drive predictions. We introduce a scalable framework that reveals what latent components activate for, how they align with expected semantics, and how important they are to predictions. To achieve this, we adapt attribution patching for instance-wise component attributions in CLIP and highlight key faithfulness limitations of the widely used Logit Lens technique. By combining attributions with semantic alignment scores, we can automatically uncover reliance on components that encode semantically unexpected or spurious concepts. Applied across multiple CLIP variants, our method uncovers hundreds of surprising components linked to polysemous words, compound nouns, visual typography and dataset artifacts. While text embeddings remain prone to semantic ambiguity, they are more robust to spurious correlations compared to linear classifiers trained on image embeddings. A case study on skin lesion detection highlights how such classifiers can amplify hidden shortcuts, underscoring the need for holistic, mechanistic interpretability.
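The sketch below illustrates the general idea of first-order attribution patching on latent components, assuming a hypothetical sparse autoencoder with encode/decode methods and a scalar prediction metric (e.g., a CLIP image-text similarity); it is not the paper's exact procedure.

```python
# Illustrative first-order attribution patching on SAE latent components.
# The sae object and metric_fn are assumed interfaces, not real library APIs.
import torch


def component_attributions(sae, metric_fn, activations, reference):
    """Approximate each latent component's contribution to the metric.

    attribution_j ≈ (z_j - z_ref_j) * d metric / d z_j, with the gradient taken at z.
    """
    z = sae.encode(activations).detach().requires_grad_(True)
    z_ref = sae.encode(reference).detach()
    metric = metric_fn(sae.decode(z))
    (grad,) = torch.autograd.grad(metric, z)
    return (z - z_ref) * grad  # one attribution score per latent component
```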
Relevance-driven Input Dropout: an Explanation-guided Regularization Technique
Overfitting is a well-known issue extending even to state-of-the-art (SOTA) Machine Learning (ML) models, resulting in reduced generalization and a significant train-test performance gap. Mitigation measures include a combination of dropout, data augmentation, weight decay, and other regularization techniques. Among the various data augmentation strategies, occlusion is a prominent technique that typically involves randomly masking regions of the input during training. Most of the existing literature emphasizes randomness in selecting and modifying input features, rather than targeting the regions that strongly influence model decisions. We propose Relevance-driven Input Dropout (RelDrop), a novel data augmentation method which selectively occludes the most relevant regions of the input, nudging the model to use other important features in the prediction process and thus improving generalization through informed regularization. We further conduct qualitative and quantitative analyses to study how RelDrop affects model decision-making. Through a series of experiments on benchmark datasets, we demonstrate that our approach improves robustness towards occlusion, results in models utilizing more features within the region of interest, and boosts inference-time generalization performance.
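A minimal sketch of the core idea follows, assuming a precomputed per-pixel relevance map (e.g., from an attribution method) and an illustrative drop fraction; RelDrop's actual masking granularity and hyperparameters may differ.

```python
# Illustrative relevance-driven occlusion: zero out the most relevant pixels
# of each training image before the forward pass.
import torch


def relevance_dropout(images, relevance, drop_fraction=0.1):
    """images: (B, C, H, W); relevance: (B, H, W) per-pixel attribution scores."""
    b = images.shape[0]
    flat = relevance.reshape(b, -1)
    k = max(1, int(drop_fraction * flat.shape[1]))
    threshold = flat.topk(k, dim=1).values[:, -1:]           # per-image cutoff score
    mask = (flat < threshold).reshape_as(relevance).float()  # keep only less relevant pixels
    return images * mask.unsqueeze(1)                        # broadcast mask over channels
```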
Pruning By Explaining Revisited: Optimizing Attribution Methods to Prune CNNs and Transformers
To solve ever more complex problems, Deep Neural Networks are scaled to billions of parameters, leading to huge computational costs. An effective approach to reduce computational requirements and increase efficiency is to prune unnecessary components of these often over-parameterized networks. Previous work has shown that attribution methods from the field of eXplainable AI serve as effective means to extract and prune the least relevant network components in a few-shot fashion. We extend the current state of the art by explicitly optimizing the hyperparameters of attribution methods for the task of pruning, and further include transformer-based networks in our analysis. Our approach yields higher model compression rates for large transformer and convolutional architectures (VGG, ResNet, ViT) compared to previous works, while still attaining high performance on ImageNet classification tasks. Our experiments further indicate that transformers have a higher degree of over-parameterization than convolutional neural networks.
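As an illustration of attribution-based structured pruning (not the paper's exact pipeline), the sketch below accumulates per-channel relevance over a few reference batches and masks the least relevant output channels of a convolutional layer; the relevance_fn is an assumed stand-in for an attribution backend such as LRP.

```python
# Few-shot, attribution-guided channel pruning sketch (soft pruning via zeroing).
import torch


def channel_scores(relevance_fn, layer, batches):
    """Accumulate mean absolute relevance per output channel of `layer`."""
    scores = None
    for x, y in batches:
        rel = relevance_fn(layer, x, y)                # assumed shape: (B, C, H, W)
        batch_scores = rel.abs().mean(dim=(0, 2, 3))   # one score per channel
        scores = batch_scores if scores is None else scores + batch_scores
    return scores


def prune_least_relevant(layer, scores, prune_ratio=0.3):
    """Zero the weights of the lowest-scoring output channels of a Conv2d layer."""
    n_prune = int(prune_ratio * scores.numel())
    idx = scores.argsort()[:n_prune]
    with torch.no_grad():
        layer.weight[idx] = 0.0
        if layer.bias is not None:
            layer.bias[idx] = 0.0
    return idx
```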
Distributed Superresolution Gas Source Localization Based on Poisson Equation
Accurate modeling and estimation of airborne material in Chemical, Biological, Radiological, or Nuclear accidents are vital for effective disaster response. In this paper, a method that combines prior domain knowledge in the form of Partial Differential Equations (PDEs), sparse Bayesian learning (SBL), and cooperative estimation across multiple robots or sensor networks is proposed to identify the number and locations of gas sources. Using the method of Green's functions and the adjoint state method, a gradient-based optimization with respect to the source locations is derived, allowing superresolution of (arbitrary) source locations. By combining the latter with SBL, a sparse source support can be identified, thus indirectly assessing the number of sources. Both steps are computed cooperatively, utilizing the agent network to share information. Simulation results demonstrate the effectiveness of the approach.
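In steady state, the dispersion model behind such PDE-constrained approaches reduces to a Poisson-type equation with point sources; a generic form (the notation below is illustrative and may differ from the paper's) is:

```latex
% Generic steady-state (Poisson-type) gas dispersion model with K point sources;
% notation is illustrative, not necessarily the paper's.
-\nabla \cdot \big( D \, \nabla c(\mathbf{x}) \big)
  = \sum_{k=1}^{K} s_k \, \delta(\mathbf{x} - \mathbf{x}_k),
\qquad
c(\mathbf{x}) = \sum_{k=1}^{K} s_k \, G(\mathbf{x}, \mathbf{x}_k),
```

where c is the concentration field, D the diffusion coefficient, and s_k, x_k the strength and location of source k. The Green's function G of the diffusion operator yields the measurement model on the right, the adjoint state method provides gradients with respect to the x_k for superresolved localization, and SBL promotes sparsity in the strengths s_k.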
Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers
Vision transformers (ViTs) can be trained using various learning paradigms, from fully supervised to self-supervised. Diverse training protocols often result in significantly different feature spaces, which are usually compared through alignment analysis. However, current alignment measures quantify this relationship as a single scalar value, obscuring the distinctions between common and unique features in pairs of representations that share the same scalar alignment. We address this limitation by combining alignment analysis with concept discovery, which enables a breakdown of alignment into the single concepts encoded in feature space. This fine-grained comparison reveals both universal and unique concepts across different representations, as well as the internal structure of concepts within each of them. Our methodological contributions address two key prerequisites for concept-based alignment: (1) for a description of the representation in terms of concepts that faithfully captures the geometry of the feature space, we define concepts as the most general structure they can possibly form, namely arbitrary manifolds, allowing hidden features to be described by their proximity to these manifolds; (2) to measure distances between the concept proximity scores of two representations, we use a generalized Rand index and partition it for alignment between pairs of concepts. We confirm the superiority of our novel concept definition for alignment analysis over existing linear baselines in a sanity check. The concept-based alignment analysis of representations from four different ViTs reveals that increased supervision correlates with a reduction in the semantic structure of the learned representations.
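As a deliberately simplified illustration of alignment via concepts (the paper uses manifold-shaped concepts, soft proximity scores, and a partitioned generalized Rand index), the sketch below clusters two representations of the same images and compares the resulting hard concept assignments with the standard adjusted Rand index.

```python
# Simplified stand-in for concept-based alignment: hard k-means "concepts"
# and the adjusted Rand index instead of manifolds and a generalized Rand index.
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score


def concept_alignment(features_a, features_b, n_concepts=10, seed=0):
    """features_a, features_b: (N, D_a) and (N, D_b) embeddings of the same N images."""
    concepts_a = KMeans(n_clusters=n_concepts, random_state=seed, n_init=10).fit_predict(features_a)
    concepts_b = KMeans(n_clusters=n_concepts, random_state=seed, n_init=10).fit_predict(features_b)
    return adjusted_rand_score(concepts_a, concepts_b)  # 1.0 = identical concept structure
```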
Post-hoc Concept Disentanglement: From Correlated to Isolated Concept Representations
Concept Activation Vectors (CAVs) are widely used to model human-understandable concepts as directions within the latent space of neural networks. They are trained by identifying directions from the activations of concept samples to those of non-concept samples. However, this method often produces similar, non-orthogonal directions for correlated concepts, such as “beard” and “necktie” within the CelebA dataset, which frequently co-occur in images of men. This entanglement complicates the interpretation of concepts in isolation and can lead to undesired effects in CAV applications, such as activation steering. To address this issue, we introduce a post-hoc concept disentanglement method that employs a non-orthogonality loss, facilitating the identification of orthogonal concept directions while preserving directional correctness. We evaluate our approach with real-world and controlled correlated concepts in CelebA and a synthetic FunnyBirds dataset with VGG16 and ResNet18 architectures. We further demonstrate the superiority of orthogonalized concept representations in activation steering tasks, allowing (1) the insertion of isolated concepts into input images through generative models and (2) the removal of concepts for effective shortcut suppression with reduced impact on correlated concepts in comparison to baseline CAVs. (Code is available at https://github.com/erenerogullari/cav-disentanglement.)
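A minimal sketch of how such a non-orthogonality penalty can be combined with linear concept probes follows; the loss weighting and exact formulation are illustrative assumptions rather than the paper's implementation (see the linked repository for the actual code).

```python
# Illustrative loss: BCE concept-probe objective plus a penalty on pairwise
# cosine similarity between distinct concept directions.
import torch
import torch.nn.functional as F


def disentangled_cav_loss(cavs, activations, concept_labels, lam=1.0):
    """cavs: (K, D) concept directions; activations: (N, D); concept_labels: (N, K) in {0, 1}."""
    logits = activations @ cavs.T
    probe_loss = F.binary_cross_entropy_with_logits(logits, concept_labels.float())

    # Penalize non-orthogonality: squared off-diagonal cosine similarities.
    unit = F.normalize(cavs, dim=1)
    cos = unit @ unit.T
    off_diag = cos - torch.diag(torch.diag(cos))
    non_orth = (off_diag ** 2).sum() / (cavs.shape[0] * (cavs.shape[0] - 1))

    return probe_loss + lam * non_orth
```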