Publications about the project
Leveraging Collective Knowledge for Forest Fire Classification
This paper presents a novel Fire Classification Multi-Agent (FCMA) framework that utilizes peer-to-peer learning and distributed learning techniques to disseminate knowledge within the agent community. Furthermore, we define and introduce the architecture of a Deep Neural Network (DNN) agent, which can interact indefinitely with other DNN agents and the external environment once deployed. The FCMA framework is suitable for natural disaster management systems where multiple agents are required to run autonomously and foster the community’s knowledge. The FCMA provides two options for knowledge transfer: a peer-to-peer and a federated one. The experimental results demonstrate effective knowledge transfer using both options and also compare the two options with each other in a forest fire classification setting.
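The two knowledge-transfer options can be sketched in miniature as weight-space operations. This is an illustrative stand-in, not the actual FCMA implementation: models are reduced to plain weight vectors, the peer-to-peer option to pairwise interpolation, and the federated option to FedAvg-style averaging.

```python
# Hypothetical sketch of the two FCMA knowledge-transfer options,
# with agent models reduced to plain weight vectors for illustration.

def peer_to_peer_transfer(student, teacher, alpha=0.5):
    """One agent pulls knowledge from a single peer by interpolating weights."""
    return [(1 - alpha) * s + alpha * t for s, t in zip(student, teacher)]

def federated_transfer(agents):
    """All agents contribute to a shared average (FedAvg-style aggregation)."""
    n = len(agents)
    return [sum(ws) / n for ws in zip(*agents)]

agent_a = [0.2, 0.8]
agent_b = [0.6, 0.4]
agent_c = [1.0, 0.0]

p2p = peer_to_peer_transfer(agent_a, agent_b)          # ≈ [0.4, 0.6]
fed = federated_transfer([agent_a, agent_b, agent_c])  # ≈ [0.6, 0.4]
```

In the peer-to-peer option only the two participating agents change state; in the federated option every agent receives the same aggregated result.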
An Aspect-Based Emotion Analysis Approach on Wildfire-Related Geo-Social Media Data—A Case Study of the 2020 California Wildfires
Natural disasters like wildfires pose significant threats to communities, which necessitates timely and effective disaster response strategies. While Aspect-based Sentiment Analysis (ABSA) has been widely used to extract sentiment-related information at the sub-sentence level, the corresponding field of Aspect-based Emotion Analysis (ABEA) remains underexplored due to dataset limitations and the increased complexity of emotion classification. In this study, we applied EmoGRACE, a fine-tuned BERT-based model for ABEA, to georeferenced tweets of the 2020 California wildfires. The results for this case study reveal distinct spatio-temporal emotion patterns for wildfire-related aspect terms, with fear and sadness increasing near wildfire perimeters. This study demonstrates the feasibility of tracking emotion dynamics across disaster-affected regions and highlights the potential of ABEA in real-time disaster monitoring. The results suggest that ABEA can provide policymakers with a nuanced understanding of public sentiment during crises.
Enhancing Disaster Response with Social Media Analytics: An Aspect-Based Emotion Analysis Approach on Wildfire-Related Geo-Social Media Data
Cloud Learning-by-Education Node Community (C-LENC) framework
The Learning-by-Education Node Community (LENC) paradigm has laid out the foundation for examining the interaction rules between collaborative nodes performing knowledge exchange, based on evaluating their knowledge. As a research prototype, Out-Of-Distribution detectors were employed for knowledge (self-) assessment and knowledge distillation for model updates, all operating on the same computer. This paper presents an extension to the LENC paradigm, the novel Cloud Learning-by-Education Node Community (C-LENC) framework, which can be used to perform a wide range of collaborative distributed machine learning workflows between the nodes of the C-LENC network, operating on the cloud. It leverages a succinct communication protocol, exposing a small number of commands the C-LENC nodes can utilize to exchange information. We demonstrate the usage of C-LENC on four distributed machine learning workflows, including learning a new task, federated learning, multi-teacher knowledge distillation and distributed inference. We also present the internal architecture of a LENC node, as well as the procedure for utilizing Out-Of-Distribution detectors to query the C-LENC network and find suitable C-LENC nodes for any task. We present the limitations as well as potential directions to further improve the C-LENC framework.
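The small-command-set protocol described above might be pictured as follows. The concrete command names, message layout, and the set-membership stand-in for Out-Of-Distribution self-assessment below are all assumptions for illustration, not the actual C-LENC protocol.

```python
# Illustrative sketch only: the commands (QUERY_TASK, OFFER_KNOWLEDGE, ...)
# are hypothetical, not the real C-LENC command set.
from dataclasses import dataclass, field
from enum import Enum, auto

class Command(Enum):
    QUERY_TASK = auto()       # ask the network which nodes can handle a task
    OFFER_KNOWLEDGE = auto()  # a node volunteers as a teacher for the task
    REQUEST_UPDATE = auto()   # pull a model update from a teacher node
    INFER = auto()            # distributed inference request

@dataclass
class Message:
    command: Command
    sender: str
    payload: dict = field(default_factory=dict)

class Node:
    def __init__(self, name, known_tasks):
        self.name = name
        self.known_tasks = set(known_tasks)

    def handle(self, msg):
        # A real node would use an Out-Of-Distribution detector to score
        # task familiarity; simple set membership stands in for that here.
        if msg.command is Command.QUERY_TASK:
            if msg.payload["task"] in self.known_tasks:
                return Message(Command.OFFER_KNOWLEDGE, self.name, msg.payload)
        return None

nodes = [Node("n1", {"fire"}), Node("n2", {"smoke", "fire"})]
query = Message(Command.QUERY_TASK, "client", {"task": "fire"})
teachers = [r.sender for n in nodes if (r := n.handle(query))]
# both nodes know the task, so teachers == ["n1", "n2"]
```

A querying node would then follow up with `REQUEST_UPDATE` messages to the responding teachers, which is how the multi-teacher distillation workflow could be driven by the same few commands.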
Proto-SVDD: Decentralized Federated Object Detection with Prototype-Based Communication
Federated Learning (FL) methods typically require Deep Neural Network (DNN) weight transfer from FL clients to an FL aggregator (master) for centralized DNN aggregation. However, this may not be possible under strict privacy or network constraints. In this paper, we present Proto-SVDD, a fully decentralized federated learning framework for enabling collaborative DNN model training, without neural parameter sharing. Instead, Proto-SVDD employs a lightweight class-wise prototype learning mechanism based on Support Vector Data Description (SVDD), which is trained by each FL client DNN using their own local, private data. Collaboration between FL clients involves exchanging and aggregating only their SVDD class prototypes in a fully decentralized topology. Neither DNN weights nor training data are exchanged between FL nodes whatsoever. Proto-SVDD has been evaluated for the object detection task, under various DNN client configurations, demonstrating competitive object detection accuracy with a significantly lower communication cost compared with state-of-the-art prototype-based FL methods. Experimental results show that Proto-SVDD federated learning enables efficient object detection in resource-limited, privacy-constrained settings.
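The prototype-only exchange can be sketched as follows. This is a minimal illustration of the idea, not the actual Proto-SVDD implementation: a feature centroid stands in for the SVDD center, and only these prototypes cross the network, never weights or raw data.

```python
# Hypothetical sketch: class centroids stand in for SVDD centers.

def class_prototype(features):
    """Per-client prototype for one class: the mean of its private feature vectors."""
    n = len(features)
    return [sum(dim) / n for dim in zip(*features)]

def aggregate(prototypes):
    """Decentralized aggregation: average the prototypes received from peers."""
    n = len(prototypes)
    return [sum(dim) / n for dim in zip(*prototypes)]

# Two clients hold private 2-D features for the same class.
client_a = [[1.0, 2.0], [3.0, 4.0]]
client_b = [[5.0, 6.0], [7.0, 8.0]]

proto_a = class_prototype(client_a)     # [2.0, 3.0] -- only this leaves client A
proto_b = class_prototype(client_b)     # [6.0, 7.0] -- only this leaves client B
shared = aggregate([proto_a, proto_b])  # [4.0, 5.0]
```

The communication saving follows directly: each client transmits one small vector per class instead of a full set of DNN weights.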
Software for Dataset-wide XAI: From Local Explanations to Global Insights with Zennit, CoRelAy, and ViRelAy
Deep Neural Networks (DNNs) are known to be strong predictors, but their prediction strategies can rarely be understood. With recent advances in Explainable Artificial Intelligence (XAI), approaches are available to explore the reasoning behind those complex models' predictions. Among post-hoc attribution methods, Layer-wise Relevance Propagation (LRP) shows high performance. For deeper quantitative analysis, manual approaches exist, but without the right tools they are unnecessarily labor intensive. In this software paper, we introduce three software packages targeted at scientists to explore model reasoning using attribution approaches and beyond: (1) Zennit - a highly customizable and intuitive attribution framework implementing LRP and related approaches in PyTorch, (2) CoRelAy - a framework to easily and quickly construct quantitative analysis pipelines for dataset-wide analyses of explanations, and (3) ViRelAy - a web-application to interactively explore data, attributions, and analysis results. With this, we provide a standardized implementation solution for XAI, to contribute towards more reproducibility in our field.
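The LRP redistribution that Zennit implements (for real PyTorch models) can be illustrated on a single linear layer. The pure-Python epsilon-rule sketch below is only meant to show the core idea; it is not Zennit's API or code.

```python
# Toy illustration of the LRP epsilon rule on one linear layer:
# output relevance is redistributed to inputs in proportion to each
# contribution z_ij = x_i * w_ij, with eps stabilizing the denominator.

def lrp_epsilon(inputs, weights, relevance_out, eps=1e-6):
    n_in, n_out = len(inputs), len(weights[0])
    z = [[inputs[i] * weights[i][j] for j in range(n_out)] for i in range(n_in)]
    z_sum = [sum(z[i][j] for i in range(n_in)) for j in range(n_out)]
    return [
        sum(z[i][j] / (z_sum[j] + eps) * relevance_out[j] for j in range(n_out))
        for i in range(n_in)
    ]

x = [1.0, 2.0]
w = [[0.5], [0.25]]  # two inputs, one output: contributions z = [0.5, 0.5]
r = lrp_epsilon(x, w, relevance_out=[1.0])
# equal contributions receive equal relevance, and (up to eps) the total
# relevance is conserved: r ≈ [0.5, 0.5]
```

Zennit generalizes this by attaching such rules layer-wise to arbitrary PyTorch modules via composites, so a whole network's prediction can be attributed back to its input.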
See What I Mean? CUE: A Cognitive Model of Understanding Explanations
As machine learning systems increasingly inform critical decisions, the need for human-understandable explanations grows. Current evaluations of Explainable AI (XAI) often prioritize technical fidelity over cognitive accessibility, which critically affects users, in particular those with visual impairments. We propose CUE, a model for Cognitive Understanding of Explanations, linking explanation properties to cognitive sub-processes: legibility (perception), readability (comprehension), and interpretability (interpretation). In a study (N=455) testing heatmaps with varying colormaps (BWR, Cividis, Coolwarm), we found comparable task performance but lower confidence/effort for visually impaired users. Contrary to expectations, these gaps were not mitigated and sometimes worsened by accessibility-focused colormaps like Cividis. These results challenge assumptions about perceptual optimization and support the need for adaptive XAI interfaces. They also validate CUE by demonstrating that altering explanation legibility affects understandability. We contribute: (1) a formalized cognitive model for explanation understanding, (2) an integrated definition of human-centered explanation properties, and (3) empirical evidence motivating accessible, user-tailored XAI.
From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance
Transformer-based CLIP models are widely used for text-image probing and feature extraction, making it relevant to understand the internal mechanisms behind their predictions. While recent works show that Sparse Autoencoders (SAEs) yield interpretable latent components, they focus on what these encode and miss how they drive predictions. We introduce a scalable framework that reveals what latent components activate for, how they align with expected semantics, and how important they are to predictions. To achieve this, we adapt attribution patching for instance-wise component attributions in CLIP and highlight key faithfulness limitations of the widely used Logit Lens technique. By combining attributions with semantic alignment scores, we can automatically uncover reliance on components that encode semantically unexpected or spurious concepts. Applied across multiple CLIP variants, our method uncovers hundreds of surprising components linked to polysemous words, compound nouns, visual typography and dataset artifacts. While text embeddings remain prone to semantic ambiguity, they are more robust to spurious correlations compared to linear classifiers trained on image embeddings. A case study on skin lesion detection highlights how such classifiers can amplify hidden shortcuts, underscoring the need for holistic, mechanistic interpretability.
Relevance-driven Input Dropout: an Explanation-guided Regularization Technique
Overfitting is a well-known issue extending even to state-of-the-art (SOTA) Machine Learning (ML) models, resulting in reduced generalization and a significant train-test performance gap. Mitigation measures include a combination of dropout, data augmentation, weight decay, and other regularization techniques. Among the various data augmentation strategies, occlusion is a prominent technique that typically focuses on randomly masking regions of the input during training. Most of the existing literature emphasizes randomness in selecting and modifying the input features instead of regions that strongly influence model decisions. We propose Relevance-driven Input Dropout (RelDrop), a novel data augmentation method which selectively occludes the most relevant regions of the input, nudging the model to use other important features in the prediction process, thus improving model generalization through informed regularization. We further conduct qualitative and quantitative analyses to study how RelDrop affects model decision-making. Through a series of experiments on benchmark datasets, we demonstrate that our approach improves robustness towards occlusion, results in models utilizing more features within the region of interest, and boosts inference-time generalization performance.
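The core idea of relevance-driven occlusion can be sketched in a few lines. This is a hedged illustration rather than the authors' code: given an attribution map (e.g. from LRP or gradients), the most relevant input features are masked instead of random ones.

```python
# Illustrative sketch of relevance-driven input dropout (not the RelDrop
# implementation): occlude the k features with the highest attribution
# scores, forcing the model to rely on other informative features.

def relevance_driven_dropout(inputs, relevance, k, mask_value=0.0):
    top = set(sorted(range(len(inputs)), key=lambda i: relevance[i], reverse=True)[:k])
    return [mask_value if i in top else x for i, x in enumerate(inputs)]

x = [0.9, 0.1, 0.7, 0.3]
rel = [0.8, 0.05, 0.6, 0.1]  # hypothetical attribution scores for x
masked = relevance_driven_dropout(x, rel, k=2)
# the two most relevant features (indices 0 and 2) are occluded:
# masked == [0.0, 0.1, 0.0, 0.3]
```

During training, such masked inputs would replace (or augment) the originals, analogous to random occlusion but informed by the model's own explanations.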