Forest Fire Image Classification through Decentralized DNN Inference
In the realm of Natural Disaster Management (NDM), timely communication with local authorities is paramount for an effective response. To achieve this, multi-agent systems play a pivotal role by proficiently identifying and categorizing various disasters. In the field of Distributed Deep Neural Network (D-DNN) inference, such approaches often require DNN nodes to transmit their results to the cloud for inference, or they necessitate the establishment of a fixed-topology network to enable inference directly on the edge, a practice prone to security risks. In this work, we propose a decentralized inference strategy tailored for fire classification tasks, in which individual DNN nodes communicate within a network and enhance their predictions by considering the inference outputs of other DNN nodes that contribute to improving their individual performance. The overall coordination of the system on a specific decision is achieved through a consensus protocol, which acts as a universally accepted inference rule adopted by all DNN nodes operating within the system. We present a comprehensive experimental analysis of the forest fire classification task, focusing on enhancing both individual DNN node performance and the stability of the consensus protocol.
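To make the consensus step concrete, the sketch below shows a minimal average-consensus rule over the class probabilities of a few DNN nodes; the mixing rule, topology, and numbers are illustrative assumptions, not the exact protocol proposed in the paper.

    import numpy as np

    def consensus_inference(local_probs, adjacency, steps=20, alpha=0.2):
        # local_probs: (n_nodes, n_classes) softmax outputs of each DNN node
        # adjacency:   (n_nodes, n_nodes) 0/1 communication graph without self-loops
        p = local_probs.copy()
        for _ in range(steps):
            # each node mixes its belief with the mean belief of its neighbours
            neighbour_mean = adjacency @ p / np.maximum(adjacency.sum(1, keepdims=True), 1)
            p = (1 - alpha) * p + alpha * neighbour_mean
        return p.argmax(axis=1)  # per-node decision after consensus

    # example: 3 nodes, 2 classes ("fire", "no fire"), fully connected topology
    probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]])
    graph = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
    print(consensus_inference(probs, graph))  # all three nodes agree on class 0 ("fire")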
EDGEmergency: A Cloud-Edge Platform to Enable Pervasive Computing for Disaster Management
EDGEmergency is a platform designed for disaster management that can dynamically leverage the edge infrastructure potentially already present within the emergency perimeter. Edge devices, from IoT nodes to smartphones, possess an increasingly significant computational capacity that can be exploited by changing their behavior in real time, creating a pervasive local environment capable of adapting to the specific context at hand. EDGEmergency allows the creation of a unified computation environment leveraging the Cloud-Edge-Client Continuum concept, through which a zero-configuration computation cluster is created on-the-fly. The platform thus allows the deployment of distributed microservices on existing edge devices, originally installed for other purposes, following a modular and incremental logic that adapts to the needs of the individual emergency, supported by advanced AI-based tools for analysis and monitoring.
Supporting the Natural Disaster Management Distributing Federated Intelligence over the Cloud-Edge Continuum: the TEMA Architecture
Natural disasters are more and more often present in our daily life, and in many cases these events affect people and economies. In this context, there is a need for technological intervention in support of first responders, with solutions capable of making decisions in the disaster areas. Since these scenarios are time-sensitive, the intention is to move the computation units closer to those areas. In this paper, we propose a computing continuum architecture for offloading distributed intelligence over cloud, edge and deep edge layers. Exploiting the federated learning paradigm enables mobile and stationary devices to independently train local models, contributing to the creation of the global common model.
Data Operational Driven AI-based Architecture for Natural Disaster Management
Natural disasters pose increasing threats to communities and economies worldwide, emphasizing the urgency for technological interventions to support first responders and decision-makers in affected areas. To address this need, we introduce a novel computing continuum architecture designed for efficient offloading of distributed intelligences across cloud, edge, and deep edge tiers. Our approach leverages an AI cross-layer framework, integrating service, network, and infrastructure management, to optimize decision-making processes in time-sensitive disaster scenarios. By employing federated learning techniques, our architecture enables both mobile and stationary devices to autonomously train local models, contributing to the development of a comprehensive global common model. Through this collaborative approach, we aim to enhance the capabilities of disaster management systems, facilitating more effective responses to critical events.
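As a concrete reference for the federated aggregation step described above, here is a minimal FedAvg-style sketch in which local model parameters are combined into a global model weighted by local dataset size; the toy models and client sizes are illustrative assumptions, not the TEMA implementation.

    import numpy as np

    def fedavg(client_weights, client_sizes):
        # client_weights: one list of numpy arrays (layer parameters) per client
        # client_sizes:   number of local training samples per client
        total = float(sum(client_sizes))
        return [
            sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
            for layer in range(len(client_weights[0]))
        ]

    # two clients with a tiny 2-layer model (weights and bias of a single linear layer)
    c1 = [np.ones((2, 2)), np.zeros(2)]
    c2 = [np.zeros((2, 2)), np.ones(2)]
    global_model = fedavg([c1, c2], client_sizes=[100, 300])
    print(global_model[0])  # 0.25 everywhere: the larger client counts three times as much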
Federated Learning on Raspberry Pi 4: A Comprehensive Power Consumption Analysis
Edge Computing, a rapidly evolving sector within information technology, redefines data processing and analysis by shifting it closer to the data source, away from centralized cloud servers. This paradigm promises substantial benefits for diverse applications. In the realm of Artificial Intelligence and Machine Learning, Federated Learning emerges as a pioneering technique that harnesses Edge Computing for statistical model training. Federated Learning presents numerous advantages over traditional centralized Machine Learning, including reduced latency, heightened privacy, and real-time data processing. Nonetheless, it introduces concerns regarding energy consumption, particularly for battery-powered Edge devices designed for remote or harsh environments. This study provides a comprehensive assessment of power consumption within the context of Federated Learning operations. To achieve this, a Raspberry Pi 4 and an INA219 current sensor are employed. Results show that, during communication operations, the power consumption of the target device increases from a minimum of 8% to a maximum of 32% with respect to its idle state. During local training operations it increases by up to 32% for a CNN model and by up to 40% for an RNN model.
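The reported figures are relative increases over the idle power draw sampled with the current sensor; the helper below shows the simple computation, with purely illustrative wattage values (not the paper's measurements).

    def percent_increase(active_watts, idle_watts):
        # relative power increase of an operation with respect to the idle state
        return 100.0 * (active_watts - idle_watts) / idle_watts

    # illustrative values only: an assumed idle draw of 2.7 W and a training draw of 3.6 W
    print(round(percent_increase(3.6, 2.7), 1))  # -> 33.3 (% increase over idle)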
Make Federated Learning a Standard in Robotics by Using ROS2
The use of the Federated Learning paradigm could be disruptive in robotics, where data are naturally distributed among teams of agents and centralizing them would increase latency and break privacy. Unfortunately, there is a lack of robot-oriented frameworks for federated learning that use state-of-the-art machine learning libraries. ROS2 (Robot Operating System 2) is a de-facto standard in robotics for building teams of robots in a multi-node, fully distributed manner. In this paper we present the integration of ROS2 with PyTorch, allowing easy training of a global machine learning model starting from a set of local datasets. We present the architecture and the methodology used, and finally discuss the experimental results on a well-known public dataset.
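As an illustration of how such an integration can look, the sketch below defines a hypothetical rclpy node that periodically serializes a local PyTorch model and publishes it on a topic for an aggregator to consume; the node name, topic, message type, and period are assumptions for illustration, not code from the framework described above.

    import io
    import rclpy
    import torch
    from rclpy.node import Node
    from std_msgs.msg import UInt8MultiArray

    class WeightPublisher(Node):
        """Hypothetical ROS2 node that broadcasts local model weights to an aggregator."""
        def __init__(self, model):
            super().__init__('fl_client')
            self.model = model
            self.pub = self.create_publisher(UInt8MultiArray, 'local_weights', 10)
            self.create_timer(30.0, self.publish_weights)  # share weights every 30 s

        def publish_weights(self):
            buf = io.BytesIO()
            torch.save(self.model.state_dict(), buf)  # serialize the local model
            msg = UInt8MultiArray()
            msg.data = list(buf.getvalue())           # raw bytes as a uint8 array
            self.pub.publish(msg)

    def main():
        rclpy.init()
        rclpy.spin(WeightPublisher(torch.nn.Linear(4, 2)))  # toy local model

    if __name__ == '__main__':
        main()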
When Robotics Meets Distributed Learning: the Federated Learning Robotic Network Framework
FedROS: The ROS Framework for Federated Learning on Mobile Edge Devices
Federated Learning is a computing paradigm that shifts the concept of learning from a single to a distributed system. Many applications have been considered in the literature, for example mobile edge computing. In this context, robotics is an emerging trend that benefits in terms of infrastructure optimization, such as resource allocation and communication efficiency, as well as in business solutions. In this poster, we propose a novel framework for submitting FL jobs on ROS-based devices. The framework, called FedROS, composes the containers of FL client and server ROS2 packages programmatically.
Layer-wise Feedback Propagation
In this paper, we present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors that utilizes explainability, specifically Layer-wise Relevance Propagation (LRP), to assign rewards to individual connections based on their respective contributions to solving a given task. This differs from traditional gradient descent, which updates parameters towards an estimated loss minimum. LFP distributes a reward signal throughout the model without the need for gradient computations. It then strengthens structures that receive positive feedback while reducing the influence of structures that receive negative feedback. We establish the convergence of LFP theoretically and empirically, and demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets. Notably, LFP overcomes certain limitations associated with gradient-based methods, such as reliance on meaningful derivatives. We further investigate how the different LRP-rules can be extended to LFP, what their effects are on training, as well as potential applications, such as training models with no meaningful derivatives, e.g., step-function activated Spiking Neural Networks (SNNs), or for transfer learning, to efficiently utilize existing knowledge.
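To illustrate the core idea on a single linear layer, the toy sketch below splits an output feedback signal among connections in proportion to their contributions (an LRP-like rule) and nudges weight magnitudes accordingly; the update rule and numbers are simplifying assumptions, not the authors' LFP implementation.

    import numpy as np

    def lfp_step_linear(w, a, feedback, lr=0.1, eps=1e-9):
        # w: (out, in) weights, a: (in,) activations, feedback: (out,) reward per output
        z = w * a[None, :]                                 # per-connection contributions
        share = z / (z.sum(axis=1, keepdims=True) + eps)   # LRP-like proportional split
        conn_feedback = share * feedback[:, None]          # feedback assigned per connection
        w_new = w + lr * np.sign(w) * conn_feedback        # grow helpful, shrink harmful weights
        return w_new, conn_feedback.sum(axis=0)            # feedback passed to the layer below

    w = np.array([[0.5, -0.2], [0.1, 0.8]])
    a = np.array([1.0, 2.0])
    w_new, lower_feedback = lfp_step_linear(w, a, feedback=np.array([+1.0, -1.0]))
    print(lower_feedback)  # feedback forwarded to the previous layer's units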
DualView: Data Attribution from the Dual Perspective
Local data attribution (or influence estimation) techniques aim at estimating the impact that individual data points seen during training have on particular predictions of an already trained Machine Learning model during test time. Previous methods either do not perform well consistently across different evaluation criteria from the literature, are characterized by a high computational demand, or suffer from both. In this work we present DualView, a novel method for post-hoc data attribution based on surrogate modelling, demonstrating both high computational efficiency and good evaluation results. With a focus on neural networks, we evaluate our proposed technique using suitable quantitative evaluation strategies from the literature against related principal local data attribution methods. We find that DualView requires considerably lower computational resources than other methods, while demonstrating comparable performance to competing approaches across evaluation metrics. Furthermore, our proposed method produces sparse explanations, where sparseness can be tuned via a hyperparameter. Finally, we showcase that with DualView, we can now render explanations from local data attributions compatible with established local feature attribution methods: for each prediction on (test) data points explained in terms of impactful samples from the training set, we are able to compute and visualize how the prediction on the (test) sample relates to each influential training sample in terms of features recognized by the model. We provide an Open Source implementation of DualView online, together with implementations for all other local data attribution methods we compare against, as well as the metrics reported here, for full reproducibility.
Explainable AI for Time Series via Virtual Inspection Layers
The field of eXplainable Artificial Intelligence (XAI) has greatly advanced in recent years, but progress has mainly been made in computer vision and natural language processing. For time series, where the input is often not interpretable, only limited research on XAI is available. In this work, we put forward a virtual inspection layer that transforms the time series to an interpretable representation and allows relevance attributions to be propagated to this representation via local XAI methods like layer-wise relevance propagation (LRP). In this way, we extend the applicability of a family of XAI methods to domains (e.g. speech) where the input is only interpretable after a transformation. Here, we focus on the Fourier transformation, which is prominently applied in the interpretation of time series, and on LRP, and refer to our method as DFT-LRP. We demonstrate the usefulness of DFT-LRP in various time series classification settings like audio and electronic health records. We showcase how DFT-LRP reveals differences in the classification strategies of models trained in different domains (e.g., time vs. frequency domain) or helps to discover how models act on spurious correlations in the data.
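A simplified sketch of the virtual-layer idea: since the inverse DFT is linear, time-domain relevance can be redistributed onto frequency bins in proportion to each bin's contribution to every time step. The proportional rule and the toy relevance used here are illustrative assumptions, not the exact DFT-LRP formulation.

    import numpy as np

    def dft_relevance(x, R_time, eps=1e-9):
        # x: real-valued signal of length N; R_time: relevance per time step
        N = len(x)
        X = np.fft.fft(x)
        basis = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / N
        z = np.real(X[None, :] * basis)        # contribution of frequency k to sample n
        R_freq = ((z / (x[:, None] + eps)) * R_time[:, None]).sum(axis=0)
        return R_freq                          # relevance per frequency bin

    x = np.sin(2 * np.pi * 5 * np.arange(128) / 128)      # pure 5-cycle sine
    R_freq = dft_relevance(x, R_time=x ** 2)              # toy time-domain relevance
    print(np.argsort(np.abs(R_freq))[-2:])                # the two dominant bins: 5 and 123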
Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
Deep Neural Networks are prone to learning and relying on spurious correlations in the training data, which, for high-risk applications, can have fatal consequences. Various approaches to suppress model reliance on harmful features have been proposed that can be applied post-hoc without additional training. While those methods can be applied efficiently, they also tend to harm model performance by globally shifting the distribution of latent features. To mitigate unintended overcorrection of model behavior, we propose a reactive approach conditioned on model-derived knowledge and eXplainable Artificial Intelligence (XAI) insights. While the reactive approach can be applied to many post-hoc methods, we demonstrate the incorporation of reactivity in particular for P-ClArC (Projective Class Artifact Compensation), introducing a new method called R-ClArC (Reactive Class Artifact Compensation). Through rigorous experiments in controlled settings (FunnyBirds) and with a real-world dataset (ISIC2019), we show that introducing reactivity can minimize the detrimental effect of the applied correction while simultaneously ensuring low reliance on spurious features.
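To convey the flavour of reactive correction, the sketch below applies a P-ClArC-style latent projection only when a sample's projection onto the artifact's Concept Activation Vector exceeds a detection threshold; the CAV, threshold, and reference value are illustrative assumptions rather than the paper's exact R-ClArC procedure.

    import numpy as np

    def reactive_suppress(h, cav, clean_reference, threshold):
        # h: latent activation of one sample; cav: unit vector modelling the artifact
        score = h @ cav
        if score <= threshold:                      # artifact not detected: leave h untouched
            return h
        return h - (score - clean_reference) * cav  # pull the artifact direction back to clean

    cav = np.array([1.0, 0.0, 0.0])
    h_artifact = np.array([5.0, 0.3, -0.2])
    print(reactive_suppress(h_artifact, cav, clean_reference=0.5, threshold=2.0))
    # -> [0.5, 0.3, -0.2]: only the artifact component is corrected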
AudioMNIST: Exploring Explainable Artificial Intelligence for Audio Analysis on a Simple Benchmark
Explainable Artificial Intelligence (XAI) is targeted at understanding how models perform feature selection and derive their classification decisions. This paper explores post-hoc explanations for deep neural networks in the audio domain. Notably, we present a novel Open Source audio dataset consisting of 30,000 audio samples of English spoken digits which we use for classification tasks on spoken digits and speakers’ biological sex. We use the popular XAI technique Layer-wise Relevance Propagation (LRP) to identify relevant features for two neural network architectures that process either waveform or spectrogram representations of the data. Based on the relevance scores obtained from LRP, hypotheses about the neural networks’ feature selection are derived and subsequently tested through systematic manipulations of the input data. Further, we take a step beyond visual explanations and introduce audible heatmaps. We demonstrate the superior interpretability of audible explanations over visual ones in a human user study.
Explaining Predictive Uncertainty by Exposing Second-Order Effects
Explainable AI has brought transparency into complex ML black boxes, enabling, in particular, the identification of which features these models use for their predictions. So far, the question of explaining predictive uncertainty, i.e. why a model ‘doubts’, has been scarcely studied. Our investigation reveals that predictive uncertainty is dominated by second-order effects, involving single features or product interactions between them. We contribute a new method for explaining predictive uncertainty based on these second-order effects. Computationally, our method reduces to a simple covariance computation over a collection of first-order explanations. Our method is generally applicable, allowing for turning common attribution techniques (LRP, Gradient × Input, etc.) into powerful second-order uncertainty explainers, which we call CovLRP, CovGI, etc. The accuracy of the explanations our method produces is demonstrated through systematic quantitative evaluations, and the overall usefulness of our method is demonstrated via two practical showcases.
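Following the stated reduction to a covariance over first-order explanations, a minimal sketch (with toy attributions, not the paper's exact CovLRP estimator) looks as follows: the diagonal of the covariance matrix attributes uncertainty to single features, while the off-diagonal entries capture feature interactions.

    import numpy as np

    def second_order_uncertainty(first_order_expls):
        # first_order_expls: (n_members, n_features) attributions, e.g. LRP or Gradient x Input,
        # computed for each ensemble member or Monte Carlo dropout sample
        return np.cov(np.asarray(first_order_expls), rowvar=False)

    expls = np.array([[0.9, 0.1, 0.0],
                      [0.2, 0.1, 0.0],
                      [0.6, 0.1, 0.1]])
    C = second_order_uncertainty(expls)
    print(np.diag(C))  # feature 0 varies most across members -> largest uncertainty share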
Human-Centered Evaluation of XAI Methods
In the ever-evolving field of Artificial Intelligence, a critical challenge has been to decipher the decision-making processes within the so-called “black boxes” in deep learning. Over recent years, a plethora of methods have emerged, dedicated to explaining decisions across diverse tasks. Particularly in tasks like image classification, these methods typically identify and emphasize the pivotal pixels that most influence a classifier’s prediction. Interestingly, this approach mirrors human behavior: when asked to explain our rationale for classifying an image, we often point to the most salient features or aspects. Capitalizing on this parallel, our research embarked on a user-centric study. We sought to objectively measure the interpretability of three leading explanation methods: (1) Prototypical Part Network, (2) Occlusion, and (3) Layer-wise Relevance Propagation. Intriguingly, our results highlight that while the regions spotlighted by these methods can vary widely, they all offer humans a nearly equivalent depth of understanding. This enables users to discern and categorize images efficiently, reinforcing the value of these methods in enhancing AI transparency.
From Hope to Safety: Unlearning Biases of Deep Models via Gradient Penalization in Latent Space
Deep Neural Networks are prone to learning spurious correlations embedded in the training data, leading to potentially biased predictions. This poses risks when deploying these models for high-stakes decision-making, such as in medical applications. Current methods for post-hoc model correction either require input-level annotations, which are only possible for spatially localized biases, or augment the latent feature space, thereby hoping to enforce the right reasons. We present a novel method for model correction on the concept level that explicitly reduces model sensitivity towards biases via gradient penalization. When modeling biases via Concept Activation Vectors, we highlight the importance of choosing robust directions, as traditional regression-based approaches such as Support Vector Machines tend to result in diverging directions. We effectively mitigate biases in controlled and real-world settings on the ISIC, Bone Age, ImageNet and CelebA datasets using VGG, ResNet and EfficientNet architectures.
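A minimal sketch of the gradient-penalization idea: penalize the directional derivative of the target logit along the bias direction in latent space, so that training explicitly reduces sensitivity to the modelled concept. The toy head, CAV, and weighting below are illustrative assumptions, not the paper's implementation.

    import torch

    def bias_gradient_penalty(latent, logits, cav, target_class):
        # squared directional derivative of the target logit along the bias CAV
        target = logits[:, target_class].sum()
        grads = torch.autograd.grad(target, latent, create_graph=True)[0]  # (batch, d)
        return ((grads @ cav) ** 2).mean()

    head = torch.nn.Linear(4, 2)                       # toy classification head on latent features
    h = torch.randn(8, 4, requires_grad=True)
    logits = head(h)
    cav = torch.tensor([1.0, 0.0, 0.0, 0.0])           # assumed unit bias direction
    labels = torch.zeros(8, dtype=torch.long)
    loss = torch.nn.functional.cross_entropy(logits, labels)
    loss = loss + 0.1 * bias_gradient_penalty(h, logits, cav, target_class=0)
    loss.backward()                                    # gradients now discourage bias sensitivity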
Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations
Ensuring both transparency and safety is critical when deploying Deep Neural Networks (DNNs) in high-risk applications, such as medicine. The field of explainable AI (XAI) has proposed various methods to comprehend the decision-making processes of opaque DNNs. However, only a few XAI methods are suitable for ensuring safety in practice, as they heavily rely on repeated labor-intensive and possibly biased human assessment. In this work, we present a novel post-hoc concept-based XAI framework that conveys, besides instance-wise (local), also class-wise (global) decision-making strategies via prototypes. What sets our approach apart is the combination of local and global strategies, enabling a clearer understanding of the (dis-)similarities in model decisions compared to the expected (prototypical) concept use, ultimately reducing the dependence on long-term human assessment. Quantifying the deviation from prototypical behavior not only allows predictions to be associated with specific model sub-strategies, but also enables the detection of outlier behavior. As such, our approach constitutes an intuitive and explainable tool for model validation. We demonstrate the effectiveness of our approach in identifying out-of-distribution samples, spurious model behavior and data quality issues across three datasets (ImageNet, CUB-200, and CIFAR-10), utilizing VGG, ResNet, and EfficientNet architectures.
PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits
The field of mechanistic interpretability aims to study the role of individual neurons in Deep Neural Networks. Single neurons, however, have the capability to act polysemantically and encode for multiple (unrelated) features, which renders their interpretation difficult. We present a method for disentangling polysemanticity of any Deep Neural Network by decomposing a polysemantic neuron into multiple monosemantic “virtual” neurons. This is achieved by identifying the relevant sub-graph (“circuit”) for each “pure” feature. We demonstrate how our approach allows us to find and disentangle various polysemantic units of ResNet models trained on ImageNet. When evaluating feature visualizations using CLIP, our method effectively disentangles representations, improving upon methods based on neuron activations.
XAI-based Comparison of Input Representations for Audio Event Classification
Deep neural networks are a promising tool for Audio Event Classification. In contrast to other data like natural images, there are many sensible and non-obvious representations for audio data, which could serve as input to these models. Due to their black-box nature, the effect of different input representations has so far mostly been investigated by measuring classification performance. In this work, we leverage eXplainable AI (XAI) to understand the underlying classification strategies of models trained on different input representations. Specifically, we compare two model architectures with regard to relevant input features used for Audio Event Detection: one directly processes the signal as the raw waveform, and the other takes in its time-frequency spectrogram representation. We show how relevance heatmaps obtained via Layer-wise Relevance Propagation uncover representation-dependent decision strategies. With these insights, we can make a well-informed decision about the best input representation in terms of robustness and representativity and confirm that the model’s classification strategies align with human requirements.
The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus
One of the unsolved challenges in the field of Explainable AI (XAI) is determining how to most reliably estimate the quality of an explanation method in the absence of ground truth explanation labels. Resolving this issue is of utmost importance as the evaluation outcomes generated by competing evaluation methods (or “quality estimators”), which aim at measuring the same property of an explanation method, frequently present conflicting rankings. Such disagreements can be challenging for practitioners to interpret, thereby complicating their ability to select the best-performing explanation method. We address this problem through a meta-evaluation of different quality estimators in XAI, which we define as “the process of evaluating the evaluation method”. Our novel framework, MetaQuantus, analyses two complementary performance characteristics of a quality estimator: its resilience to noise and reactivity to randomness, thus circumventing the need for ground truth labels. We demonstrate the effectiveness of our framework through a series of experiments, targeting various open questions in XAI such as the selection and hyperparameter optimisation of quality estimators. Our work is released under an open-source license to serve as a development tool for XAI- and Machine Learning (ML) practitioners to verify and benchmark newly constructed quality estimators in a given explainability context. With this work, we provide the community with clear and theoretically-grounded guidance for identifying reliable evaluation methods, thus facilitating reproducibility in the field.
A Fresh Look at Sanity Checks for Saliency Maps
The Model Parameter Randomisation Test (MPRT) is highly recognised in the eXplainable Artificial Intelligence (XAI) community due to its fundamental evaluative criterion: explanations should be sensitive to the parameters of the model they seek to explain. However, recent studies have raised several methodological concerns for the empirical interpretation of MPRT. In response, we propose two modifications to the original test: Smooth MPRT and Efficient MPRT. The former reduces the impact of noise on evaluation outcomes via sampling, while the latter avoids the need for biased similarity measurements by re-interpreting the test through the increase in explanation complexity after full model randomisation. Our experiments show that these modifications enhance the metric reliability, facilitating a more trustworthy deployment of explanation methods.
Explainable concept mappings of MRI: Revealing the mechanisms underlying deep learning-based brain disease classification
Motivation. While recent studies show high accuracy in the classification of Alzheimer’s disease using deep neural networks, the underlying learned concepts have not been investigated.
Goals. To systematically identify changes in brain regions through concepts learned by the deep neural network for model validation.
Approach. Using quantitative R2* maps, we separated Alzheimer’s patients (n=117) from normal controls (n=219) with a convolutional neural network, systematically investigated the learned concepts using Concept Relevance Propagation, and compared these results to a conventional region-of-interest-based analysis.
Results. In line with established histological findings and the region of interest-based analyses, highly relevant concepts were primarily found in and adjacent to the basal ganglia.
Impact. The identification of concepts learned by deep neural networks for disease classification enables validation of the models and could potentially improve reliability.
Detection and Estimation of Gas Sources with Arbitrary Locations based on Poisson's Equation
Accurate estimation of the number and locations of dispersed material sources is critical for optimal disaster response in Chemical, Biological, Radiological, or Nuclear accidents. This paper introduces a novel approach to Gas Source Localization that uses sparse Bayesian learning adapted to models based on Partial Differential Equations for modeling gas dynamics. Using the method of Green’s functions and the adjoint state method, a gradient-based optimization with respect to source location is derived, allowing super-resolving (arbitrary) source locations. By combining the latter with sparse Bayesian learning, a sparse source support can be identified, thus indirectly assessing the number of sources. Simulation results and comparisons with classical sparse estimators for linear models demonstrate the effectiveness of the proposed approach. The proposed sparsity-constrained gas source localization method thus offers a flexible solution for disaster response and robotic exploration in hazardous environments.
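To illustrate the linear-model view behind the approach, the sketch below builds a dictionary of free-space Green's functions of Poisson's equation on a candidate grid and recovers a sparse source vector from noisy sensor readings; Lasso is used here as a simple convex stand-in for the sparse Bayesian learning (and the super-resolving adjoint-based refinement) of the paper, and all geometry and numbers are illustrative.

    import numpy as np
    from sklearn.linear_model import Lasso

    def greens_3d(x, x_src):
        # free-space Green's function of Poisson's equation in 3D: 1 / (4*pi*|x - x'|)
        return 1.0 / (4 * np.pi * np.linalg.norm(x - x_src))

    rng = np.random.default_rng(1)
    grid = np.array([[gx, gy, 0.0] for gx in range(5) for gy in range(5)], dtype=float)
    sensors = rng.uniform(-1.0, 5.0, size=(40, 3))

    # dictionary A[m, j] = G(sensor_m, candidate_j); synthetic data from two true sources
    A = np.array([[greens_3d(s, g) for g in grid] for s in sensors])
    true_sources = np.zeros(len(grid))
    true_sources[6], true_sources[18] = 2.0, 1.0
    y = A @ true_sources + 0.001 * rng.normal(size=len(sensors))

    s_hat = Lasso(alpha=1e-4, positive=True, max_iter=50000).fit(A, y).coef_
    print(np.argsort(s_hat)[-2:])  # the two largest weights should sit at (or next to) 6 and 18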
Evaluating Deep Neural Network-based Fire Detection for Natural Disaster Management
Recently, climate change has led to more frequent extreme weather events, introducing new challenges for Natural Disaster Management (NDM) organizations. This fact makes the employment of modern technological tools, such as Deep Neural Network-based fire detectors, a necessity, as they can assist such organizations in managing these extreme events more effectively. In this work, we argue that the mean Average Precision (mAP) metric that is commonly used to evaluate typical object detection algorithms cannot be trusted for the fire detection task, due to its high dependence on the employed data annotation strategy. This means that the mAP score of a fire detection algorithm may be low even when it predicts fire bounding boxes that accurately enclose the depicted fires. In this direction, a new evaluation metric for fire detection is proposed, denoted as Image-level mean Average Precision (ImAP), which reduces the dependence on the bounding box annotation strategy by rewarding/penalizing bounding box predictions on image level, rather than on bounding box level. Experiments using different object detection algorithms have shown that the proposed ImAP metric reveals the true fire detection capabilities of the tested algorithms more effectively.
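For intuition, here is one simplified, image-level reading of the idea: score each image by its most confident fire detection and compute average precision over images against binary image-level fire labels. This is an illustrative sketch only; the ImAP definition in the paper additionally accounts for how predicted boxes relate to the annotated fire regions.

    import numpy as np

    def image_level_ap(image_scores, image_labels):
        # image_scores: max fire-box confidence per image; image_labels: 1 if the image shows fire
        order = np.argsort(-np.asarray(image_scores, dtype=float))
        labels = np.asarray(image_labels)[order]
        precision_at_hits = np.cumsum(labels) / (np.arange(len(labels)) + 1)
        n_pos = max(int(labels.sum()), 1)
        return float((precision_at_hits * labels).sum() / n_pos)  # mean precision at each hit

    scores = [0.95, 0.80, 0.60, 0.30]
    labels = [1, 0, 1, 0]
    print(image_level_ap(scores, labels))  # 0.833...: (1/1 + 2/3) / 2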
Methodology and elaboration of model for Map of Wildfire Risk
For civil protection purposes, risk is the probability of a calamitous event occurring that may cause harmful effects on the population, residential and productive settlements and infrastructure, within a particular area, in a given period of time. This work was carried out with the aim of establishing the municipal fire danger and risk indices (IR), which define, respectively, the degree of danger and the fire risk computed on a regional basis and referred to each individual municipal territory, exploiting the typical functions of GIS tools and recent advances in the application of artificial intelligence. The longer-term goal is to transform these algorithms into automated processes that can be integrated into platforms capable of returning outputs to end users.
3D-Flood Dataset
The Aristotle University of Thessaloniki (hereinafter, AUTH) created the following dataset, entitled ‘3D-Flood’, within the context of the TEMA project, funded by the European Commission (European Union).
The dataset will be used for the construction of a 3D model of the district of Agios Thomas in Larisa, Greece, after the flood events of 2023. It comprises 795 UAV video frames taken from 4 YouTube videos.
We provide the links for each YouTube video, along with the frame numbers that we kept for each video.
Details on acquiring the dataset can be found here.
Flood Master Dataset
Our Master Flood Dataset consists of flood images selected from different publicly available datasets. The origins of the images are specified in the "sources.csv" file.
The dataset consists of 282 train, 87 validation and 1973 test frames. We provide the frames from the sourced videos and segmentation masks of the flooded areas.
Details on acquiring the dataset can be found here.
Blaze Fire Classification – Segmentation Dataset
The dataset is intended for wildfire image classification and burnt area segmentation tasks for Unmanned Aerial Vehicles. It comprises 5,408 frames of aerial views taken from 56 videos and 2 public datasets: 829 photographs were used from the D-Fire public dataset, and 34 images from the Burned Area UAV public dataset. For the classification task, there are 5 classes (‘Burnt’, ‘Half-Burnt’, ‘Non-Burnt’, ‘Fire’, ‘Smoke’). For the segmentation task, 404 segmentation masks have been created on a subset of the images, assigning to each pixel either the class ‘burnt’ or the class ‘non-burnt’.
Details on acquiring the dataset can be found here.
Mastodon Posts Dataset
The dataset comprises 766 social media posts in Greek from the platform “Mastodon”, spanning the 2023 wildfires in Greece. Each post was annotated internally with the Plutchik-8 emotions.
Details on acquiring the dataset can be found here.