Datasets and Publications

TEMA AIIA_wildfire Dataset

Apostolidis, Apostolos; Zamioudis, Alexandros; Pitas, Ioannis

Publication date: 16/03/2026 - DOI: 10.5281/zenodo.19048073

General description of the dataset

The TEMA AIIA_wildfire dataset is a collection of 2,236 natural disaster images designed for semantic segmentation, focusing on burnt areas, smoke, and fire. It aggregates and standardizes images from three distinct sources: the BLAZE classification dataset (https://aiia.csd.auth.gr/blaze-fire-classification-segmentation-dataset/), Finland’s wildfire dataset and Sardegna’s wildfire dataset. If one uses any part of these datasets in his/her work, he/she is kindly asked to cite the following paper:

M. Siavrakas, C. Papaioannidis and I.Pitas, "BLAZE: A dataset for wildfire and burnt area UAV image classification and segmentation", IEEE International Conference on Image Processing (ICIP), Anchorage, Alaska, USA, 13-17 September, 2025.
https://link.springer.com/article/10.1007/s00521-022-07467-z
https://link.springer.com/article/10.1007/s00521-023-08260-2
https://www.sciencedirect.com/science/article/pii/S0924271623001831
https://zenodo.org/records/7944963

Dataset Structure

The dataset is organized by source, each with standard train/validation splits containing .jpg images and corresponding .png label masks. The corresponding folder for each source are BLAZE (BLAZE1), Finland’s (KAHY) and Sardegna’s (RAS). Labels follow a four-class hierarchy (0: background, 1: burnt, 2: smoke, 3: fire). The final composition is 984 images from BLAZE ( 584 from KAHY, and 668 from RAS, split into 1,527 training and 655 validation images almost a 70 – 30% split.

Details on acquiring the dataset can be found here.

Synthetic Images Flood Scenario

Atos (Spain); Diaz Fragachan, Eleonor

Publication date: 28/01/2026 - DOI: 10.5281/zenodo.18400720

Synthetic RGB image dataset depicting 766 images of floods in varied scenarios generated with AI methods.
This data can be used to train/evaluate flood segmentation models on RGB images.

FloodsScenarios/
├── images/ (image_00001_.jpg → image_00766_.jpg)
├── annotated_images/ (image_00001_.png → image_00766_.png)
└── annotated_masks/ (image_00001_.png → image_00766_.png)

Synthetic Images Fire Scenario

Atos (Spain); Diaz Fragachan, Eleonor

Publication date: 22/01/2026 - DOI: 10.5281/zenodo.18338792

Synthetic RGB image dataset depicting 600 images of forest fires in varied scenarios generated with AI methods.
This data can be used to train/evaluate fire detection models on RGB images.

FireScenarios/
├── images/ (image_00001_.jpg → image_00600_.jpg)
├── annotated_images/ (image_00001_.jpg → image_00600_.jpg)
└── labels_filtered/ (image_00001_.txt → image_00600_.txt)

TEMA AIIA_flood Dataset

Apostolidis, Apostolos; Pitas, Ioannis

Publication date: 26/01/2026 - DOI: 10.5281/zenodo.18377521

General description of the dataset

The dataset for the flood binary segmentation task comprises 720 images consolidated from two sources: Mantoudi and Arthal trials. Mantoudi contributes with 338 images (198 train set, 140 test set) and Arthal with 382 images (230 train set, 152 test set), resulting in an approximate 60% – 40% training-validation distribution. The annotation masks are binary, where pixels are labeled as 0 for background and 1 for floodwater. If one uses any part of these datasets in his/her work, he/she is kindly asked to cite the following papers.

P. Mentesidis, V. Mygdalis and I.Pitas, "Improve Real-time flood segmentation by encoding and distilling foreground information", IEEE International Conference on Image Processing (ICIP), Anchorage, Alaska, USA, 13-17 September, 2025.
A. Gerontopoulos, D. Papaioannou, C. Papaioannidis and I.Pitas, "Real-Time Flood Water Segmentation with Deep Neural Networks", IEEE 25th International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW), Tromsø, Norway, pp. 85-91, 2025

Dataset Structure

It is structured into distinct directories for each source (Mantoudi and Arthal), each containing standard train and validation splits with separate folders for images (.jpg) and labels (.png).

Mora information about the dataset can be found here.

AUW Dataset

Spatharis, Evangelos; Apostolidis, Apostolos; Pitas, Ioannis

Publication date: 23/01/2026 - DOI: 10.5281/zenodo.18351255

General description of the dataset

The sample dataset, called the AUTH-Unreal-Wildfire (AUW) dataset, is a synthetic collection created to advance deep learning for wildfire segmentation. It addresses the critical challenge of obtaining accurately annotated training data in natural disaster management by using a novel, open-source pipeline built with the AirSim simulator. This pipeline uniquely integrates a custom particle segmentation camera and Procedural Content Generation (PCG) tools to produce photorealistic wildfire images paired with precise pixel-level segmentation masks—a feature previously difficult to achieve since fire assets are typically particle-based without a defined 3D mesh. The dataset consists of 1,500 training and 200 test images and was specifically designed to train and evaluate state-of-the-art segmentation models like PIDNet, both on its own and as a data augmentation resource to enhance performance on real-world wildfire imagery.

For a comprehensive explanation of the methodology and tools used to create this synthetic dataset, please refer to the full conference paper. This work is formally published and should be cited as follows: E. Spatharis, C. Papaioannidis, V. Mygdalis and I. Pitas, “UNREALFIRE: A synthetic dataset creation pipeline for annotated fire imagery in Unreal Engine”, IEEE International Conference on Image Processing (ICIP), Workshop on Bridging the Gap: Advanced Data Processing for Natural Disaster Management – Integrating Visual and Non-Visual Insights, Anchorage, Alaska, USA, 13-17 September, 2025. The paper is available at: https://aiia.csd.auth.gr/wp-content/uploads/2025/12/SPATHARIS_ICIP_2025.pdf and at https://zenodo.org/records/18198757 .

If one uses any part of these datasets in his/her work, he/she is kindly asked to cite the following paper:

E. Spatharis, C. Papaioannidis, V. Mygdalis and I.Pitas, "UNREALFIRE: A synthetic dataset creation pipeline for annotated fire imagery in Unreal Engine", IEEE International Conference on Image Processing (ICIP), Workshop on Bridging the Gap: Advanced Data Processing for Natural Disaster Management – Integrating Visual and Non-Visual Insights, Anchorage, Alaska, USA, 13-17 September, 2025

Dataset Structure

The dataset is organized into two primary directories representing the training and test sets, each of which contains the corresponding images and their annotation labels.

Details on acquiring the dataset can be found here

Geo-social media and AI for Early WarningA data source for multifaceted spatiotemporal information

GANDHI, SHAILY; Schmidt, Sebastian; Hanny, David; Keskin, Merve; Resch, Bernd

Publication date: 25/05/2024 - DOI: 10.5281/zenodo.18241207

Leveraging Collective Knowledge for Forest Fire Classification

Kaimakamidis, Anestis; Pitas, Ioannis

Publication date: 15/01/2026 - DOI: 10.1109/ISCC61673.2024.10733691

This paper presents a novel Fire Classification Multi-Agent (FCMA) framework that utilizes peer-to-peer learning and distributed learning techniques to disseminate knowledge within the agent community. Furthermore, we define and introduce the architecture of a Deep Neural Network (DNN) agent, which can infinitely interact with other DNN agents and the external environment upon deployment. The FCMA framework is suitable for natural disaster management systems where multiple agents are required to run autonomously and foster the community’s knowledge. The FCMA provides two options for knowledge transfer, a peer-to-peer and a federated one. The experimental results display the effective knowledge transfer using both options and also compare the two options with each other in a forest fire classification setting.

An Aspect-Based Emotion Analysis Approach on Wildfire-Related Geo-Social Media Data—A Case Study of the 2020 California Wildfires

Zorenböhmer, Christina; GANDHI, SHAILY; Schmidt, Sebastian; Resch, Bernd

Publication date: 04/04/2025 - DOI: 10.3390/ijgi14080301

Natural disasters like wildfires pose significant threats to communities, which necessitates timely and effective disaster response strategies. While Aspect-based Sentiment Analysis (ABSA) has been widely used to extract sentiment-related information at the sub-sentence level, the corresponding field of Aspect-based Emotion Analysis (ABEA) remains underexplored due to dataset limitations and the increased complexity of emotion classification. In this study, we used EmoGRACE, a fine-tuned BERT-based model for ABEA, which we applied to georeferenced tweets of the 2020 California wildfires. The results for this case study reveal distinct spatio-temporal emotion patterns for wildfire-related aspect terms, with fear and sadness increasing near wildfire perimeters. This study demonstrates the feasibility of tracking emotion dynamics across disaster-affected regions and highlights the potential of ABEA in real-time disaster monitoring. The results suggest that ABEA can provide a nuanced understanding of public sentiment during crises for policymakers.

Enhancing Disaster Response with Social Media Analytics

GANDHI, SHAILY; Schmidt, Sebastian; Hanny, David; Keskin, Merve; Resch, Bernd

Publication date: 14/01/2026 - DOI: 10.5281/zenodo.18241277

An aspect-based emotion analysis approach on wildfire-related geo-social media data

GANDHI, SHAILY; Zorenböhmer, Christina; Schmidt, Sebastian; Resch, Bernd

Publication date: 28/05/2025 - DOI: 10.5281/zenodo.18241370

Datasets and Publications

Publications about the project