
Publications about the project

Project publications are originally hosted in a Zenodo community. See the project's community page for details.

Few-Shot Learning for Relevance Classification of Textual Social Media Posts in Disaster Response

Publication date: 26/06/2025 - DOI: 10.5281/zenodo.18234131
Social media can provide real-time insights during natural disasters, yet efficiently identifying relevant content remains a challenge due to the reliance on large labelled datasets and high computational costs. This study therefore investigates the potential of Few-Shot Learning (FSL) for relevance classification of textual social media posts during disasters. We compare few-shot prompting using eight Small Language Models (SLMs) and a contrastive learning approach (SetFit) with data from five disasters across the world: the 2020 California wildfires, 2021 Ahr Valley floods, 2023 Chile wildfires, 2023 Emilia-Romagna floods, and 2023 Turkey/Syria earthquake. GPT-4o-mini achieves the highest average macro F1 score (0.77) using just five labelled examples per class, while the multilingual-e5-base model fine-tuned with SetFit offers a strong alternative (avg. macro F1 = 0.65) without reliance on prompt engineering. Our findings highlight the potential of SLMs and FSL for scalable and resource-efficient data analytics in disaster management and broader social science research.
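The macro F1 scores quoted above average the per-class F1 values without weighting by class size, so a rare "relevant" class counts as much as a frequent "irrelevant" one. The following is a minimal sketch of that metric, not the study's evaluation code; the function name `macro_f1` and the label strings are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only; these are not data from the study.
y_true = ["relevant", "relevant", "irrelevant", "irrelevant", "irrelevant", "irrelevant"]
y_pred = ["relevant", "irrelevant", "irrelevant", "irrelevant", "irrelevant", "irrelevant"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case the "relevant" class scores 2/3 and the "irrelevant" class 8/9, so the macro average is 7/9 even though plain accuracy is 5/6.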

STRUCTURED EFFICIENT SELF-ATTENTION SHOWCASED ON DETR-BASED DETECTORS

Militsis, Nikolaos Marios; Mygdalis, Vasileios; Pitas, Ioannis
Publication date: 07/01/2025 - DOI: 10.5281/zenodo.14608445

© 2025 N. Militsis, V. Mygdalis, I. Pitas. This is the authors' version of the work. It is posted here for your personal use. Not for redistribution.

The Multi-Head Self-Attention (MHSA) mechanism stands as the cornerstone of Transformer architectures, endowing them with unparalleled expressive capabilities. The main learnable parameters in a Transformer self-attention block are the matrices that project the input features into subspaces, where similarity metrics are then calculated. In this paper, we argue that good projections can be achieved with fewer learnable parameters. We propose the Structured Efficient Self-Attention (SESA) module, a generic paradigm inspired by the Johnson-Lindenstrauss (JL) lemma, that employs an Adaptive Fast JL Transform (A-FJLT) parameterised by a single learnable vector for each projection. This allows us to eliminate a substantial 75% of the learnable parameters of the legacy MHSA, with very slight sacrifices to accuracy. SESA properties are showcased on the demanding task of object detection on the COCO dataset, achieving performance comparable to that of its computationally intensive counterparts.
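To make the idea of a projection "parameterised by a single learnable vector" concrete, here is a rough sketch in the spirit of a fast JL transform: the input is scaled elementwise by a learnable vector and then mixed by a fixed Walsh-Hadamard transform. This is not the paper's A-FJLT, whose exact construction is not given in the abstract. The names `fwht`, `structured_projection` and `parameter_saving` are hypothetical, and the assumption that three of the four MHSA projection matrices are replaced is only one way to arrive at a figure close to 75%.

```python
import math


def fwht(vec):
    """Orthonormal fast Walsh-Hadamard transform; length must be a power of two."""
    out = list(vec)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def structured_projection(x, learnable_diag):
    """Compute y = H (d * x): d is the only learnable part, H is fixed."""
    if len(x) != len(learnable_diag):
        raise ValueError("x and learnable_diag must have the same length")
    return fwht([d * v for d, v in zip(learnable_diag, x)])


def parameter_saving(dim, replaced=3, total=4):
    """Fraction of parameters removed when `replaced` of `total` dense
    dim x dim projections become length-dim vectors (assumed setup)."""
    dense = total * dim * dim
    structured = (total - replaced) * dim * dim + replaced * dim
    return 1.0 - structured / dense


x = [1.0, 2.0, 3.0, 4.0]
d = [1.0, -1.0, 1.0, -1.0]  # sign vector: the projection preserves the norm
y = structured_projection(x, d)
print(sum(v * v for v in x), sum(v * v for v in y))
print(parameter_saving(256))
```

Under these assumptions, a dense projection costs dim × dim parameters while the structured one costs only dim, which is why the saving approaches 75% as dim grows.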

These Maps Are Made by Propagation: Adapting Deep Stereo Networks to Road Scenarios with Decisive Disparity Diffusion

Chuang-Wei Liu; Yikang Zhang; Qijun Chen; Ioannis Pitas; Rui Fan
Publication date: 06/11/2024 - DOI: 10.48550/arXiv.2411.03717

Stereo matching has emerged as a cost-effective solution for road surface 3D reconstruction, garnering significant attention towards improving both computational efficiency and accuracy. This article introduces decisive disparity diffusion (D3Stereo), marking the first exploration of dense deep feature matching that adapts pre-trained deep convolutional neural networks (DCNNs) to previously unseen road scenarios. A pyramid of cost volumes is initially created using various levels of learned representations. Subsequently, a novel recursive bilateral filtering algorithm is employed to aggregate these costs. A key innovation of D3Stereo lies in its alternating decisive disparity diffusion strategy, wherein intra-scale diffusion is employed to complete sparse disparity images, while inter-scale inheritance provides valuable prior information for higher resolutions. Extensive experiments conducted on our created UDTIRI-Stereo and Stereo-Road datasets underscore the effectiveness of the D3Stereo strategy in adapting pre-trained DCNNs and its superior performance compared to all other explicit programming-based algorithms designed specifically for road surface 3D reconstruction. Additional experiments conducted on the Middlebury dataset with backbone DCNNs pre-trained on the ImageNet database further validate the versatility of the D3Stereo strategy in tackling general stereo matching problems.

3D-Flood Dataset

Publication date: 27/05/2024 - DOI: 10.5281/zenodo.11349721

The Aristotle University of Thessaloniki (hereinafter, AUTH) created the following dataset, entitled ‘3D-Flood’, within the context of the project TEMA that was funded by the European Commission-European Union.

The dataset will be used for the construction of a 3D model of the district of Agios Thomas in Larisa, Greece, after the flood events of 2023. It comprises 795 UAV video frames, taken from 4 YouTube videos.

We provide the links for each YouTube video, along with the frame numbers that we kept for each video.

Details on acquiring the dataset can be found here.

Flood Master Dataset

Kitsos, Filippos; Zamioudis, Alexandros
Publication date: 06/06/2024 - DOI: 10.5281/zenodo.11501494

Our Master Flood Dataset consists of flood images picked from different publicly available datasets. The origins of the images are specified in the "sources.csv" file.

The dataset consists of 282 train, 87 validation and 1973 test frames. We provide the frames from the sourced videos and segmentation masks of the flooded areas.

Details on acquiring the dataset can be found here.

Blaze Fire Classification – Segmentation Dataset

Michalis, Siamvrakas; Kitsos, Filippos
Publication date: 06/06/2024 - DOI: 10.5281/zenodo.11501836

The dataset is intended for wildfire image classification and burnt area segmentation tasks for Unmanned Aerial Vehicles. It comprises 5,408 frames of aerial views taken from 56 videos and 2 public datasets: 829 photographs from the D-Fire public dataset and 34 images from the Burned Area UAV public dataset. For the classification task, there are 5 classes (‘Burnt’, ‘Half-Burnt’, ‘Non-Burnt’, ‘Fire’, ‘Smoke’). For the segmentation task, 404 segmentation masks have been created on a subset of the frames; each mask assigns every pixel of the image to the class ‘burnt’ or the class ‘non-burnt’.

Details on acquiring the dataset can be found here.

Mastodon Posts Dataset

Avgoustidis, Fotios; Giannouris, Polydoros; Kitsos, Filippos
Publication date: 06/06/2024 - DOI: 10.5281/zenodo.11502116

The dataset comprises 766 social media posts in the Greek language from the platform “Mastodon”, spanning the 2023 wildfires in Greece. Each post was annotated internally with Plutchik-8 emotions.

Details on acquiring the dataset can be found here.