Publications about the project
Augmented Images: Flood Scenario
RGB dataset of 218 high-resolution real drone images of the town of Altenahr after the 2021 floods, provided by DLR, and 218 augmented images in which floods, debris and trapped persons were added to the real images using AI methods.
Floods_original.zip contains the original 218 jpg images with metadata from the drone.
Floods_augmented.zip contains the augmented 218 jpg images with the same metadata.
Flood and Fire Real Description Generated Synthetic Dataset
Synthetic RGB image dataset of 550 forest fire images and 500 flood images, generated with AI methods from descriptions of real images.
This data can be used to train and evaluate fire and flood detection models on RGB images.
FireDescription.txt contains the 11 descriptions of real images that were used as prompts for the fire images
FloodsDescription.txt contains the 10 descriptions of real images that were used as prompts for the flood images
DescriptionDataset/
├── FireJPG/ (fire1__00001_.jpg → fire11__00050_.jpg)
├── FloodsJPG/ (flood1__00001_.jpg → flood10__00050_.jpg)
├── FireDescription.txt
└── FloodsDescription.txt
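Assuming each line of the description files holds one prompt and the filenames follow the pattern shown above, every generated image can be mapped back to the prompt that produced it. The helper below is a hypothetical sketch, not part of the dataset:

```python
import re
from pathlib import Path

# Filenames look like "fire3__00017_.jpg": category, 1-based prompt index,
# then a zero-padded sample counter.
NAME_RE = re.compile(r"(fire|flood)(\d+)__(\d+)_\.jpg")

def parse_name(filename):
    """Split a generated-image filename into (category, prompt_idx, sample)."""
    m = NAME_RE.fullmatch(filename)
    if m is None:
        return None
    category, prompt_idx, sample = m.groups()
    return category, int(prompt_idx), int(sample)

def index_dataset(root):
    """Pair each image under DescriptionDataset/ with its generating prompt.

    Assumes line N of FireDescription.txt is the prompt for fireN__*.jpg
    (and analogously for floods).
    """
    entries = []
    for sub, txt in (("FireJPG", "FireDescription.txt"),
                     ("FloodsJPG", "FloodsDescription.txt")):
        prompts = (Path(root) / txt).read_text().splitlines()
        for img in sorted((Path(root) / sub).glob("*.jpg")):
            parsed = parse_name(img.name)
            if parsed:
                _category, prompt_idx, _sample = parsed
                entries.append((img, prompts[prompt_idx - 1]))
    return entries
```

For example, `parse_name("flood10__00050_.jpg")` yields `("flood", 10, 50)`, i.e. the 50th sample generated from the 10th flood description.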
Unsupervised Multimodal Graph-based Model for Geo-social Analysis
The systematic analysis of user-generated social media content, especially when enriched with geospatial context, plays a vital role in domains such as disaster management and public opinion monitoring. Although multimodal approaches have made significant progress, most existing models remain fragmented, processing each modality separately rather than integrating them into a unified end-to-end model. To address this, we propose an unsupervised, multimodal graph-based methodology that jointly embeds semantic and geographic information into a shared representation space. The proposed methodology comprises two architectural paradigms: a mono-graph (MonoGraph) model that jointly encodes both modalities, and a multi-graph (MultiGraph) model that separately models semantic and geographic relationships and subsequently integrates them through multi-head attention mechanisms. A composite loss, combining contrastive, coherence, and alignment objectives, guides the learning process to produce semantically coherent and spatially compact clusters. Experiments on four real-world disaster datasets demonstrate that our models consistently outperform existing baselines in topic quality, spatial coherence, and interpretability. Inherently domain-independent, the framework can be readily extended to diverse forms of multimodal data and a wide range of downstream analysis tasks.
Enhancing satellite-based emergency mapping: Identifying wildfires through geo-social media analysis
When a disaster emerges, timely acquisition of information is crucial for a rapid situation assessment. Although automation in the standard satellite-based emergency mapping workflow has advanced, delays still occur at crucial steps. In order to speed up the provision of satellite-based crisis products to emergency managers, this paper proposes a geo-social media-based approach that detects disaster events based on the spatio-temporal analysis of georeferenced, disaster-related Tweets. The proposed methodology is validated on the basis of two use cases: wildfires in Chile and British Columbia. The results show the general ability of Twitter to forecast events several days in advance, at least for the Chile use case. However, there are large spatial differences, as there is a correlation between population density and the reliability of Twitter data. Consequently, only a few meaningful alerts could be generated for British Columbia, an area with very low population numbers.
Multimodal GeoAI: An integrated spatio-temporal topic-sentiment model for the analysis of geo-social media posts for disaster management
A multimodal GeoAI approach to combining text with spatiotemporal features for enhanced relevance classification of social media posts in disaster response
Geo-referenced social media data supports disaster management by offering real-time insights through user-generated content. To identify critical information amid high volumes of noise, classifying the relevance of posts is essential. Most existing methods primarily use textual features, neglecting spatial and temporal context despite its importance in determining relevance. This study proposes a multimodal approach that integrates text with spatiotemporal features for relevance classification of geo-referenced social media posts. We evaluate our method on 4,574 manually labelled posts from five disasters: the 2020 California wildfires, 2021 Ahr Valley floods, 2023 Chile wildfires, 2023 Turkey earthquake and 2023 Emilia-Romagna floods. Labels were assigned based on text, geographic location and time. Our spatiotemporal features include proximity to disaster impact sites, local co-occurrences with disaster-related posts, event type and geographic context. When utilised on their own, they achieved a macro F1 score of 0.713 with a random forest classifier. A fine-tuned TwHIN-BERT-base model using only text scored 0.779. For multimodal classification, we tested feature concatenation, in-context learning, stacking and partial stacking. Partial stacking produced the highest macro F1 score (0.814). Our multilingual, context-aware classification approach lays the groundwork for more integrated GeoAI applications in disaster management, the social sciences and beyond.
Clustering-Based Joint Topic-Sentiment Modeling of Social Media Data: A Neural Networks Approach
With the vast amount of social media posts available online, topic modeling and sentiment analysis have become central methods to better understand and analyze online behavior and opinion. However, semantic and sentiment analysis have rarely been combined for joint topic-sentiment modeling, which yields semantic topics associated with sentiments. Recent breakthroughs in natural language processing have also not been leveraged for joint topic-sentiment modeling so far. Inspired by these advancements, this paper presents a novel framework for joint topic-sentiment modeling of short texts based on pre-trained language models and a clustering approach. The method leverages techniques from dimensionality reduction and clustering for which multiple algorithms were considered. All configurations were experimentally compared against existing joint topic-sentiment models and an independent sequential baseline. Our framework produced clusters with semantic topic quality scores of up to 0.23 while the best score among the previous approaches was 0.12. The sentiment classification accuracy increased from 0.35 to 0.72 and the uniformity of sentiments within the clusters reached up to 0.9 in contrast to the baseline of 0.56. The presented approach can benefit various research areas such as disaster management, where sentiments associated with topics can provide practically useful information.
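The pipeline shape described above (embed short texts, reduce dimensionality, cluster) can be illustrated with lightweight stand-ins. The paper uses pre-trained language model embeddings; this sketch substitutes TF-IDF vectors and fixed toy posts so it stays dependency-light:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

# Toy short texts: two flood-related and two fire-related posts.
texts = [
    "flood water in the street",
    "heavy flood water downtown",
    "fire smoke over the hills",
    "forest fire smoke tonight",
]

# 1) Embed (TF-IDF stands in for language-model embeddings).
emb = TfidfVectorizer().fit_transform(texts)
# 2) Reduce dimensionality.
reduced = TruncatedSVD(n_components=2, random_state=0).fit_transform(emb)
# 3) Cluster the reduced embeddings into topic groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
```

In the joint topic-sentiment setting, a sentiment score per post would be appended to (or fused with) the embedding before clustering, so each cluster carries both a topic and a dominant sentiment.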
Multimodal Geo-Information Extraction from Social Media for Supporting Decision-Making in Disaster Management
Effective decision-making in natural disaster management relies heavily on a comprehensive understanding of the situation in affected areas. Social media has been established as a tool to monitor human response and damage assessment. Given the vast amounts of data available, computational methods such as topic modelling are typically employed to reduce information complexity. However, these methods mostly neglect aspects such as geographic location and emotional response, which frequently results in sequential workflows of initial semantic filtering and subsequent spatial or spatio-temporal analysis. This study presents a novel approach for multimodal information extraction from geo-social media data for aiding decision support in disaster management. The method leverages a spatial, temporal, semantic, and sentiment-based clustering approach of social media posts to extract clusters that provide insights into disaster-related content. A case study in the Ahr Valley region in Germany demonstrates the method’s effectiveness in providing actionable insights for disaster response and management. The approach offers a tool for the quick assessment of disaster-related information from social media, potentially aiding timely and informed decision-making.
Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning
Social media can provide essential information for emergency response during natural disasters in near real-time. However, identifying disaster-related posts among the large amount of unstructured data available is a difficult task. Previous methods often use keyword filtering, topic modelling or classification-based techniques to identify such posts. Active Learning (AL) is a promising sub-field of Machine Learning (ML) that has rarely been used for text classification of social media content. This study therefore investigates the potential of AL for identifying disaster-related Tweets. We compare a keyword filtering approach, a RoBERTa model fine-tuned with generic data from CrisisLex, a base RoBERTa model trained with AL and a fine-tuned RoBERTa model trained with AL regarding classification performance. For testing, data from CrisisLex and manually labelled data from the 2021 flood in Germany and the 2023 Chile forest fires were considered. The results show that generic fine-tuning combined with 10 rounds of AL outperformed all other approaches. Consequently, a broadly applicable model for the identification of disaster-related Tweets could be trained with very little labelling effort. The model can be applied to use cases beyond this study and provides a useful tool for further research in social media analysis.
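A generic uncertainty-sampling loop (not the paper's RoBERTa setup) illustrates the AL idea of querying labels only for the pool items the current model is least certain about. Everything here is synthetic: logistic regression stands in for the classifier, random vectors for the Tweet features, and a hidden rule for the human annotator:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 5))
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)  # hidden "oracle" labels

labelled = list(range(10))             # small labelled seed set
unlabelled = set(range(10, 500))

for _ in range(10):                    # 10 AL rounds, as in the study
    clf = LogisticRegression().fit(X_pool[labelled], y_pool[labelled])
    cand = sorted(unlabelled)
    probs = clf.predict_proba(X_pool[cand])[:, 1]
    uncertainty = -np.abs(probs - 0.5)  # closest to 0.5 = most uncertain
    picks = [cand[i] for i in np.argsort(uncertainty)[-5:]]  # query 5 labels
    labelled.extend(picks)
    unlabelled.difference_update(picks)

accuracy = clf.score(X_pool, y_pool)
```

The key point is the query strategy: each round spends the labelling budget where the model's predicted probability is nearest 0.5, which is why AL can reach strong performance with very few labels.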
Assessing the spatial accuracy of geocoding flood-related imagery using Vision Language Models
While the capabilities of large language models and visual language models for various classification tasks have advanced significantly, their potential for location inference remains largely underexplored. Therefore, this study evaluates the performance of four prominent models — BLIP-2, LLaVA1.6, OpenFlamingo, and GPT-4o — for geocoding flood-related images from Flickr. Model inferences are compared against the original photo locations and human-labelled assessments. Our findings reveal that GPT-4o achieves the highest spatial accuracy (median deviation of 89.12 km). OpenFlamingo geocodes the highest number of images (90.7%), albeit with fluctuating quality (median 408.35 km), while still outperforming the human annotators. LLaVA1.6 geocodes only 18.9% of all images, while BLIP-2 exhibits the highest median deviation (1,781 km). We observe a spatial bias in our results, with inferences being most accurate in Central Europe. Additionally, model results improve when images feature recognisable landmarks. The proposed workflow could significantly increase the amount of geocoded web-based data available for disaster management, though further research is required to enhance accuracy across diverse geographic contexts.
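Spatial deviations like the medians reported above are typically computed as great-circle distances between the inferred coordinates and the original photo location. A minimal haversine sketch (the coordinate values below are merely illustrative):

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Deviation between a model's inferred location and the Flickr ground truth,
# e.g. Cologne (50.94, 6.96) vs Bonn (50.73, 7.10) is roughly 25 km.
deviation = haversine_km(50.94, 6.96, 50.73, 7.10)
```

Aggregating such per-image deviations with a median, rather than a mean, keeps a few extreme mislocations (like BLIP-2's 1,781 km) from dominating the comparison.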