Synthetic Images in People Detection: Balancing Effectiveness and Ethical Imperatives

Eleonor Diaz, Computer Vision Engineer at Atos Spain.

In recent years, the development and refinement of artificial intelligence (AI) technologies have revolutionised various fields, including computer vision. One notable application within computer vision is the detection and reidentification of individuals, a crucial task in surveillance, security, and social applications. With the advent of synthetic image generation techniques, researchers and practitioners have increasingly turned to synthetic data to train and improve people detection and reidentification models. This article explores the use of synthetic images in this context, highlighting their benefits, challenges, and future implications.

Benefits of Synthetic Images in Training People Detection Models:

  • Diversity and Variability: Synthetic image generation enables the creation of diverse and realistic datasets, encompassing various scenarios, lighting conditions, and backgrounds. This diversity helps in training robust people detection models capable of handling real-world complexities.
  • Annotation Efficiency: Annotating real-world images with ground truth labels for training purposes can be labour-intensive and expensive. Synthetic datasets alleviate this challenge by providing readily annotated data, facilitating the training process and reducing annotation costs.
  • Control Over Data Distribution: Synthetic images offer unparalleled control over the distribution of data, allowing researchers to simulate specific scenarios and conditions relevant to their application. This control enhances the adaptability and generalisation of people detection models across different environments.
  • Privacy Preservation: One of the foremost benefits of utilising synthetic images in training people detection and reidentification models lies in privacy preservation. Traditional methods of collecting real-world images for training datasets often involve capturing individuals' faces and personal attributes, raising significant privacy concerns. By leveraging synthetic images, which are generated programmatically and do not depict real individuals, privacy risks associated with the use of personal data are mitigated. This approach ensures that sensitive information about individuals, such as their appearance, identity, and behaviour, remains safeguarded throughout the model training process.
  • Anonymity and Confidentiality: Synthetic image generation techniques allow for the creation of anonymised representations of individuals, eliminating the need to rely on real-world images that may contain identifiable information. As a result, privacy-sensitive data, such as facial features and biometric identifiers, are not directly exposed or shared during the model training phase. This anonymisation process preserves the confidentiality of individuals' identities and reduces the risk of unauthorised access or misuse of personal information, aligning with principles of data protection and privacy regulation.

  • Ethical Considerations: In addition to legal compliance, the use of synthetic images addresses ethical considerations surrounding the collection and use of personal data in AI applications. By minimising the reliance on real-world images, which may infringe upon individuals' privacy rights and autonomy, synthetic data-driven approaches uphold ethical principles of respect, fairness, and consent. This ethical framework prioritises the well-being and dignity of individuals while still enabling the development of effective people detection and reidentification models for various societal and security applications. 
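
As an illustration of the annotation-efficiency and distribution-control points above, the following minimal sketch samples scene parameters from a chosen distribution and derives the bounding-box label directly from the placement, with no manual annotation. The function name `sample_scene`, the lighting categories and their weights, and the image and person sizes are illustrative assumptions, not details of the pipeline described in this article.

```python
import random

# Hypothetical target distribution over lighting conditions (assumed values).
LIGHTING_WEIGHTS = {"day": 0.5, "dusk": 0.3, "night": 0.2}


def sample_scene(rng, img_w=640, img_h=480, person_w=60, person_h=160):
    """Sample one synthetic scene and return its parameters plus a label.

    Because the generator decides where the person is placed, the
    bounding box is known exactly and needs no manual annotation.
    """
    lighting = rng.choices(
        list(LIGHTING_WEIGHTS), weights=list(LIGHTING_WEIGHTS.values())
    )[0]
    # Top-left corner chosen so the person lies fully inside the image.
    x = rng.randint(0, img_w - person_w)
    y = rng.randint(0, img_h - person_h)
    # Normalised (cx, cy, w, h) box, a common detection label layout.
    bbox = (
        (x + person_w / 2) / img_w,
        (y + person_h / 2) / img_h,
        person_w / img_w,
        person_h / img_h,
    )
    return {"lighting": lighting, "top_left": (x, y), "bbox": bbox}


rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(1000)]
```

Changing the weights in `LIGHTING_WEIGHTS` shifts the composition of the dataset, which is the kind of control that is difficult to obtain when collecting real footage.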

Rather than relying solely on real-world data containing sensitive information about individuals, our approach uses synthetic images depicting fictitious persons. These synthetic personas serve as anonymised representations, ensuring the confidentiality and privacy of individuals' identities throughout the model training process.

Our methodology involves the creation of multiple images for each fictitious persona, encompassing various poses, orientations, and scenarios. By extracting fundamental concepts and characteristics from each synthetic individual, we generate a diverse array of images, thereby enriching the dataset with a wide range of visual attributes. This approach not only enhances the robustness and generalisation capabilities of person reidentification models but also mitigates the risk of privacy infringement associated with the use of real-world data. 
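
The multi-image-per-persona step can be sketched as follows. Fixed identity attributes are held constant while pose, orientation, and scenario vary, and each resulting specification carries the persona identifier that a reidentification model would use as its label. The attribute vocabularies, the prompt template, and the `Persona` and `build_specs` names are hypothetical; the article does not specify which image generator or prompt format is used.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    persona_id: int
    description: str  # identity traits held constant across all images


# Hypothetical vocabularies for the attributes that vary between images.
POSES = ["standing", "walking", "sitting"]
ORIENTATIONS = ["front view", "side view", "back view"]
SCENARIOS = ["city street", "train station", "office lobby"]


def build_specs(persona, n_images, rng):
    """Return n_images distinct generation specs for a single persona."""
    combos = list(itertools.product(POSES, ORIENTATIONS, SCENARIOS))
    if n_images > len(combos):
        raise ValueError("not enough distinct combinations")
    specs = []
    for pose, orientation, scenario in rng.sample(combos, n_images):
        prompt = f"{persona.description}, {pose}, {orientation}, {scenario}"
        # persona_id is the identity label shared by every image of this persona.
        specs.append({"persona_id": persona.persona_id, "prompt": prompt})
    return specs


persona = Persona(1, "a fictitious adult with short dark hair and a red jacket")
specs = build_specs(persona, n_images=5, rng=random.Random(42))
```

Each entry in `specs` would be handed to an image generator of choice; sampling without replacement keeps the images of one persona from repeating the same pose, orientation, and scenario combination.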

Once a comprehensive set of images has been assembled for the synthetic personas, we apply outpainting, a technique that extends an image beyond its original borders, to further enhance diversity and realism. By randomising the scenes in which the synthetic individuals are depicted, we ensure that the generated images capture a broad spectrum of environmental contexts and visual conditions. This process adds another layer of variability to the dataset, facilitating more effective training of person reidentification models while preserving the anonymity and confidentiality of individuals' identities. The following is an example of this approach with a persona:

Figure 1: Image depicting the Persona

Figure 2: Examples of images generated with the characteristics of the Persona


Figure 3: Resulting Images after the outpainting process
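
The outpainting step shown in Figure 3 can be sketched, under stated assumptions, as placing a persona crop at a random position on a larger canvas and building a mask that marks which pixels a generative model should fill. Only the preparation is shown: the call to an outpainting model is omitted because the article does not name the one used, and `prepare_outpainting`, the grey-level pixel representation, and the sizes are illustrative.

```python
import random


def prepare_outpainting(crop, canvas_w, canvas_h, rng, fill=0):
    """Place `crop` (a list of pixel rows) on a blank canvas at a random spot.

    Returns (canvas, mask, top_left). mask[y][x] is 1 where an outpainting
    model should generate new content and 0 where the persona is preserved.
    """
    crop_h, crop_w = len(crop), len(crop[0])
    if crop_w > canvas_w or crop_h > canvas_h:
        raise ValueError("crop does not fit on the canvas")
    # Random offset so the persona appears in varied positions in the scene.
    x0 = rng.randint(0, canvas_w - crop_w)
    y0 = rng.randint(0, canvas_h - crop_h)
    canvas = [[fill] * canvas_w for _ in range(canvas_h)]
    mask = [[1] * canvas_w for _ in range(canvas_h)]
    for dy, row in enumerate(crop):
        for dx, pixel in enumerate(row):
            canvas[y0 + dy][x0 + dx] = pixel
            mask[y0 + dy][x0 + dx] = 0
    return canvas, mask, (x0, y0)


# Toy 2x3 grey-level "persona crop" on an 8x6 canvas.
crop = [[200, 210, 220], [230, 240, 250]]
canvas, mask, (x0, y0) = prepare_outpainting(crop, 8, 6, random.Random(7))
```

The canvas and mask would then be passed, together with a randomly chosen scene description, to whichever outpainting model is in use, so that the persona pixels are preserved while the surrounding scene varies.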



In conclusion, the use of synthetic images in training people detection and reidentification models offers many benefits, including enhanced diversity, annotation efficiency, control over data distribution, and, most importantly, privacy preservation. By leveraging synthetic data, researchers and practitioners can develop robust and adaptable models capable of addressing real-world complexities without compromising individuals' privacy rights. Furthermore, the ethical considerations inherent in the use of synthetic images align with principles of respect, fairness, and consent, ensuring that AI technologies uphold the dignity and well-being of individuals. Moving forward, continued research and innovation in synthetic image generation techniques promise to further advance the capabilities of people detection and reidentification models while safeguarding privacy and promoting responsible AI development.