Venue: Conference room, BU, Campus LyonTech La Doua
THESIS
Thesis defence: Gaspard Dussert
The defence will be held in French, before a jury composed of:
François Munoz, PU, University of Lyon 1, Examiner
Alice Caplier, PU, INP Phelma, Reviewer
Marie-Pierre Etienne, Associate Professor, ENSAI, Reviewer
Julien Mairal, DR, Inria, Examiner
Stéphane Dray, DR, CNRS, LBBE, Thesis Director
Vincent Miele, IR, CNRS, LECA, Thesis Co-director
Simon Chamaillé-Jammes, DR, CNRS, CEFE, Thesis Supervisor
Thesis summary:
Large-scale ecosystem monitoring has become a major challenge in the context of the biodiversity crisis, as it is essential to fill critical knowledge gaps that hinder the design of effective management and conservation strategies. To address this, modern ecological monitoring relies on a variety of autonomous sensors to collect data in a continuous and standardized manner. Within this framework, this thesis focuses on camera traps, which have become essential tools for wildlife studies. However, these devices generate massive volumes of images, and manual processing represents a major bottleneck for both research and conservation. Artificial intelligence offers a promising solution by automating the analysis of such data. The objectives of this thesis are to develop and implement new deep learning methods to improve species and behaviour classification, enhance the interpretability of model predictions, and make these advances accessible to the ecological community through open-source tools. The first chapter presents the DeepFaune initiative, a collaborative project aiming to create the first large-scale dataset dedicated to European fauna and to develop efficient detection and classification models that can be easily used on personal computers through dedicated software. The second chapter addresses the problem of confidence score calibration and demonstrates how temporal aggregation techniques and post-processing can improve the reliability of predictions, thereby facilitating their integration into downstream ecological models. The third chapter introduces a new module based on the self-attention mechanism to jointly exploit spatial and temporal information within image sequences, leading to improved classification performance, even in multi-species scenarios. Finally, the last chapter explores the potential of vision-language models for zero-shot animal behaviour prediction, i.e., without fine-tuning and on a task for which they were not explicitly trained.
Results show that the predictions of these vision-language models are sufficiently reliable to estimate ecological indicators such as activity patterns. The methods developed throughout this work have been made directly available through the DeepFaune software, which is now widely adopted across Europe, as well as through publicly available libraries and models. The species classification model has also been incorporated into other popular tools such as AddaxAI and Agouti, thereby facilitating the processing of millions of camera trap images and helping to automate ecological monitoring. This thesis also opens new perspectives by promoting the use of vision-language models to predict ecological attributes that are rarely annotated, while also encouraging the development of vision-only models that leverage sequence information to improve animal detection. Together, these developments can strengthen the versatility and robustness of AI tools, ultimately enhancing their capacity to meet the growing demands of ecological studies.