Vision Transformers for Burned Area Delineation / Rege Cambrin, Daniele; Colomba, Luca; Garza, Paolo. - 3343:(2023). (Paper presented at the conference MACLEAN: MAChine Learning for EArth ObservatioN (workshop @ECML/PKDD2022), held in Grenoble (FR) on 19/09/2022).

Vision Transformers for Burned Area Delineation

Rege Cambrin, Daniele; Colomba, Luca; Garza, Paolo
2023

Abstract

The automatic identification of burned areas is an important task that, in the past, was mainly handled manually or semi-automatically. In recent years, thanks to the availability of novel deep neural network architectures, automatic segmentation solutions have also been proposed in the emergency management domain. The most recent works in burned area delineation leverage Convolutional Neural Networks (CNNs) to automatically identify regions previously affected by forest wildfires. A widely adopted segmentation model, U-Net, has demonstrated good performance on this task, but in some cases it substantially overestimates the burned area, leading to low precision scores. Given the recent advances in the field of NLP and the first successes in the vision domain as well, in this paper we investigate the adoption of vision transformers for semantic segmentation to address the burned area identification task. In particular, we explore the SegFormer architecture with two of its variants: the smallest, MiT-B0, and the intermediate MiT-B3. The experimental results show that SegFormer provides better predictions, with higher precision and F1 score, and also compares favorably with CNNs in terms of the number of parameters.
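
To make the link between overestimation and low precision concrete, the following is a minimal sketch of pixel-wise precision, recall, and F1 on binary burned-area masks. The function name `precision_recall_f1` and the 4x4 toy masks are assumptions made up for illustration; they are not taken from the paper or its data.

```python
def precision_recall_f1(pred, truth):
    """Pixel-wise scores for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # burned pixel correctly predicted
            elif p and not t:
                fp += 1          # unburned pixel predicted as burned
            elif t and not p:
                fn += 1          # burned pixel that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical ground truth: a 2x2 burned patch inside a 4x4 tile.
truth = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]

# Hypothetical prediction that overestimates the burned area.
over = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [0, 1, 1, 0]]

# Hypothetical tighter prediction that misses one burned pixel.
tight = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]

over_scores = precision_recall_f1(over, truth)
tight_scores = precision_recall_f1(tight, truth)
print("overestimating:", over_scores)
print("tight:", tight_scores)
```

In this toy case the overestimating mask recovers every burned pixel (recall 1.0) but only 4 of its 12 predicted pixels are correct, so precision drops to 1/3 and F1 to 0.5. The tighter mask has precision 1.0 and recall 0.75, which gives a higher F1.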

Files in this item:

MACLEAN2022_paper.pdf (restricted access)
Description: Main article
Type: 2. Post-print / Author's Accepted Manuscript
License: Not public - private/restricted access
Size: 1.22 MB
Format: Adobe PDF

MACLEAN2022_paper_final.pdf (open access)
Description: Published version
Type: 2a Post-print publisher's version / Version of Record
License: Creative Commons
Size: 2.03 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2971022