The automatic identification of burned areas is an important task that, in the past, was mainly carried out manually or semi-automatically. In recent years, thanks to the availability of novel deep neural network architectures, automatic segmentation solutions have also been proposed in the emergency management domain. The most recent works on burned area delineation leverage Convolutional Neural Networks (CNNs) to automatically identify regions previously affected by forest wildfires. A widely adopted segmentation model, U-Net, has demonstrated good performance on this task, but in some cases it substantially overestimates burned areas, leading to low precision scores. Given the recent advances in the field of NLP and the first successes in the vision domain, in this paper we investigate the adoption of vision transformers for semantic segmentation to address the burned area identification task. In particular, we explore the SegFormer architecture with two of its variants: the smallest, MiT-B0, and the intermediate one, MiT-B3. The experimental results show that SegFormer provides better predictions than CNNs, with higher precision and F1 score, and also compares favorably with CNNs in terms of the number of parameters.
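To make the link between overestimation and low precision concrete, the following is a minimal sketch of how precision, recall, and F1 score can be computed for a binary burned-area mask. The mask encoding (1 = burned, 0 = unburned), the function name `precision_recall_f1`, and the toy 4x4 masks are assumptions made for illustration only; they are not taken from the paper and do not reproduce its evaluation code or results.

```python
from typing import Sequence, Tuple

# Illustrative encoding (assumption): 1 = burned pixel, 0 = unburned pixel.
Mask = Sequence[Sequence[int]]


def precision_recall_f1(pred: Mask, truth: Mask) -> Tuple[float, float, float]:
    """Pixel-wise precision, recall, and F1 for the 'burned' class."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # burned pixel correctly predicted as burned
            elif p and not t:
                fp += 1  # unburned pixel predicted as burned (overestimation)
            elif t and not p:
                fn += 1  # burned pixel that the prediction missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy ground truth: a 2x2 burned patch in the middle of a 4x4 tile.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Toy prediction that covers the whole patch but also spills outside it.
overestimated = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

precision, recall, f1 = precision_recall_f1(overestimated, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case every burned pixel is found, so recall is 1.0, but half of the predicted pixels are false positives, so precision falls to 0.5 and pulls the F1 score down with it. This is the kind of behaviour the abstract refers to when it says that overestimation leads to low precision scores.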
Vision Transformers for Burned Area Delineation / Rege Cambrin, Daniele; Colomba, Luca; Garza, Paolo. - 3343:(2023). (Paper presented at MACLEAN: MAChine Learning for EArth ObservatioN (workshop @ECML/PKDD2022), held in Grenoble (FR) on 19/09/2022).
Vision Transformers for Burned Area Delineation
Rege Cambrin, Daniele; Colomba, Luca; Garza, Paolo
2023
File | Access | Description | Type | License | Size | Format
---|---|---|---|---|---|---
MACLEAN2022_paper.pdf | not available | Main article | 2. Post-print / Author's Accepted Manuscript | Not public - Private/restricted access | 1.22 MB | Adobe PDF
MACLEAN2022_paper_final.pdf | open access | Published version | 2a Post-print, publisher's version / Version of Record | Creative Commons | 2.03 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2971022