Deep Visual Geo-Localization Benchmark / Berton, Gabriele; Mereu, Riccardo; Trivigno, Gabriele; Masone, Carlo; Csurka, Gabriela; Sattler, Torsten; Caputo, Barbara. - ELETTRONICO. - (2022), pp. 5386-5397. (Intervento presentato al convegno 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) tenutosi a New Orleans (USA) nel 18-24 June 2022) [10.1109/CVPR52688.2022.00532].
Deep Visual Geo-Localization Benchmark
Gabriele Berton; Gabriele Trivigno; Carlo Masone; Barbara Caputo
2022
Abstract
In this paper, we propose a new open-source benchmarking framework for Visual Geo-localization (VG) that allows users to build, train, and test a wide range of commonly used architectures, with the flexibility to change individual components of a geo-localization pipeline. The purpose of this framework is twofold: i) gaining insights into how different components and design choices in a VG pipeline impact the final results, both in terms of performance (recall@N metric) and system requirements (such as execution time and memory consumption); ii) establishing a systematic evaluation protocol for comparing different methods. Using the proposed framework, we perform a large suite of experiments which provide criteria for choosing the backbone, aggregation method, and negative mining strategy depending on the use-case and requirements. We also assess the impact of engineering techniques such as pre/post-processing, data augmentation, and image resizing, showing that better performance can be obtained through fairly simple procedures: for example, downscaling the images' resolution to 80% can lead to similar results with a 36% saving in extraction time and dataset storage requirements. Code and trained models are available at https://deep-vg-bench.herokuapp.com/.
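As a rough illustration of the modular pipeline the abstract describes, the sketch below composes an interchangeable backbone and aggregation layer in PyTorch. This is a minimal sketch, not the benchmark's actual code: the class names (`GeoLocalizationNet`, `GeMPooling`), the ResNet-18/GeM choice, and the 480x640 base resolution are illustrative assumptions.

```python
# Minimal sketch of a modular VG pipeline (backbone + aggregation); names are
# illustrative assumptions, not taken from the paper's released codebase.
import torch
import torch.nn as nn
import torchvision.models as models


class GeMPooling(nn.Module):
    """Generalized-mean (GeM) pooling, one common aggregation choice."""
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))
        self.eps = eps

    def forward(self, x):
        # x: (B, C, H, W) feature map -> (B, C) L2-normalized global descriptor
        x = x.clamp(min=self.eps).pow(self.p)
        x = x.mean(dim=(-2, -1)).pow(1.0 / self.p)
        return nn.functional.normalize(x, dim=-1)


class GeoLocalizationNet(nn.Module):
    """Backbone + aggregation; either component can be swapped independently."""
    def __init__(self, backbone, aggregation):
        super().__init__()
        self.backbone = backbone
        self.aggregation = aggregation

    def forward(self, images):
        return self.aggregation(self.backbone(images))


# Example: ResNet-18 truncated before its global pooling / classifier, plus GeM.
resnet = models.resnet18(weights=None)  # torchvision >= 0.13 API
backbone = nn.Sequential(*list(resnet.children())[:-2])
model = GeoLocalizationNet(backbone, GeMPooling())

# 384x512 corresponds to 80% of a hypothetical 480x640 input resolution.
descriptors = model(torch.rand(2, 3, 384, 512))  # -> (2, 512) descriptors
```

On the resizing figure quoted in the abstract: scaling both image sides to 80% leaves 0.8² = 0.64 of the original pixels, which is consistent with the reported roughly 36% reduction in extraction time and storage.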
| File | Size | Format | |
|---|---|---|---|
| Berton_Deep_Visual_Geo-Localization_Benchmark_CVPR_2022_paper.pdf (open access). Type: 2a Post-print / Version of Record. License: Public - All rights reserved | 1.09 MB | Adobe PDF | View/Open |
| Deep_Visual_Geo-localization_Benchmark.pdf (restricted access). Type: 2a Post-print / Version of Record. License: Non-public - Private/restricted access | 937.01 kB | Adobe PDF | View/Open, Request a copy |
https://hdl.handle.net/11583/2970714