Promptable image segmentation: a survey of guided input techniques / Nejabat, Hadi; D'Asaro, Federico; Pecora, Alessandro Emmanuel; Monopoli, Tommaso; Bottino, Andrea. - In: FOUNDATIONS AND TRENDS IN COMPUTER GRAPHICS AND VISION. - ISSN 1572-2740. - ELETTRONICO. - 18:1(2026), pp. 1-139. [10.1108/FTCGV-03-2026-001]
Promptable image segmentation: a survey of guided input techniques
Hadi Nejabat; Federico D'Asaro; Alessandro Emmanuel Pecora; Andrea Bottino
2026
Abstract
Prompt-based image segmentation has transformed computer vision by making segmentation more adaptive and efficient. In this context, the term prompt broadly refers to any auxiliary input, such as clicks, boxes, scribbles, support sets, or free-form text, that guides a model's segmentation behavior. These inputs act as task-specific signals that let models adapt to different contexts and objectives. This survey categorizes promptable image segmentation into five primary areas: interactive segmentation, referring segmentation, few-shot semantic segmentation, open-vocabulary segmentation, and foundation models. The authors explore how different prompting strategies improve segmentation performance while enabling few-shot learning and reducing reliance on extensive labeled datasets. The discussion highlights the role of foundation models in advancing segmentation capabilities by integrating separate components of these complex models and leveraging multimodal interactions. By synthesizing state-of-the-art techniques, this study provides a structured taxonomy, identifies key challenges in multimodal fusion and generalization, and outlines future directions for developing more intelligent and adaptable segmentation systems.
| File | Type | License | Size | Format |
|---|---|---|---|---|
| SEG_paper_NOWP_Manuscript-2.pdf (restricted access) | 2. Post-print / Author's Accepted Manuscript | Not public - private/restricted access | 15.2 MB | Adobe PDF |
| SEG_paper_openaccess.pdf (open access) | 1. Preprint / submitted version (pre-review) | Public - All rights reserved | 14.96 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/3009210
