Generative AI Pipeline with Model-Guided Filtering for Sim-to-Real Transfer in Surgical Imaging / Leoncini, Pietro; Marzola, Francesco; Pescio, Matteo; Muratore, Luigi; Revello, Lorenzo; Barontini, Federica; Distefano, Giovanni; Hayashi, Kengo; Ammirati, Carlo Alberto; Arezzo, Alberto; Dagnino, Giulio. - In: COMPUTERIZED MEDICAL IMAGING AND GRAPHICS. - ISSN 0895-6111. - 132:(2026). [10.1016/j.compmedimag.2026.102775]

Generative AI Pipeline with Model-Guided Filtering for Sim-to-Real Transfer in Surgical Imaging

Leoncini, Pietro; Pescio, Matteo; Distefano, Giovanni
2026

Abstract

Automating surgical suturing requires reliable computer vision systems, yet annotated real surgical datasets remain scarce, costly, and difficult to obtain. To address this challenge, we introduce a data-centric pipeline that combines synthetic data generation, generative realism boosting, and model-guided filtering to improve sim-to-real transfer without relying on real annotated surgical footage. Synthetic images were created in Unity with both type-based and part-based instrument annotations, then enhanced using CycleGAN-TURBO for unpaired image-to-image translation and Real-ESRGAN for high-resolution restoration. A YOLO-based selector model, trained on synthetic images, assessed the quality of the generatively enhanced data through Dice similarity scoring, discarding samples with distortions or misalignments. In the part-based configuration, evaluated on a real test set, the baseline model trained solely on synthetic images achieved a Dice score of 0.17, while combining synthetic with unfiltered enhanced data reached 0.24. Filtering proved decisive: accepted enhanced images combined with synthetic images (the hybrid curated dataset) further boosted the score to 0.44. Fine-tuning strategies yielded only marginal gains, confirming that improvements were driven primarily by data quality rather than training variations. In the type-based setup, the hybrid curated dataset achieved a mean Dice score of 0.65, a substantial improvement over previous fully synthetic baselines (0.384), without requiring real training annotations. These results demonstrate that curation of generative outputs is critical for sim-to-real transfer in surgical vision. By uniting synthetic generation, generative realism, and automated filtering, this pipeline enables scalable, low-cost dataset creation, providing resources on GitHub and a reproducible foundation for developing reliable perception systems and advancing autonomy in surgical robotics.
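
The filtering step described in the abstract can be illustrated with a minimal sketch. It assumes binary instrument masks, an acceptance threshold of 0.8, and the helper names `dice_score` and `filter_enhanced`; none of these are taken from the paper, which does not state its threshold or mask format here. The idea shown is only that the selector's predicted mask on an enhanced image is compared with the mask inherited from the original synthetic image, and low-scoring samples are discarded.

```python
import numpy as np


def dice_score(pred_mask, ref_mask):
    """Dice similarity between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Both masks empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    intersection = np.logical_and(pred, ref).sum()
    return float(2.0 * intersection / total)


def filter_enhanced(samples, threshold=0.8):
    """Split enhanced samples into accepted and rejected by Dice score.

    `samples` is an iterable of (name, predicted_mask, reference_mask) tuples,
    where predicted_mask is the selector's output on the enhanced image and
    reference_mask is the annotation from the source synthetic image.
    The threshold value is a hypothetical choice, not the paper's.
    """
    accepted, rejected = [], []
    for name, pred, ref in samples:
        if dice_score(pred, ref) >= threshold:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected
```

In such a setup, the accepted images would be merged with the original synthetic images to form the hybrid curated dataset, while rejected images (for example, those where translation distorted or shifted the instrument) would be left out of training.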

Files in this item:

1-s2.0-S0895611126000789-main.pdf

Open access

Type: 2a Post-print, publisher's version / Version of Record
License: Creative Commons
Size: 2.77 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3011040