Molinari, Dario; Pasquale, Giulia; Natale, Lorenzo; Caputo, Barbara (2019). Automatic Creation of Large Scale Object Databases from Web Resources: A Case Study in Robot Vision, pp. 488-498. (Paper presented at the International Conference on Image Analysis and Processing.) DOI: 10.1007/978-3-030-30645-8_45.

Automatic Creation of Large Scale Object Databases from Web Resources: A Case Study in Robot Vision

Caputo, Barbara
2019

Abstract

A fundamental ingredient in the success of deep learning for computer and robot vision is the availability of very large-scale annotated databases. ImageNet, with its 1000 object classes and 1.2 million images, remains the dominant data collection for creating pre-trained deep architectures. A less investigated avenue is how the ability to create task-specific data collections on demand, with limited or no manual effort, would affect the performance of convolutional architectures. This would be useful in all those cases where contextual information about the deployment of the deep network is available, and it is particularly relevant for robot vision applications, where such knowledge is usually at hand. The goal of this work is to present a protocol for the automated creation of task-specific datasets starting from a pre-defined list of object classes, exploiting the Web as a source of information in an automated fashion. Our pipeline consists of (a) an algorithm for automatic Web crawling that searches for “image class seeds”, i.e., informative images of the object classes of interest, (b) algorithms for figure-ground segmentation of the object of interest and for pasting the segmented item into contextual images close to where the agent is going to work, and (c) a tailored data augmentation routine that maximizes the informative content of the generated images. A thorough set of experiments on a public benchmark, as well as deployment to a robot platform, demonstrate the value of the proposed approach.
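
As a rough illustration of step (b), the minimal sketch below (Python with Pillow) pastes a segmented object crop onto a contextual background image at a random position and scale. All names here are hypothetical; the RGBA crop, whose alpha channel encodes the figure-ground mask, is assumed to come from an upstream segmentation step. This is not the paper's actual implementation.

# Hypothetical sketch of the paste step: the object crop is an RGBA
# image whose alpha channel holds the figure-ground mask computed
# upstream; the background is a contextual image of the target scene.
import random
from PIL import Image

def paste_object(object_rgba, background, scale_range=(0.4, 0.9)):
    bg = background.convert("RGB")
    # Randomly rescale the object, keeping it smaller than the frame.
    scale = random.uniform(*scale_range)
    w = min(max(int(object_rgba.width * scale), 1), bg.width - 1)
    h = min(max(int(object_rgba.height * scale), 1), bg.height - 1)
    obj = object_rgba.resize((w, h))
    # Random placement that keeps the object fully inside the image.
    x = random.randint(0, bg.width - w)
    y = random.randint(0, bg.height - h)
    bg.paste(obj, (x, y), mask=obj)  # alpha channel acts as the mask
    return bg

# Example usage with hypothetical file names:
# composite = paste_object(Image.open("mug_seed.png"),
#                          Image.open("kitchen_table.jpg"))
# composite.save("synthetic_sample.jpg")

Drawing position and scale at random is one simple way to diversify the synthetic compositions before an augmentation routine such as step (c) is applied.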
ISBN: 978-3-030-30645-8; 978-3-030-30644-1

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2785966