6D object position estimation from 2D images: a literature review / Marullo, Giorgia; Tanzi, Leonardo; Piazzolla, Pietro; Vezzetti, Enrico. - In: MULTIMEDIA TOOLS AND APPLICATIONS. - ISSN 1573-7721. - 82:(2023), pp. 24605-24643. [10.1007/s11042-022-14213-z]
6D object position estimation from 2D images: a literature review
Giorgia Marullo;Leonardo Tanzi;Pietro Piazzolla;Enrico Vezzetti
2023
Abstract
Estimating the 6D pose of an object from an image is a central problem in many domains of Computer Vision (CV), and researchers have worked on it for several years. Traditional pose estimation methods either (1) relied on geometric approaches that exploit manually annotated local features, or (2) relied on 2D representations of the object from different points of view and their comparison with the original image. These two families are known as Feature-based and Template-based methods, respectively. With the spread of Deep Learning (DL), new Learning-based strategies have been introduced for 6D pose estimation, improving on traditional methods by using Convolutional Neural Networks (CNNs). This review analyzes techniques from different research fields and classifies them into three main categories: Template-based methods, Feature-based methods, and Learning-based methods. In recent years, research has focused mainly on Learning-based methods, which allow a neural network to be trained for a specific task. For this reason, most of the analyzed methods belong to this category, and they are in turn divided into three sub-categories: Bounding box prediction and Perspective-n-Point (PnP) algorithm-based methods, Classification-based methods, and Regression-based methods. The review aims to provide a general overview of the latest 6D pose recovery methods, underlining their pros and cons and highlighting the best-performing techniques in each group. The main goal is to give readers helpful guidelines for implementing applications that perform well even under challenging conditions such as self-occlusions, symmetries, occlusions between multiple objects, and poor lighting.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
s11042-022-14213-z.pdf | Open access | 2a Post-print, publisher's version / Version of Record | Creative Commons | 3.25 MB | Adobe PDF
https://hdl.handle.net/11583/2975909