
Matrone, Francesca; Martini, Massimo. Fine-tuning and data augmentation techniques for semantic segmentation of heritage point clouds. In: Proceedings of ARQUEOLÓGICA 2.0 - 9th International Congress & 3rd GEORES - GEOmatics and pREServation: Digital Twins for Advanced Cultural Heritage Semantic Digitization (2021), p. 580. Paper presented at the conference held in Valencia (virtual), 26-28 April 2021. DOI: 10.4995/arqueologica9.2021.13259.

Fine-tuning and data augmentation techniques for semantic segmentation of heritage point clouds

Matrone, Francesca
2021

Abstract

The topic of this contribution falls within the broader debate on Digital Humanities. Its core aim is to test an approach that combines geomatics, with its production of three-dimensional data of built cultural heritage (CH), with information technology. In the digital CH domain, the ever-increasing availability of three-dimensional data provides the opportunity to rapidly generate detailed 3D scenes to support restoration and conservation of built heritage. Concurrently, recent research trends in geomatics address the issue of managing these heritage data to enrich the geometric representation of the asset, creating a complete informative data collector. HBIM (Historic Building Information Modeling) constitutes a reference, and such models typically rely on point clouds through scan-to-BIM processes. These processes are still mostly carried out manually by domain experts, making the workflow very time-consuming, leaving the potential of point clouds only partially exploited and discarding a vast amount of data: parametric objects can in fact be described through a few relevant points or contours. The use of Artificial Intelligence algorithms, in particular Deep Learning (DL) techniques, for the automatic recognition of architectural elements from point clouds can therefore provide valuable support through the semantic segmentation task. A proposal to tackle this problem was outlined in previous works, and the methodology proposed here develops their results. Starting from earlier tests with the Dynamic Graph Convolutional Neural Network (DGCNN), close attention is paid to: i) transfer learning techniques; ii) the combination with external classifiers, such as Random Forest (RF); iii) the evaluation of data augmentation techniques on a domain-specific dataset (the ArCH dataset).
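Geometric data augmentation for point clouds, as evaluated in point (iii), typically combines transforms such as rotation and per-point jitter. The following is a minimal sketch under assumed transforms (a random rotation about the vertical axis plus clipped Gaussian noise); the function name and parameters are illustrative, not taken from the paper:

```python
import numpy as np

def augment_point_cloud(points, rng, jitter_sigma=0.01, jitter_clip=0.05):
    """Random rotation about the vertical (z) axis plus small clipped
    Gaussian jitter. `points` is an (N, 3) array of x, y, z coordinates."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    # Rotating only about z keeps walls vertical, a sensible
    # constraint for built-heritage scenes.
    rot_z = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    rotated = points @ rot_z.T
    # Clipped Gaussian jitter perturbs each point independently.
    noise = np.clip(rng.normal(0.0, jitter_sigma, points.shape),
                    -jitter_clip, jitter_clip)
    return rotated + noise

rng = np.random.default_rng(0)
cloud = rng.random((1024, 3))
augmented = augment_point_cloud(cloud, rng)
print(augmented.shape)  # (1024, 3)
```

Applying such transforms on the fly during training multiplies the effective number of scenes without collecting new surveys, which is why it is often tried for under-represented categories.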
In addition, we investigate how to make the whole workflow more functional and user-friendly for external users. Regarding transfer learning, a fine-tuning approach is proposed to assess whether, in the CH domain as well, introducing a new scene into a pre-trained network can improve performance. Indeed, the peculiarities of each heritage scene do not guarantee consistent results, unlike in other domains. Two variants are tested: classic fine-tuning, and fine-tuning with the addition of an RF classifier in the final prediction stage. The choice of adding the RF follows state-of-the-art works in which this classifier achieves excellent results in a short time, even with relatively limited data. In this hybrid approach, the pre-trained network weights are employed as in classic fine-tuning, but the final part of the DGCNN performing point segmentation is removed, so that the network acts as a feature extractor. A scene never seen by the network is then chosen and split into a training part and a test part; the features of both parts are extracted and used as input to train the RF classifier. Tests on data augmentation show that it does not significantly affect overall performance, but still provides useful support for categories with fewer points. The fine-tuning tests, on the other hand, give rise to several considerations. First, standard fine-tuning achieves performance almost equal to that of the DGCNN alone, while considerably improving some categories. This confirms that, once the DNN is pre-trained, data processing and prediction times can be drastically reduced (from ca. 48 h to 0.5 h), for heritage point clouds as well.
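The hybrid feature-extractor pipeline described above can be sketched as follows. The DGCNN backbone is stubbed out with synthetic per-point features (in the real workflow they come from the layer preceding the segmentation head); the dimensions, class count, and split are illustrative assumptions, and scikit-learn's `RandomForestClassifier` stands in for the RF stage:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)

# Stand-in for per-point features from the truncated DGCNN backbone:
# synthetic 64-D features with per-class structure.
n_points, n_features, n_classes = 2000, 64, 4
labels = rng.integers(0, n_classes, n_points)
centers = rng.normal(0.0, 2.0, (n_classes, n_features))
features = centers[labels] + rng.normal(0.0, 1.0, (n_points, n_features))

# Split the "unseen scene" into a training part and a test part,
# then train the RF on the extracted features, as the abstract describes.
split = n_points // 2
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(features[:split], labels[:split])
pred = rf.predict(features[split:])
print(f"accuracy: {accuracy_score(labels[split:], pred):.2f}")
```

Because only the RF is trained on the new scene while the deep backbone stays frozen, the expensive optimization of the network weights is skipped entirely, which is where the large reduction in processing time comes from.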
Finally, performance similar to the reference tests is also obtained using the DGCNN as a feature extractor and the RF as the classifier, demonstrating that the choice of final classifier does not affect the prediction.
ISBN: 978-84-9048-872-0
Files in this item:
File: MAtrone_Martini_Proceedings.pdf (open access)
Description: conference proceedings contribution
Type: 2a Post-print editorial version / Version of Record
License: Creative Commons
Size and format: 892.51 kB, Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11583/2909552