Plane-based 3D Mapping for Structured Indoor Environment

Yuan, Zehui

doi:10.6092/polito/porto/2506288

Three-dimensional (3D) mapping addresses the problem of building a map of an unknown environment explored by a mobile robot. In contrast to 2D maps, 3D maps contain richer information about the visited places. Besides enabling robot navigation in 3D, a 3D map of the robot's surroundings can be of great importance for higher-level robotic tasks, such as scene interpretation and object interaction or manipulation, as well as for visualization purposes in general, which are required in surveillance, urban search and rescue, surveying, and other applications. Hence, the goal of this thesis is to develop a system capable of reconstructing the environment surrounding a mobile robot as a three-dimensional map. The Microsoft Kinect camera is a novel sensor that captures dense depth images along with RGB images at a high frame rate. It has recently come to dominate 3D robotic sensing because it is low-cost and low-power. In this work, it is used as the exteroceptive sensor to obtain 3D point clouds of the surrounding environment, while the robot's wheel odometry is used to initialize the search for correspondences between different observations. Because a single 3D point cloud generated by the Kinect is composed of many tens of thousands of data points, the raw data must be compressed to be processed efficiently. The method chosen in this work is a feature-based representation, which simplifies the 3D mapping procedure. The chosen features are planar surfaces and orthogonal corners. This choice rests on the fact that indoor environments are designed such that walls, floors, pillars, and other major parts of the building structure can be modeled as planar surface patches that are parallel or perpendicular to each other. Orthogonal corners are introduced as higher-level features that are more distinctive in indoor environments.
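To make the corner feature concrete, the following is a minimal sketch of how an orthogonal corner could be derived from three planar features. It assumes each plane is stored as a unit normal n and an offset d with n . p = d, and that a corner is the intersection point of three mutually perpendicular planes; the abstract does not give the thesis's actual corner definition, so the names `is_perpendicular` and `corner_from_planes` and the 5-degree tolerance are illustrative assumptions only.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def is_perpendicular(n1, n2, tol_deg=5.0):
    """True if the unit normals n1 and n2 are within tol_deg of 90 degrees apart."""
    # |cos(angle)| < sin(tol) is equivalent to |angle - 90 deg| < tol.
    return abs(dot(n1, n2)) < math.sin(math.radians(tol_deg))

def corner_from_planes(planes, tol_deg=5.0):
    """Return the corner point of three planes (n, d) with n . p = d, or None.

    Assumes unit normals. For exactly perpendicular planes the intersection
    point is the sum of d_i * n_i; within the tolerance it is an approximation.
    """
    (n1, d1), (n2, d2), (n3, d3) = planes
    pairs = [(n1, n2), (n1, n3), (n2, n3)]
    if not all(is_perpendicular(a, b, tol_deg) for a, b in pairs):
        return None  # planes are not mutually perpendicular: no orthogonal corner
    return tuple(d1 * a + d2 * b + d3 * c for a, b, c in zip(n1, n2, n3))

# Hypothetical example: two walls (x = 2, y = 3) meeting the floor (z = 0).
walls_and_floor = [((1, 0, 0), 2.0), ((0, 1, 0), 3.0), ((0, 0, 1), 0.0)]
print(corner_from_planes(walls_and_floor))  # (2.0, 3.0, 0.0)
```

A corner built this way carries a position and three directions at once, which is one reason such a feature would be more distinctive than a single plane.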
The main idea of this thesis is to obtain spatial constraints between pairs of frames by establishing correspondences between the extracted vertical plane features and corner features. A plane matching algorithm is presented that determines correspondences by maximizing a similarity metric between pairs of planes within a search space. Corner matching builds on the plane matching results. The estimated spatial constraints form the edges of a pose graph; this stage is referred to as the graph-based SLAM front-end. To build a map, however, a robot must also be able to recognize places that it has previously visited. Limitations in sensor processing, coupled with environmental ambiguity, make this difficult. This thesis describes a loop closure detection algorithm that compresses point clouds into viewpoint feature histograms, chosen for their strong recognition ability. The estimated roto-translation between detected loop frames is added to the graph as a newly discovered constraint. Because of estimation errors, the estimated edges form a trajectory that is not globally consistent. A linear pose graph optimization algorithm then estimates the most likely configuration of robot poses given the edges of the graph; this stage is referred to as the SLAM back-end. Finally, the 3D map is obtained by attaching each acquired point cloud to its corresponding pose estimate. The approach is validated through experiments with a mobile robot in an indoor environment.
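The plane matching step described above can be illustrated with a small sketch. It assumes planes are stored as (unit normal, offset) pairs and that the planes of the previous frame have already been transformed into the current frame using the odometry estimate. The similarity metric, the search-space bounds (`max_angle_deg`, `max_offset`), and the greedy assignment are hypothetical stand-ins, since the abstract does not state the metric or search strategy used in the thesis.

```python
import math

def similarity(p, q, max_angle_deg=10.0, max_offset=0.3):
    """Similarity in [0, 1] between two planes (n, d); 0 outside the search space."""
    (n1, d1), (n2, d2) = p, q
    # Clamp the dot product to guard acos against rounding error.
    cos_a = max(-1.0, min(1.0, sum(a * b for a, b in zip(n1, n2))))
    angle = math.degrees(math.acos(cos_a))
    offset = abs(d1 - d2)
    if angle > max_angle_deg or offset > max_offset:
        return 0.0  # outside the search space around the odometry prediction
    return (1.0 - angle / max_angle_deg) * (1.0 - offset / max_offset)

def match_planes(predicted, observed):
    """Greedy one-to-one matching that takes the highest-similarity pairs first."""
    candidates = [
        (similarity(p, q), i, j)
        for i, p in enumerate(predicted)
        for j, q in enumerate(observed)
    ]
    candidates = [c for c in candidates if c[0] > 0.0]
    candidates.sort(reverse=True)
    used_i, used_j, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_i and j not in used_j:
            matches.append((i, j))
            used_i.add(i)
            used_j.add(j)
    return sorted(matches)

# Hypothetical frames: two walls predicted from odometry, three planes observed.
predicted = [((1, 0, 0), 2.0), ((0, 1, 0), 3.0)]
observed = [((0, 1, 0), 3.05), ((1, 0, 0), 2.1), ((0, 0, 1), 0.0)]
print(match_planes(predicted, observed))  # [(0, 1), (1, 0)]
```

Bounding the search by angle and offset is what lets odometry initialize the correspondence search: planes far from their predicted position are never considered, so the floor plane in the example stays unmatched.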

PORTO @ Archivio Istituzionale della Ricerca