Human-made environments tend to be structured, i.e., composed mainly of planar surfaces, most of which are either parallel or perpendicular to each other. Furthermore, these environments often lack texture. Traditional methods for Mapping and Localization rely on point features. However, when applying those methods to structured environments, the density of point features may be insufficient for the accuracy requirements. In this thesis, we propose to exploit line features. These are more common in human-made environments, since they arise from the intersection of planar surfaces. Moreover, compared with planes, they are easier to detect, as line detection is usually associated with edge detection.

Another characteristic of traditional mapping methods is their passive nature. The map is computed from a set of images (sequential or not), without much attention given to the path the camera takes. As a consequence, the mapping method is susceptible to falling into singular configurations. To address this issue, active methods have been used. These consist of a state observer to retrieve 3D data and a control law that optimizes the estimation and keeps the system observable. Thus, the first contribution of this thesis is the development of Active Estimation methods using line features.

The presented Active Estimation methods do not account for model and measurement noise. To tackle this issue, we propose to exploit Moving Horizon Observers (MHOs) to estimate the 3D information of line image features. To the best of our knowledge, this is the first application of MHOs to the mapping of visual features. The proposed method's stability is assessed and compared with that of the previous observers, showing greater robustness to measurement noise.

Both the MHO and the Active Estimation methods consider the estimation of a single line. Thus, mapping multiple lines requires as many observers, leading to at least $4N$ state variables, where $N$ is the number of lines. Since lines are common in structured environments, we can exploit that structure to reduce the state space. This thesis proposes a model of structured environments that reduces the state space to $3N + 3$ and is also less susceptible to singular configurations.
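The $4N$ figure follows from a standard dimension count of line geometry (the count below is illustrative and not the thesis's specific parameterization): a line in 3D space has four degrees of freedom. For instance, in Plücker coordinates,
\[
\mathbf{L} = (\mathbf{d}, \mathbf{m}) \in \mathbb{R}^6, \qquad \mathbf{d}^\top \mathbf{m} = 0, \qquad \|\mathbf{d}\| = 1,
\]
the six coordinates are subject to one orthogonality constraint and one scale normalization, leaving $6 - 1 - 1 = 4$ degrees of freedom. Running an independent observer per line therefore requires at least $4N$ states, whereas a parameterization shared across the structured scene can reduce this to $3N + 3$.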

An assumption the previous methods make is that the camera velocity is available at all times. However, the velocity is usually retrieved from odometry, which is noisy. With this in mind, we propose coupling the camera with an Inertial Measurement Unit (IMU) and designing an observer cascade: a first observer retrieves the scale of the linear velocity, and a second observer, similar to the ones presented so far, estimates the 3D line information.

Even though the focus has been on mapping line features from perspective cameras, we are also interested in the localization problem, particularly the 3D Registration problem. As in Active Estimation, most works focus on point features, which may be hard to obtain from 3D point clouds, especially sparse ones, where point correspondences are not guaranteed to exist. Thus, the final contribution of this thesis is a set of five minimal solvers and a Hybrid RANSAC loop to estimate the relative pose between point clouds using point and plane correspondences and line intersection constraints. The method is shown to achieve state-of-the-art performance on two publicly available datasets.