Monocular Feature-Based Visual Odometry
This is the final project for the UIUC CS543 course. Inspired by this great post, my partner and I decided to work on something related to visual odometry. Visual odometry is essentially another way of describing the camera trajectory; its applications extend across various domains, and it serves as a foundational component of Visual Simultaneous Localization and Mapping (SLAM) systems.
Figure 1 shows the typical pipeline for 2D-2D visual odometry. We conducted experiments at each stage to assess the significance of individual components, except for the optimization stage (future work!).
For those who are not familiar with what visual odometry is or what 2D-2D means, please check my previous article for a detailed introduction.
1 - Demo
2 - Problem Definition
Input
A monocular camera captures a continuous stream of grayscale images. The frames captured at times $t$ and $t+1$ are denoted $I_{t}$ and $I_{t+1}$, respectively. Note that we have prior knowledge of all intrinsic parameters of the camera.
Output
For every pair of frames, find the rotation matrix $\textbf{R}$ and the translation vector $\textbf{t}$, which represent the camera motion between the two frames. Note that $\textbf{t}$ can only be recovered up to a scale factor in the monocular scheme.
3 - Approach Outline
- Receive image sequence $I_{t}, I_{t+1}$.
- Feature detection
- Feature matching or Feature tracking
- Feature matching: detect features in $I_{t}, I_{t+1}$ and match them.
- Feature tracking: detect features in $I_{t}$ and track them into $I_{t+1}$. A new detection is triggered if the number of tracked features drops below a certain threshold.
- Compute the camera pose by motion estimation.
4 - Feature Detection
The feature detection component in monocular visual odometry plays a crucial role in identifying distinctive points that can be re-found across subsequent frames. In this project, we explored several feature detectors: SIFT, ORB, FAST, and BRIEF. Each of these detectors has its own strengths and weaknesses; the experimental results are presented in the results section.
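For reference, here is a minimal sketch of how these detectors can be instantiated in OpenCV. Assumptions: `cv2.SIFT_create` needs OpenCV ≥ 4.4 (older versions ship SIFT in opencv-contrib), BRIEF lives in the opencv-contrib `xfeatures2d` module, and the image path is just a placeholder for a KITTI frame.

```python
import cv2

# Placeholder path; any grayscale KITTI frame works here.
img = cv2.imread("000000.png", cv2.IMREAD_GRAYSCALE)

# Detector only: FAST finds corners but computes no descriptor.
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
kp_fast = fast.detect(img, None)

# Detector + descriptor: ORB and SIFT return keypoints and descriptors.
orb = cv2.ORB_create(nfeatures=3000)
kp_orb, des_orb = orb.detectAndCompute(img, None)

sift = cv2.SIFT_create()
kp_sift, des_sift = sift.detectAndCompute(img, None)

# BRIEF is a descriptor only; it is typically paired with FAST keypoints
# (requires the opencv-contrib package for cv2.xfeatures2d).
brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()
kp_brief, des_brief = brief.compute(img, kp_fast)

print(len(kp_fast), len(kp_orb), len(kp_sift), len(kp_brief))
```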
5 - Feature Matching / Tracking
Both feature tracking and feature matching aim to find correspondences between feature point sets across subsequent frames. With enough corresponding feature pairs, we can compute the essential matrix described in the next section. Both are implemented in this project:
- Feature Matching (opencv tutorial): here we tested two matchers implemented in OpenCV: Brute-Force and FLANN (Fast Library for Approximate Nearest Neighbors). The Brute-Force matcher is straightforward: it takes the descriptor of one feature in the first set and matches it against every feature in the second set using a distance metric. FLANN, by contrast, is designed for fast nearest-neighbor search in large datasets of high-dimensional features. A matching sketch is shown after this list.
- Feature Tracking (opencv tutorial): here we applied the Lucas–Kanade method for optical-flow tracking. It operates under the assumption that the flow remains approximately constant in a local neighborhood around the pixel being considered. A tracking sketch is shown after this list.
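A minimal matching sketch with OpenCV, assuming ORB binary descriptors (with SIFT one would switch to `cv2.NORM_L2` for Brute-Force and the FLANN KD-tree index); the frame paths are placeholders:

```python
import cv2

img1 = cv2.imread("000000.png", cv2.IMREAD_GRAYSCALE)  # placeholder frame I_t
img2 = cv2.imread("000001.png", cv2.IMREAD_GRAYSCALE)  # placeholder frame I_t+1

orb = cv2.ORB_create(nfeatures=3000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-Force: Hamming distance for binary descriptors, plus Lowe's ratio test.
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = bf.knnMatch(des1, des2, k=2)
good = [m[0] for m in matches
        if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]

# FLANN: LSH index parameters for binary descriptors (the same ratio test applies).
index_params = dict(algorithm=6, table_number=6, key_size=12, multi_probe_level=1)
flann = cv2.FlannBasedMatcher(index_params, dict(checks=50))
matches_flann = flann.knnMatch(des1, des2, k=2)

# Matched pixel coordinates that feed the essential-matrix estimation.
pts1 = [kp1[m.queryIdx].pt for m in good]
pts2 = [kp2[m.trainIdx].pt for m in good]
```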
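And a minimal tracking sketch with pyramidal Lucas–Kanade; the window size, pyramid depth, and re-detection threshold are illustrative values, not necessarily the ones used in the project:

```python
import cv2
import numpy as np

img1 = cv2.imread("000000.png", cv2.IMREAD_GRAYSCALE)  # placeholder frame I_t
img2 = cv2.imread("000001.png", cv2.IMREAD_GRAYSCALE)  # placeholder frame I_t+1

# Detect corners in I_t, then track them into I_t+1.
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
kp1 = fast.detect(img1, None)
pts1 = np.float32([kp.pt for kp in kp1]).reshape(-1, 1, 2)

lk_params = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))
pts2, status, err = cv2.calcOpticalFlowPyrLK(img1, img2, pts1, None, **lk_params)

# Keep only the successfully tracked pairs.
good1 = pts1[status.ravel() == 1]
good2 = pts2[status.ravel() == 1]

# Trigger a fresh detection when too few tracks survive.
MIN_FEATURES = 1500  # assumed threshold, not from the original post
if len(good2) < MIN_FEATURES:
    kp_new = fast.detect(img2, None)
```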
6 - Motion Estimation
Motion estimation determines the movement of the camera through a sequence of images. It can be divided into three sub-steps; a combined code sketch follows the list:
- Estimate the essential matrix (detailed post). Several techniques can be used to compute the essential matrix; here we implemented Nistér's 5-point algorithm and the 8-point algorithm, both wrapped in RANSAC.
- Decompose $R, t$ from the essential matrix (detailed post). Writing the SVD of the essential matrix as
$$ E = [\textbf{t}]_{\times}R = U \Sigma V^{T} = [u_{1} \ u_{2} \ u_{3}] \Sigma V^{T} $$
the translation direction and the rotation are recovered as
$$ \textbf{t} = \pm u_{3}, \qquad R = UWV^{T} \ \text{or} \ UW^{T}V^{T}, \qquad W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
The sign of $\textbf{t}$ and the choice between $W$ and $W^{T}$ yield four candidate poses; the correct one places the triangulated points in front of both cameras.
- Recover the visual odometry. Given the relative transformation $R, t$ between two frames, we can accumulate the camera pose of the new frame: $$ R_{i+1} = RR_{i} $$ $$ t_{i+1} = \text{scale} \times Rt + t_{i} $$ where the scale factor cannot be recovered in the monocular scheme; in my implementation, I simply take it from the ground truth that the KITTI dataset provides.
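For reference, a minimal sketch of this stage using OpenCV's built-in RANSAC 5-point solver rather than our own implementation; `cv2.recoverPose` performs the decomposition and the cheirality check internally. The intrinsics shown are those of KITTI sequence 00's left grayscale camera, the accumulation comments mirror the update equations above, and `get_absolute_scale` is a hypothetical helper that would read the ground-truth scale from the KITTI poses:

```python
import cv2
import numpy as np

# Intrinsics of the left grayscale camera for KITTI sequence 00
# (values taken from the sequence's calibration file).
K = np.array([[718.856,   0.0,   607.1928],
              [  0.0,   718.856, 185.2157],
              [  0.0,     0.0,     1.0   ]])

def estimate_motion(pts_prev, pts_cur):
    """Estimate the relative (R, t) between two frames from 2D-2D correspondences."""
    pts_prev = np.float32(pts_prev)
    pts_cur = np.float32(pts_cur)
    # Nister's 5-point algorithm inside a RANSAC loop.
    E, mask = cv2.findEssentialMat(pts_cur, pts_prev, K,
                                   method=cv2.RANSAC, prob=0.999, threshold=1.0)
    # recoverPose decomposes E into the four (R, t) candidates and keeps the one
    # that places the triangulated points in front of both cameras.
    _, R, t, mask = cv2.recoverPose(E, pts_cur, pts_prev, K, mask=mask)
    return R, t

# Accumulated pose, initialised at the origin.
R_cur, t_cur = np.eye(3), np.zeros((3, 1))

# Per-frame update, mirroring the equations above:
#   R, t = estimate_motion(pts_prev, pts_cur)
#   scale = get_absolute_scale(frame_id)   # hypothetical: ground-truth scale from KITTI
#   t_cur = t_cur + scale * (R @ t)        # t_{i+1} = scale * R t + t_i
#   R_cur = R @ R_cur                      # R_{i+1} = R R_i
```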
7 - Dataset
KITTI sequence 00 (link)
8 - Future Work
- Integrate graph optimization
- SuperPoint