Refinement of the reconstruction
One of the most important parts of an SfM method is refining and optimizing the reconstructed scene, a process known as Bundle Adjustment (BA). This is an optimization step in which all the data we gathered is fitted to a single consistent model: both the positions of the recovered 3D points and the poses of the cameras are adjusted so that re-projection errors are minimized. In other words, a recovered 3D point, when re-projected onto an image, is expected to lie close to the 2D feature point that generated it. The BA process we use will try to minimize this error for all 3D points and cameras together, making for a very large nonlinear least-squares problem with on the order of thousands of parameters.
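To make this objective concrete, the BA cost can be written in its standard form (a sketch; here π(C_j, X_i) denotes the projection of the 3D point X_i through camera C_j, and x_ij is the matching 2D feature observation):

\min_{\{C_j\},\,\{X_i\}} \; \sum_{i,j} \left\| \pi(C_j, X_i) - x_{ij} \right\|^2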
We will implement a BA algorithm using the Ceres library, a well-known optimization package from Google. Ceres has built-in tools to help with BA, such as automatic differentiation and many flavors of linear and nonlinear optimization schemes, which result in less code and more flexibility.
To make things simple and easy to implement, we will make a few assumptions that a real SfM system could not neglect. Firstly, we will assume a simple intrinsic model for our cameras: the focal length in x and y is the same, and the center of projection lies exactly at the middle of the image. We further assume that all cameras share the same intrinsic parameters, meaning the same camera takes all the images in the bundle with the exact same configuration (for example, zoom). These assumptions greatly reduce the number of parameters to optimize, which in turn makes the optimization not only easier to code but also faster to converge.
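As a small illustration of what this simplified model looks like (a sketch, not taken from the book's code; the function name and types are assumptions), the camera matrix collapses to a single unknown focal length with the principal point fixed at the image center:

#include <opencv2/core.hpp>

// Sketch: the simplified shared intrinsics reduce to one unknown focal
// length, with the principal point fixed at the image center.
cv::Matx33f buildSimpleIntrinsics(float focalLength, const cv::Size& imageSize) {
    return cv::Matx33f(focalLength, 0.0f,        imageSize.width  * 0.5f,
                       0.0f,        focalLength, imageSize.height * 0.5f,
                       0.0f,        0.0f,        1.0f);
}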
To start, we will model the error function, sometimes also called the cost function, which is, simply put, how the optimization measures how good a set of parameters is and in which direction to move to find better ones. We can write the following functor that makes use of Ceres' automatic differentiation mechanism:
#include <ceres/ceres.h>
#include <ceres/rotation.h>

using namespace ceres;

// The pinhole camera is parameterized using 7 parameters:
// 3 for rotation, 3 for translation, 1 for focal length.
// The principal point is not modeled (assumed to be located at the
// image center, and already subtracted from 'observed'),
// and focal_x = focal_y.
struct SimpleReprojectionError {
    SimpleReprojectionError(double observed_x, double observed_y) :
            observed_x(observed_x), observed_y(observed_y) {
    }
    template <typename T>
    bool operator()(const T* const camera,
                    const T* const point,
                    const T* const focal,
                    T* residuals) const {
        T p[3];
        // Rotate: camera[0, 1, 2] are the angle-axis rotation.
        AngleAxisRotatePoint(camera, point, p);

        // Translate: camera[3, 4, 5] are the translation.
        p[0] += camera[3];
        p[1] += camera[4];
        p[2] += camera[5];

        // Perspective divide
        const T xp = p[0] / p[2];
        const T yp = p[1] / p[2];

        // Compute projected point position (sans center of projection)
        const T predicted_x = *focal * xp;
        const T predicted_y = *focal * yp;

        // The error is the difference between the predicted
        // and observed position.
        residuals[0] = predicted_x - T(observed_x);
        residuals[1] = predicted_y - T(observed_y);
        return true;
    }
    // A helper construction function: 2 residuals, 6 camera parameters,
    // 3 point parameters, 1 focal parameter.
    static CostFunction* Create(const double observed_x,
                                const double observed_y) {
        return (new AutoDiffCostFunction<SimpleReprojectionError, 2, 6, 3, 1>(
                new SimpleReprojectionError(observed_x, observed_y)));
    }
    double observed_x;
    double observed_y;
};
This functor calculates the deviation a 3D point has from its originating 2D point by re-projecting it using simplified extrinsic and intrinsic camera parameters. The error in x and y is saved as the residual, which guides the optimization.
There is quite a bit of additional code that goes into the BA implementation, but it primarily handles the bookkeeping of 3D cloud points, their originating 2D points, and their respective cameras. Readers may wish to review how this is done in the code attached to the book.
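To give a flavor of that bookkeeping, here is a minimal sketch, not the book's actual implementation, of how the functor above would typically be wired into a ceres::Problem: one residual block per 2D observation, linking its camera pose, its 3D point, and the shared focal length (the Observation struct and variable names are illustrative assumptions):

#include <array>
#include <iostream>
#include <vector>
#include <ceres/ceres.h>

// One 2D feature observation: which camera saw which 3D point, and where.
// The 2D coordinates are assumed to be already centered around the principal point.
struct Observation {
    size_t cameraIdx;
    size_t pointIdx;
    double x, y;
};

void adjustBundle(std::vector<std::array<double, 6>>& cameraPoses6d, // angle-axis + translation
                  std::vector<std::array<double, 3>>& points3d,
                  double& focalLength,
                  const std::vector<Observation>& observations)
{
    ceres::Problem problem;
    for (const Observation& obs : observations) {
        // Each observation contributes one residual block tying together
        // its camera pose, its 3D point, and the shared focal length.
        ceres::CostFunction* cost = SimpleReprojectionError::Create(obs.x, obs.y);
        problem.AddResidualBlock(cost, nullptr /* squared loss */,
                                 cameraPoses6d[obs.cameraIdx].data(),
                                 points3d[obs.pointIdx].data(),
                                 &focalLength);
    }

    ceres::Solver::Options options;
    options.linear_solver_type = ceres::DENSE_SCHUR; // exploits the BA problem structure
    options.minimizer_progress_to_stdout = true;

    ceres::Solver::Summary summary;
    ceres::Solve(options, &problem, &summary);
    std::cout << summary.BriefReport() << std::endl;
}

Note that the layout of the camera parameter block (three angle-axis rotation values followed by three translation values) and the sizes of the parameter blocks passed to AddResidualBlock must match the 2, 6, 3, 1 template arguments of the cost function exactly.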
The following image shows the effects of BA. The two images on the left show the point cloud before adjustment from two perspectives, and the images on the right show the optimized cloud. The change is quite dramatic, and misalignments between points triangulated from different views are now mostly resolved. We can also notice how the adjustment produced a far better reconstruction of flat surfaces:
