
Exploring Structure from Motion Using OpenCV

In this chapter, we will discuss the notion of Structure from Motion (SfM), or, better put, extracting geometric structures from images taken with a camera under motion, using OpenCV's API to help us. First, let's constrain the otherwise very broad approach to SfM by using a single camera, usually called a monocular approach, and a discrete and sparse set of frames rather than a continuous video stream. These two constraints will greatly simplify the system we will sketch out in the coming pages and help us understand the fundamentals of any SfM method. To implement our method, we will follow in the footsteps of Hartley and Zisserman (hereafter referred to as H&Z, for brevity), as documented in Chapters 9 through 12 of their seminal book Multiple View Geometry in Computer Vision.

In this chapter, we will cover the following:

  • Structure from Motion concepts
  • Estimating the camera motion from a pair of images
  • Reconstructing the scene
  • Reconstructing from many views
  • Refining the reconstruction

Throughout the chapter, we assume the use of a camera that was calibrated beforehand. Calibration is a ubiquitous operation in Computer Vision, fully supported in OpenCV via command-line tools, and was discussed in previous chapters. We therefore assume the existence of the camera's intrinsic parameters, embodied in the K matrix and the distortion coefficients vector, which are the outputs of the calibration process.
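
The following is a minimal sketch of reading these calibration outputs back into a program with cv::FileStorage; the file name and key names here are assumptions, so adjust them to match whatever your calibration step actually wrote:

    #include <opencv2/core.hpp>

    int main() {
        // Hypothetical file and key names; match them to your calibration output.
        cv::FileStorage fs("camera_calibration.yml", cv::FileStorage::READ);
        cv::Mat K, distCoeffs;
        fs["camera_matrix"] >> K;                    // 3x3 intrinsic matrix K
        fs["distortion_coefficients"] >> distCoeffs; // for example k1, k2, p1, p2, k3
        fs.release();

        CV_Assert(K.rows == 3 && K.cols == 3);
        return 0;
    }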

To make things clear in terms of language, from this point on, we will refer to a camera as a single view of the scene rather than to the optics and hardware taking the image. A camera has a 3D position in space (translation) and a 3D direction of view (orientation). Together, we describe these as the six degrees of freedom (6 DOF) camera pose, sometimes referred to as the extrinsic parameters. Between two cameras, therefore, there is a 3D translation element (movement through space) and a 3D rotation of the direction of view.
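
As a quick sketch of how such a pose can be carried in code, the rotation and translation are commonly packed into a single 3 x 4 matrix, [R|t]; the identity pose below is only a placeholder for the poses we will recover later in the chapter:

    #include <opencv2/core.hpp>

    int main() {
        cv::Matx33d R = cv::Matx33d::eye(); // 3D direction of view (rotation)
        cv::Vec3d   t(0.0, 0.0, 0.0);       // 3D position in space (translation)

        // Pack the 6 DOF pose into a 3x4 matrix P = [R|t].
        cv::Matx34d P(R(0,0), R(0,1), R(0,2), t(0),
                      R(1,0), R(1,1), R(1,2), t(1),
                      R(2,0), R(2,1), R(2,2), t(2));
        (void)P; // placeholder pose, unused in this snippet
        return 0;
    }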

We will also unify the terms scene point, world point, real point, and 3D point to mean the same thing: a point that exists in our real world. The same goes for image points or 2D points, which are locations, in image coordinates, where some real 3D point was projected onto the camera sensor at that location and time.
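
To make the projection relationship concrete, the following sketch maps an assumed 3D world point to its 2D image point with cv::projectPoints; all numeric values (intrinsics, pose, and the point itself) are placeholders:

    #include <opencv2/calib3d.hpp>
    #include <vector>

    int main() {
        std::vector<cv::Point3d> worldPoints{ {0.5, -0.2, 4.0} }; // a 3D scene point
        std::vector<cv::Point2d> imagePoints;                     // its 2D projection

        cv::Matx33d K(800,   0, 320,
                        0, 800, 240,
                        0,   0,   1);             // assumed intrinsic matrix
        cv::Vec3d rvec(0, 0, 0), tvec(0, 0, 0);   // identity camera pose

        // Project the 3D point through the pose and intrinsics
        // (no lens distortion assumed here).
        cv::projectPoints(worldPoints, rvec, tvec, K, cv::noArray(), imagePoints);
        // imagePoints[0] now holds the pixel where the 3D point lands on the sensor.
        return 0;
    }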

In the chapter's code sections, you will notice references to Multiple View Geometry in Computer Vision, for example, // HZ 9.12, which refers to equation 12 of Chapter 9 of the book. Also, the text includes excerpts of code only; the complete runnable code is included in the material accompanying the book.

The SfM pipeline we will implement proceeds as follows. We begin by triangulating an initial reconstructed point cloud of the scene, using 2D features matched across the image set and the calculated poses of two cameras. We then add more views to the reconstruction by matching further points against the forming point cloud, calculating the new camera poses, and triangulating their matched points. In between, we also perform bundle adjustment to minimize the error in the reconstruction. All of these steps are detailed in the next sections of this chapter, with relevant code excerpts, pointers to useful OpenCV functions, and the mathematical reasoning behind them.
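
As a preview of the pipeline just described, the following condensed sketch strings together the main OpenCV calls behind the initial two-view stage (these calib3d functions exist in OpenCV 3 and later); the matched point lists and the intrinsic matrix are assumed inputs, and each step is developed properly in the coming sections:

    #include <opencv2/calib3d.hpp>
    #include <vector>

    // pts1/pts2: matched 2D features from two views; K: 3x3 CV_64F intrinsics.
    void initializeReconstruction(const std::vector<cv::Point2f>& pts1,
                                  const std::vector<cv::Point2f>& pts2,
                                  const cv::Mat& K) {
        // Recover the relative pose of the second camera from the matches.
        cv::Mat E = cv::findEssentialMat(pts1, pts2, K, cv::RANSAC);
        cv::Mat R, t;
        cv::recoverPose(E, pts1, pts2, K, R, t);

        // Build the two 3x4 projection matrices P = K[R|t]; the first
        // camera is placed at the origin by convention.
        cv::Mat P1 = K * cv::Mat::eye(3, 4, CV_64F);
        cv::Mat Rt;
        cv::hconcat(R, t, Rt);
        cv::Mat P2 = K * Rt;

        // Triangulate the matches into an initial homogeneous point cloud;
        // divide each column of points4D by its fourth coordinate to get
        // Euclidean scene points.
        cv::Mat points4D;
        cv::triangulatePoints(P1, P2, pts1, pts2, points4D);
    }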
