官术网_书友最值得收藏!

What constitutes computer vision?

In order to begin the discussion on computer vision, observe the following image:

Even if we have never done this activity before, we can clearly tell that the image is of people skiing in the snowy mountains on a cloudy day. This information that we perceive is quite complex and can be sub divided into more basic inferences for a computer vision system.

The most basic observation that we can get from an image is of the things or objects in it. In the previous image, the various things that we can see are trees, mountains, snow, sky, people, and so on. Extracting this information is often referred to as image classification, where we would like to label an image with a predefined set of categories. In this case, the labels are the things that we see in the image. 

A wider observation that we can get from the previous image is landscape. We can tell that the image consists of Snow, Mountain, and Sky, as shown in the following image:

Although it is difficult to create exact boundaries for where the Snow, Mountain, and Sky are in the image, we can still identify approximate regions of the image for each of them. This is often termed as segmentation of an image, where we break it up into regions according to object occupancy. 

Making our observation more concrete, we can further identify the exact boundaries of objects in the image, as shown in the following figure:

In the image, we see that people are doing different activities and as such have different shapes; some are sitting, some are standing, some are skiing. Even with this many variations, we can detect objects and can create bounding boxes around them. Only a few bounding boxes are shown in the image for understanding—we can observe much more than these. 

While, in the image, we show rectangular bounding boxes around some objects, we are not categorizing what object is in the box. The next step would be to say the box contains a person. This combined observation of detecting and categorizing the box is often referred to as object detection. 

Extending our observation of people and surroundings, we can say that different people in the image have different heights, even though some are nearer and others are farther from the camera. This is due to our intuitive understanding of image formation and the relations of objects. We know that a tree is usually much taller than a person, even if the trees in the image are shorter than the people nearer to the camera. Extracting the information about geometry in the image is another sub-field of computer vision, often referred to as image reconstruction. 

主站蜘蛛池模板: 东方市| 镇康县| 昌图县| 德令哈市| 龙岩市| 慈溪市| 五大连池市| 民县| 永清县| 宜兰市| 元阳县| 西峡县| 荆门市| 清苑县| 商南县| 广东省| 霍林郭勒市| 历史| 六枝特区| 安吉县| 二手房| 河南省| 大化| 乌拉特中旗| SHOW| 潜山县| 金昌市| 安平县| 武城县| 班玛县| 原平市| 台前县| 巫溪县| 乳源| 涡阳县| 宜丰县| 乐昌市| 洛川县| 垦利县| 罗定市| 金溪县|