
Pascal VOC

Because earlier datasets such as MNIST and CIFAR cover only a narrow range of images, they cannot be used for tasks such as people detection or segmentation. Pascal VOC[4] has become one of the major datasets for object recognition and is widely used for such tasks. From 2005 to 2012, annual competitions were held on this dataset, with entrants competing for the best accuracy on the test data. The dataset is usually referred to by year; for example, VOC2012 is the dataset released for the 2012 competition. VOC2012 has three competition categories. The first is classification and detection, which covers 20 object categories with rectangular bounding-box annotations around the objects. The second is segmentation, which provides instance boundaries around objects. The third is action recognition from images.

This dataset can be downloaded from the following link:

http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html

A sample annotation file (in XML format) for one image from this dataset is shown below; each tag names the property that its content describes:

<annotation>
<folder>VOC2012</folder>
<filename>2007_000033.jpg</filename>
<source>
<database>The VOC2007 Database</database>
<annotation>PASCAL VOC2007</annotation>
<image>flickr</image>
</source>
<size>
<width>500</width>
<height>366</height>
<depth>3</depth>
</size>
<segmented>1</segmented>
<object>
<name>aeroplane</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>9</xmin>
<ymin>107</ymin>
<xmax>499</xmax>
<ymax>263</ymax>
</bndbox>
</object>
<object>
<name>aeroplane</name>
<pose>Left</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>421</xmin>
<ymin>200</ymin>
<xmax>482</xmax>
<ymax>226</ymax>
</bndbox>
</object>
<object>
<name>aeroplane</name>
<pose>Left</pose>
<truncated>1</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>325</xmin>
<ymin>188</ymin>
<xmax>411</xmax>
<ymax>223</ymax>
</bndbox>
</object>
</annotation>
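As a minimal sketch of how such an annotation could be read in Python, the following uses the standard library's `xml.etree.ElementTree` to pull out the image size and the bounding box of each object. The function name `parse_voc_annotation`, the `SAMPLE_XML` string (a trimmed copy of the file above, keeping the first and third objects), and the layout of the returned dictionary are assumptions made for illustration; they are not part of the dataset or of any VOC toolkit.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation (first and third objects only).
SAMPLE_XML = """
<annotation>
  <folder>VOC2012</folder>
  <filename>2007_000033.jpg</filename>
  <size>
    <width>500</width>
    <height>366</height>
    <depth>3</depth>
  </size>
  <segmented>1</segmented>
  <object>
    <name>aeroplane</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>9</xmin><ymin>107</ymin><xmax>499</xmax><ymax>263</ymax>
    </bndbox>
  </object>
  <object>
    <name>aeroplane</name>
    <pose>Left</pose>
    <truncated>1</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>325</xmin><ymin>188</ymin><xmax>411</xmax><ymax>223</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return the filename, image size, and objects found in one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")

    objects = []
    # findall("object") only visits direct children of <annotation>.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        bbox = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({
            "name": obj.findtext("name"),
            "pose": obj.findtext("pose"),
            "truncated": obj.findtext("truncated") == "1",
            "difficult": obj.findtext("difficult") == "1",
            "bbox": bbox,  # (xmin, ymin, xmax, ymax) in pixels
        })

    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


annotation = parse_voc_annotation(SAMPLE_XML)
for item in annotation["objects"]:
    print(item["name"], item["bbox"], "truncated" if item["truncated"] else "")
```

To read an annotation from disk instead, `ET.parse(path).getroot()` can take the place of `ET.fromstring(xml_text)`.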

The corresponding image is as shown in the following figure:

The available categories in this dataset are aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, and TV/monitor.
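When training a classifier or detector, the category names usually have to be converted to integer labels. The sketch below lists the 20 class names as they are spelled inside the `<name>` tags of the annotation files (for example, `diningtable` and `tvmonitor` are written without spaces) and builds a lookup table from them. The alphabetical index assignment and the names `VOC_CLASSES`, `CLASS_TO_INDEX`, and `encode_labels` are assumptions chosen for illustration; the dataset itself does not prescribe an index for each class.

```python
# Class names as they appear in the <name> tag of VOC annotation files.
VOC_CLASSES = (
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)

# Hypothetical mapping: index follows the alphabetical order above.
CLASS_TO_INDEX = {name: index for index, name in enumerate(VOC_CLASSES)}


def encode_labels(names):
    """Convert a list of class names to integer labels."""
    return [CLASS_TO_INDEX[name] for name in names]


print(encode_labels(["aeroplane", "person"]))
```

Some detection pipelines reserve index 0 for a background class and shift the object classes by one; whether to do so depends on the model being trained.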

The number of categories is limited, however. The next section introduces a larger dataset with 80 categories. A larger set of generic object categories makes it easier to build applications that work in more general scenarios.
