Images and matrices
The most important structure in computer vision is, without doubt, the image. In computer vision, an image is a representation of the physical world captured with a digital device. This picture is only a sequence of numbers stored in a matrix format (refer to the following diagram). Each number is a measurement of the light intensity for the considered wavelength (for example, red, green, or blue in color images) or for a wavelength range (for panchromatic devices). Every point in an image is called a pixel (for picture element), and each pixel can store one or more values depending on the type of image: a black and white image (also referred to as a binary image) stores only one value, such as 0 or 1; a grayscale image also stores one value, but over a wider range of intensity levels; and a color image stores three values. These values are usually integers between 0 and 255, but you can use other ranges, for example 0 to 1 in floating point numbers, as in high dynamic range imaging (HDRI) or thermal images.

The image is stored in a matrix format, where each pixel has a position and can be referenced by its row and column numbers. OpenCV uses the Mat class for this purpose. In the case of a grayscale image, a single matrix is used, as demonstrated in the following diagram:

In the case of a color image, as shown in the following diagram, we use a matrix of width x height x the number of color channels:

But the Mat class is not only for storing images; it also enables you to store matrices of any type and size. You can use it as an algebraic matrix and perform operations with it. In the following sections, we are going to describe the most important matrix operations, such as addition, multiplication, and diagonalization. But, before that, it's important to know how the matrix is stored internally in the computer memory, because reading the memory directly is usually more efficient than accessing each pixel through the OpenCV functions.
In memory, the matrix is saved as an array or sequence of values stored row by row, with the channel values of each pixel placed next to each other. The following table shows the sequence of pixels in BGR image format:

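With this order, we can find the position of any value in the sequence with the following formula, where the column index must also be multiplied by the number of channels because each pixel occupies num_channels consecutive slots:
Index = Row_i*num_cols*num_channels + Col_i*num_channels + channel_i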
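As a minimal sketch of this memory layout, the following C++ code computes the offset of one channel value as row * num_cols * num_channels + col * num_channels + channel. It uses only the standard library and a plain std::vector instead of a Mat; flatIndex and makeBgrImage are hypothetical helper names introduced for illustration, not OpenCV functions. It also assumes that rows are stored back to back with no padding between them, which a real Mat does not always guarantee (Mat exposes isContinuous() and step for that case).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV API): offset of one channel value in a
// row-major, channel-interleaved buffer with no padding between rows.
std::size_t flatIndex(std::size_t row, std::size_t col, std::size_t channel,
                      std::size_t numCols, std::size_t numChannels) {
    // The column and channel must fit inside one row and one pixel.
    assert(col < numCols && channel < numChannels);
    return row * numCols * numChannels + col * numChannels + channel;
}

// Hypothetical helper: builds a rows x cols buffer in BGR order where every
// pixel has the same color, so the interleaved layout is easy to inspect.
std::vector<std::uint8_t> makeBgrImage(std::size_t rows, std::size_t cols,
                                       std::uint8_t b, std::uint8_t g,
                                       std::uint8_t r) {
    const std::size_t numChannels = 3;
    std::vector<std::uint8_t> data(rows * cols * numChannels);
    for (std::size_t i = 0; i < data.size(); i += numChannels) {
        data[i] = b;      // channel 0: blue
        data[i + 1] = g;  // channel 1: green
        data[i + 2] = r;  // channel 2: red
    }
    return data;
}
```

For example, in an image with 4 columns and 3 channels, the green value (channel 1) of the pixel at row 1, column 2 sits at offset 1*4*3 + 2*3 + 1 = 19, so it can be read as data[flatIndex(1, 2, 1, 4, 3)].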