OpenCV: Computer Vision Projects with Python
Joseph Howse, Prateek Joshi, Michael Beyeler
Chapter 2. Handling Files, Cameras, and GUIs
This chapter introduces OpenCV's I/O functionality. We also discuss a project concept and the beginnings of an object-oriented design for this project, which we will flesh out in subsequent chapters.
By starting with a look at I/O capabilities and design patterns, we are building our project in the same way we would make a sandwich: from the outside in. Bread slices and spread, or endpoints and glue, come before fillings or algorithms. We choose this approach because computer vision is extroverted (it contemplates the real world outside our computer), and we want to apply all our subsequent algorithmic work to the real world through a common interface.
Note
All the finished code for this chapter can be downloaded from my website: http://nummist.com/opencv/3923_02.zip.
Basic I/O scripts
All CV applications need to get images as input. Most also need to produce images as output. An interactive CV application might require a camera as an input source and a window as an output destination. However, other possible sources and destinations include image files, video files, and raw bytes. For example, raw bytes might be received/sent via a network connection, or they might be generated by an algorithm if we are incorporating procedural graphics into our application. Let's look at each of these possibilities.
Reading/Writing an image file
OpenCV provides the imread() and imwrite() functions that support various file formats for still images. The supported formats vary by system but should always include the BMP format. Typically, PNG, JPEG, and TIFF should be among the supported formats too. Images can be loaded from one file format and saved to another. For example, let's convert an image from PNG to JPEG:
import cv2

image = cv2.imread('MyPic.png')
cv2.imwrite('MyPic.jpg', image)
Note
Most of the OpenCV functionality that we use is in the cv2 module. You might come across other OpenCV guides that instead rely on the cv or cv2.cv modules, which are legacy versions. We do use cv2.cv for certain constants that are not yet redefined in cv2.
By default, imread() returns an image in BGR color format, even if the file uses a grayscale format. BGR (blue-green-red) represents the same color space as RGB (red-green-blue), but the byte order is reversed.
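For instance, if we need to hand an image to a library that expects RGB order, we can reorder the channels with cv2.cvtColor(). Here is a minimal sketch, assuming a file named MyPic.png exists in the working directory:

import cv2

# Load in BGR order (OpenCV's default).
bgrImage = cv2.imread('MyPic.png')

# Reorder the channels to RGB for libraries that expect that order.
rgbImage = cv2.cvtColor(bgrImage, cv2.COLOR_BGR2RGB)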
Optionally, we may specify the mode of imread() to be CV_LOAD_IMAGE_COLOR (BGR), CV_LOAD_IMAGE_GRAYSCALE (grayscale), or CV_LOAD_IMAGE_UNCHANGED (either BGR or grayscale, depending on the file's color space). For example, let's load a PNG as a grayscale image (losing any color information in the process) and then save it as a grayscale PNG image:
import cv2

grayImage = cv2.imread('MyPic.png', cv2.CV_LOAD_IMAGE_GRAYSCALE)
cv2.imwrite('MyPicGray.png', grayImage)
Regardless of the mode, imread() discards any alpha channel (transparency). The imwrite() function requires an image to be in BGR or grayscale format with a number of bits per channel that the output format can support. For example, BMP requires 8 bits per channel, while PNG allows either 8 or 16 bits per channel.
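If we are ever unsure what a loaded image contains, we can inspect the array's shape and data type. A minimal sketch, again assuming MyPic.png exists:

import cv2

image = cv2.imread('MyPic.png', cv2.CV_LOAD_IMAGE_UNCHANGED)

# For an 8-bit BGR image, shape is (height, width, 3) and dtype is uint8.
# For an 8-bit grayscale image, shape is just (height, width).
print image.shape
print image.dtype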
Converting between an image and raw bytes
Conceptually, a byte is an integer ranging from 0 to 255. In real-time graphics applications today, a pixel is typically represented by one byte per channel, though other representations are also possible.
An OpenCV image is a 2D or 3D array of type numpy.array. An 8-bit grayscale image is a 2D array containing byte values. A 24-bit BGR image is a 3D array, also containing byte values. We may access these values by using an expression such as image[0, 0] or image[0, 0, 0]. The first index is the pixel's y coordinate, or row, 0 being the top. The second index is the pixel's x coordinate, or column, 0 being the leftmost. The third index (if applicable) represents a color channel.
For example, in an 8-bit grayscale image with a white pixel in the upper-left corner, image[0, 0] is 255. For a 24-bit BGR image with a blue pixel in the upper-left corner, image[0, 0] is [255, 0, 0].
Note
As an alternative to using an expression such as image[0, 0] or image[0, 0] = 128, we may use an expression such as image.item((0, 0)) or image.itemset((0, 0), 128). The latter expressions are more efficient for single-pixel operations. However, as we will see in subsequent chapters, we usually want to perform operations on large slices of an image rather than on single pixels.
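As a quick illustration of both access styles, here is a minimal sketch that sets and reads a single pixel in a small grayscale image (the array of zeros is just a stand-in for a real image):

import numpy

# A stand-in 3x3 grayscale image, all black.
image = numpy.zeros((3, 3), dtype = numpy.uint8)

# Plain indexing.
image[0, 0] = 128
print image[0, 0] # 128

# The item()/itemset() methods, which are faster for single pixels.
image.itemset((0, 0), 255)
print image.item((0, 0)) # 255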
Provided that an image has 8 bits per channel, we can cast it to a standard Python bytearray, which is one-dimensional:
byteArray = bytearray(image)
Conversely, provided that a bytearray contains bytes in an appropriate order, we can cast and then reshape it to get a numpy.array type that is an image:
grayImage = numpy.array(grayByteArray).reshape(height, width)
bgrImage = numpy.array(bgrByteArray).reshape(height, width, 3)
As a more complete example, let's convert a bytearray containing random bytes to a grayscale image and a BGR image:
import cv2
import numpy
import os

# Make an array of 120,000 random bytes.
randomByteArray = bytearray(os.urandom(120000))
flatNumpyArray = numpy.array(randomByteArray)

# Convert the array to make a 400x300 grayscale image.
grayImage = flatNumpyArray.reshape(300, 400)
cv2.imwrite('RandomGray.png', grayImage)

# Convert the array to make a 400x100 color image.
bgrImage = flatNumpyArray.reshape(100, 400, 3)
cv2.imwrite('RandomColor.png', bgrImage)
After running this script, we should have a pair of randomly generated images, RandomGray.png and RandomColor.png, in the script's directory.
Note
Here, we use Python's standard os.urandom() function to generate random raw bytes, which we then convert to a NumPy array. Note that it is also possible to generate a random NumPy array directly (and more efficiently) using a statement such as numpy.random.randint(0, 256, 120000).reshape(300, 400). The only reason we are using os.urandom() is to help demonstrate conversion from raw bytes.
Reading/Writing a video file
OpenCV provides the VideoCapture and VideoWriter classes that support various video file formats. The supported formats vary by system but should always include AVI. Via its read() method, a VideoCapture object may be polled for new frames until it reaches the end of its video file. Each frame is an image in BGR format. Conversely, an image may be passed to the write() method of a VideoWriter object, which appends the image to the video file. Let's look at an example that reads frames from one AVI file and writes them to another AVI file with YUV encoding:
import cv2

videoCapture = cv2.VideoCapture('MyInputVid.avi')
fps = videoCapture.get(cv2.cv.CV_CAP_PROP_FPS)
size = (int(videoCapture.get(cv2.cv.CV_CAP_PROP_FRAME_WIDTH)),
        int(videoCapture.get(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT)))
videoWriter = cv2.VideoWriter(
    'MyOutputVid.avi', cv2.cv.CV_FOURCC('I','4','2','0'), fps, size)

success, frame = videoCapture.read()
while success: # Loop until there are no more frames.
    videoWriter.write(frame)
    success, frame = videoCapture.read()
The arguments to the VideoWriter class's constructor deserve special attention. The video's filename must be specified. Any preexisting file with this name is overwritten. A video codec must also be specified. The available codecs may vary from system to system. The options include:
- cv2.cv.CV_FOURCC('I','4','2','0'): This is an uncompressed YUV encoding, 4:2:0 chroma subsampled. This encoding is widely compatible but produces large files. The file extension should be .avi.
- cv2.cv.CV_FOURCC('P','I','M','1'): This is MPEG-1. The file extension should be .avi.
- cv2.cv.CV_FOURCC('M','J','P','G'): This is motion-JPEG. The file extension should be .avi.
- cv2.cv.CV_FOURCC('T','H','E','O'): This is Ogg Theora. The file extension should be .ogv.
- cv2.cv.CV_FOURCC('F','L','V','1'): This is Flash video. The file extension should be .flv.
A frame rate and frame size must be specified too. Since we are copying from another video, these properties can be read from the get() method of the VideoCapture class.
Capturing camera frames
A stream of camera frames is represented by the VideoCapture class too. However, for a camera, we construct a VideoCapture object by passing the camera's device index instead of a video's filename. Let's consider an example that captures 10 seconds of video from a camera and writes it to an AVI file:
import cv2

cameraCapture = cv2.VideoCapture(0)
fps = 30 # an assumption
size = (int(cameraCapture.get(cv2.cv.CV_CAP_PROP_FRAME_WIDTH)),
        int(cameraCapture.get(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT)))
videoWriter = cv2.VideoWriter(
    'MyOutputVid.avi', cv2.cv.CV_FOURCC('I','4','2','0'), fps, size)

success, frame = cameraCapture.read()
numFramesRemaining = 10 * fps - 1
while success and numFramesRemaining > 0:
    videoWriter.write(frame)
    success, frame = cameraCapture.read()
    numFramesRemaining -= 1
Unfortunately, the get() method of a VideoCapture object does not return an accurate value for the camera's frame rate; it always returns 0. For the purpose of creating an appropriate VideoWriter object for the camera, we have to either make an assumption about the frame rate (as we did in the preceding code) or measure it using a timer. The latter approach is better, and we will cover it later in this chapter.
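As a taste of the timer-based approach, here is a minimal sketch that estimates a camera's frame rate by timing a fixed number of reads. The 100-frame sample size is an arbitrary assumption:

import cv2
import time

cameraCapture = cv2.VideoCapture(0)

# Warm up the camera with one read, then time a fixed number of frames.
success, frame = cameraCapture.read()
numFramesToMeasure = 100 # an arbitrary sample size
startTime = time.time()
for i in range(numFramesToMeasure):
    success, frame = cameraCapture.read()
timeElapsed = time.time() - startTime
print 'Estimated frame rate:', numFramesToMeasure / timeElapsed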
The number of cameras and their order is, of course, system-dependent. Unfortunately, OpenCV does not provide any means of querying the number of cameras or their properties. If an invalid index is used to construct a VideoCapture object, the VideoCapture object will not yield any frames; its read() method will return (False, None).
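Since there is no query API, one workaround is to probe device indices until read() fails. Here is a minimal sketch, assuming an arbitrary upper bound of 8 cameras:

import cv2

# Probe device indices until one fails to yield a frame.
numCameras = 0
for i in range(8): # an assumed upper bound
    cameraCapture = cv2.VideoCapture(i)
    success, frame = cameraCapture.read()
    cameraCapture.release()
    if not success:
        break
    numCameras += 1
print 'Found', numCameras, 'camera(s)'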
The read() method is inappropriate when we need to synchronize a set of cameras or a multi-head camera (such as a stereo camera or a Kinect). Then, we use the grab() and retrieve() methods instead. For a set of cameras:
success0 = cameraCapture0.grab()
success1 = cameraCapture1.grab()
if success0 and success1:
    _, frame0 = cameraCapture0.retrieve()
    _, frame1 = cameraCapture1.retrieve()
For a multi-head camera, we must specify a head's index as an argument to retrieve():
success = multiHeadCameraCapture.grab()
if success:
    _, frame0 = multiHeadCameraCapture.retrieve(channel = 0)
    _, frame1 = multiHeadCameraCapture.retrieve(channel = 1)
We will study multi-head cameras in more detail in Chapter 5, Detecting Foreground/Background Regions and Depth.
Displaying camera frames in a window
OpenCV allows named windows to be created, redrawn, and destroyed using the namedWindow(), imshow(), and destroyWindow() functions. Also, any window may capture keyboard input via the waitKey() function and mouse input via the setMouseCallback() function. Let's look at an example where we show the frames of a live camera input:
import cv2

clicked = False
def onMouse(event, x, y, flags, param):
    global clicked
    if event == cv2.cv.CV_EVENT_LBUTTONUP:
        clicked = True

cameraCapture = cv2.VideoCapture(0)
cv2.namedWindow('MyWindow')
cv2.setMouseCallback('MyWindow', onMouse)

print 'Showing camera feed. Click window or press any key to stop.'
success, frame = cameraCapture.read()
while success and cv2.waitKey(1) == -1 and not clicked:
    cv2.imshow('MyWindow', frame)
    success, frame = cameraCapture.read()

cv2.destroyWindow('MyWindow')
The argument to waitKey() is a number of milliseconds to wait for keyboard input. The return value is either -1 (meaning that no key has been pressed) or an ASCII keycode, such as 27 for Esc. Note that Python provides a standard function, ord(), which can convert a character to its ASCII keycode. For example, ord('a') returns 97.
Tip
On some systems, waitKey() may return a value that encodes more than just the ASCII keycode. (A bug is known to occur on Linux when OpenCV uses GTK as its backend GUI library.) On all systems, we can ensure that we extract just the ASCII keycode by reading the last byte from the return value, like this:
keycode = cv2.waitKey(1)
if keycode != -1:
    keycode &= 0xFF
OpenCV's window functions and waitKey() are interdependent. OpenCV windows are only updated when waitKey() is called, and waitKey() only captures input when an OpenCV window has focus.
The mouse callback passed to setMouseCallback() should take five arguments, as seen in our code sample. The callback's param argument is set as an optional third argument to setMouseCallback(). By default, it is 0. The callback's event argument is one of the following:
- cv2.cv.CV_EVENT_MOUSEMOVE: Mouse movement
- cv2.cv.CV_EVENT_LBUTTONDOWN: Left button down
- cv2.cv.CV_EVENT_RBUTTONDOWN: Right button down
- cv2.cv.CV_EVENT_MBUTTONDOWN: Middle button down
- cv2.cv.CV_EVENT_LBUTTONUP: Left button up
- cv2.cv.CV_EVENT_RBUTTONUP: Right button up
- cv2.cv.CV_EVENT_MBUTTONUP: Middle button up
- cv2.cv.CV_EVENT_LBUTTONDBLCLK: Left button double-click
- cv2.cv.CV_EVENT_RBUTTONDBLCLK: Right button double-click
- cv2.cv.CV_EVENT_MBUTTONDBLCLK: Middle button double-click
The mouse callback's flags argument may be some bitwise combination of the following (see the sketch after this list for an example of using event and flags together):
- cv2.cv.CV_EVENT_FLAG_LBUTTON: The left button pressed
- cv2.cv.CV_EVENT_FLAG_RBUTTON: The right button pressed
- cv2.cv.CV_EVENT_FLAG_MBUTTON: The middle button pressed
- cv2.cv.CV_EVENT_FLAG_CTRLKEY: The Ctrl key pressed
- cv2.cv.CV_EVENT_FLAG_SHIFTKEY: The Shift key pressed
- cv2.cv.CV_EVENT_FLAG_ALTKEY: The Alt key pressed
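Here is a minimal sketch of a callback that distinguishes a plain left-click from a Ctrl-left-click. The window setup is the same as in our earlier example:

import cv2

def onMouse(event, x, y, flags, param):
    if event == cv2.cv.CV_EVENT_LBUTTONUP:
        if flags & cv2.cv.CV_EVENT_FLAG_CTRLKEY:
            print 'Ctrl-left-click at', (x, y)
        else:
            print 'Left-click at', (x, y)

cv2.namedWindow('MyWindow')
cv2.setMouseCallback('MyWindow', onMouse)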
Unfortunately, OpenCV does not provide any means of handling window events. For example, we cannot stop our application when the window's close button is clicked. Due to OpenCV's limited event handling and GUI capabilities, many developers prefer to integrate it with another application framework. Later in this chapter, we will design an abstraction layer to help integrate OpenCV into any application framework.
Project concept
OpenCV is often studied through a cookbook approach that covers a lot of algorithms but nothing about high-level application development. To an extent, this approach is understandable because OpenCV's potential applications are so diverse. For example, we could use it in a photo/video editor, a motion-controlled game, a robot's AI, or a psychology experiment where we log participants' eye movements. Across such different use cases, can we truly study a useful set of abstractions?
I believe we can, and the sooner we start creating abstractions, the better. We will structure our study of OpenCV around a single application, but, at each step, we will design a component of this application to be extensible and reusable.
We will develop an interactive application that performs face tracking and image manipulations on camera input in real time. This type of application covers a broad range of OpenCV's functionality and challenges us to create an efficient, effective implementation. Users would immediately notice flaws, such as a low frame rate or inaccurate tracking. To get the best results, we will try several approaches using conventional imaging and depth imaging.
Specifically, our application will perform real-time facial merging. Given two streams of camera input (or, optionally, prerecorded video input), the application will superimpose faces from one stream atop faces in the other. Filters and distortions will be applied to give the blended scene a unified look and feel. Users should have the experience of being engaged in a live performance where they enter another environment and another persona. This type of user experience is popular in amusement parks such as Disneyland.
We will call our application Cameo. A cameo is (in jewelry) a small portrait of a person or (in film) a very brief role played by a celebrity.
An object-oriented design
Python applications can be written in a purely procedural style. This is often done with small applications like our basic I/O scripts, discussed previously. However, from now on, we will use an object-oriented style because it promotes modularity and extensibility.
From our overview of OpenCV's I/O functionality, we know that all images are similar, regardless of their source or destination. No matter how we obtain a stream of images or where we send it as output, we can apply the same application-specific logic to each frame in this stream. Separation of I/O code and application code becomes especially convenient in an application like Cameo, which uses multiple I/O streams.
We will create classes called CaptureManager and WindowManager as high-level interfaces to I/O streams. Our application code may use a CaptureManager to read new frames and, optionally, to dispatch each frame to one or more outputs, including a still image file, a video file, and a window (via a WindowManager class). A WindowManager class lets our application code handle a window and events in an object-oriented style.
Both CaptureManager and WindowManager are extensible. We could make implementations that do not rely on OpenCV for I/O. Indeed, Appendix A, Integrating with Pygame, uses a WindowManager subclass.
Abstracting a video stream – managers.CaptureManager
As we have seen, OpenCV can capture, show, and record a stream of images from either a video file or a camera, but there are some special considerations in each case. Our CaptureManager class abstracts some of the differences and provides a higher-level interface for dispatching images from the capture stream to one or more outputs: a still image file, a video file, or a window.
A CaptureManager object is initialized with a VideoCapture object and has enterFrame() and exitFrame() methods that should typically be called on every iteration of an application's main loop. Between a call to enterFrame() and a call to exitFrame(), the application may (any number of times) set a channel property and get a frame property. The channel property is initially 0, and only multi-head cameras use other values. The frame property is an image corresponding to the current channel's state when enterFrame() was called.
A CaptureManager object also has the writeImage(), startWritingVideo(), and stopWritingVideo() methods, which may be called at any time. Actual file writing is postponed until exitFrame(). Also, during the exitFrame() method, the frame property may be shown in a window, depending on whether the application code provides a WindowManager object, either as an argument to the constructor of CaptureManager or by setting the previewWindowManager property.
If the application code manipulates frame, the manipulations are reflected in any recorded files and in the window. A CaptureManager object has a constructor argument and property called shouldMirrorPreview, which should be True if we want frame to be mirrored (horizontally flipped) in the window but not in the recorded files. Typically, when facing a camera, users prefer the live camera feed to be mirrored.
Recall that a VideoWriter object needs a frame rate, but OpenCV does not provide any way to get an accurate frame rate for a camera. The CaptureManager class works around this limitation by using a frame counter and Python's standard time.time() function to estimate the frame rate, if necessary. This approach is not foolproof. Depending on frame rate fluctuations and the system-dependent implementation of time.time(), the accuracy of the estimate might still be poor in some cases. However, if we are deploying to unknown hardware, it is better than just assuming that the user's camera has a particular frame rate.
Let's create a file called managers.py, which will contain our implementation of CaptureManager. The implementation turns out to be quite long, so we will look at it in several pieces. First, let's add the imports, a constructor, and the properties, as follows:
import cv2
import numpy
import time

class CaptureManager(object):

    def __init__(self, capture, previewWindowManager = None,
                 shouldMirrorPreview = False):

        self.previewWindowManager = previewWindowManager
        self.shouldMirrorPreview = shouldMirrorPreview

        self._capture = capture
        self._channel = 0
        self._enteredFrame = False
        self._frame = None
        self._imageFilename = None
        self._videoFilename = None
        self._videoEncoding = None
        self._videoWriter = None

        self._startTime = None
        self._framesElapsed = long(0)
        self._fpsEstimate = None

    @property
    def channel(self):
        return self._channel

    @channel.setter
    def channel(self, value):
        if self._channel != value:
            self._channel = value
            self._frame = None

    @property
    def frame(self):
        if self._enteredFrame and self._frame is None:
            _, self._frame = self._capture.retrieve(channel = self.channel)
        return self._frame

    @property
    def isWritingImage(self):
        return self._imageFilename is not None

    @property
    def isWritingVideo(self):
        return self._videoFilename is not None
Note that most of the member variables are non-public, as denoted by the underscore prefix in variable names such as self._enteredFrame. These non-public variables relate to the state of the current frame and any file-writing operations. As previously discussed, application code only needs to configure a few things, which are implemented as constructor arguments and settable public properties: the camera channel, the window manager, and the option to mirror the camera preview.
Note
By convention, in Python, variables that are prefixed with a single underscore should be treated as protected (accessed only within the class and its subclasses), while variables that are prefixed with a double underscore should be treated as private (accessed only within the class).
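A minimal sketch of these conventions in action (the class and variable names here are purely illustrative):

class Widget(object):
    def __init__(self):
        self._protectedValue = 1 # subclasses may access this by convention
        self.__privateValue = 2 # name-mangled to _Widget__privateValue

w = Widget()
print w._protectedValue # works, but the underscore warns us off
print w._Widget__privateValue # the mangled name of __privateValue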
Continuing with our implementation, let's add the enterFrame() and exitFrame() methods to managers.py:
    def enterFrame(self):
        """Capture the next frame, if any."""

        # But first, check that any previous frame was exited.
        assert not self._enteredFrame, \
            'previous enterFrame() had no matching exitFrame()'

        if self._capture is not None:
            self._enteredFrame = self._capture.grab()

    def exitFrame(self):
        """Draw to the window. Write to files. Release the frame."""

        # Check whether any grabbed frame is retrievable.
        # The getter may retrieve and cache the frame.
        if self.frame is None:
            self._enteredFrame = False
            return

        # Update the FPS estimate and related variables.
        if self._framesElapsed == 0:
            self._startTime = time.time()
        else:
            timeElapsed = time.time() - self._startTime
            self._fpsEstimate = self._framesElapsed / timeElapsed
        self._framesElapsed += 1

        # Draw to the window, if any.
        if self.previewWindowManager is not None:
            if self.shouldMirrorPreview:
                mirroredFrame = numpy.fliplr(self._frame).copy()
                self.previewWindowManager.show(mirroredFrame)
            else:
                self.previewWindowManager.show(self._frame)

        # Write to the image file, if any.
        if self.isWritingImage:
            cv2.imwrite(self._imageFilename, self._frame)
            self._imageFilename = None

        # Write to the video file, if any.
        self._writeVideoFrame()

        # Release the frame.
        self._frame = None
        self._enteredFrame = False
Note that the implementation of enterFrame() only grabs (synchronizes) a frame, whereas actual retrieval from a channel is postponed to a subsequent reading of the frame property. The implementation of exitFrame() takes the image from the current channel, estimates a frame rate, shows the image via the window manager (if any), and fulfills any pending requests to write the image to files.
Several other methods also pertain to file writing. To finish our class implementation, let's add the remaining file-writing methods to managers.py:
    def writeImage(self, filename):
        """Write the next exited frame to an image file."""
        self._imageFilename = filename

    def startWritingVideo(
            self, filename,
            encoding = cv2.cv.CV_FOURCC('I','4','2','0')):
        """Start writing exited frames to a video file."""
        self._videoFilename = filename
        self._videoEncoding = encoding

    def stopWritingVideo(self):
        """Stop writing exited frames to a video file."""
        self._videoFilename = None
        self._videoEncoding = None
        self._videoWriter = None

    def _writeVideoFrame(self):

        if not self.isWritingVideo:
            return

        if self._videoWriter is None:
            fps = self._capture.get(cv2.cv.CV_CAP_PROP_FPS)
            if fps == 0.0:
                # The capture's FPS is unknown so use an estimate.
                if self._framesElapsed < 20:
                    # Wait until more frames elapse so that the
                    # estimate is more stable.
                    return
                else:
                    fps = self._fpsEstimate
            size = (int(self._capture.get(
                        cv2.cv.CV_CAP_PROP_FRAME_WIDTH)),
                    int(self._capture.get(
                        cv2.cv.CV_CAP_PROP_FRAME_HEIGHT)))
            self._videoWriter = cv2.VideoWriter(
                self._videoFilename, self._videoEncoding,
                fps, size)

        self._videoWriter.write(self._frame)
The public methods writeImage(), startWritingVideo(), and stopWritingVideo() simply record the parameters for file-writing operations, whereas the actual writing operations are postponed to the next call of exitFrame(). The non-public method _writeVideoFrame() creates or appends to a video file in a manner that should be familiar from our earlier scripts (see the Reading/Writing a video file section). However, in situations where the frame rate is unknown, we skip some frames at the start of the capture session so that we have time to build up an estimate of the frame rate.
Although our current implementation of CaptureManager relies on VideoCapture, we could make other implementations that do not use OpenCV for input. For example, we could make a subclass that was instantiated with a socket connection, whose byte stream could be parsed as a stream of images. Also, we could make a subclass that used a third-party camera library with different hardware support than what OpenCV provides. However, for Cameo, our current implementation is sufficient.
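As a taste of the socket-based idea, here is a minimal sketch of a capture-like class that mimics the grab()/retrieve() interface that CaptureManager uses, while decoding JPEG frames from a socket. The length-prefixed wire format here is purely our assumption, not a real protocol:

import socket
import struct
import numpy
import cv2

class SocketCapture(object):
    """A hypothetical capture class that reads JPEG frames from a socket."""

    def __init__(self, host, port):
        self._socket = socket.create_connection((host, port))
        self._frameBytes = None

    def grab(self):
        # Read a 4-byte, big-endian length prefix, then that many bytes
        # of JPEG data. (This wire format is our assumption.)
        header = self._socket.recv(4)
        if len(header) < 4:
            return False
        length = struct.unpack('>I', header)[0]
        data = b''
        while len(data) < length:
            chunk = self._socket.recv(length - len(data))
            if not chunk:
                return False
            data += chunk
        self._frameBytes = data
        return True

    def retrieve(self, channel = 0):
        jpegArray = numpy.frombuffer(self._frameBytes, numpy.uint8)
        frame = cv2.imdecode(jpegArray, cv2.CV_LOAD_IMAGE_COLOR)
        return frame is not None, frame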
Abstracting a window and keyboard – managers.WindowManager
As we have seen, OpenCV provides functions to create a window, destroy it, show an image in it, and process events. Rather than being methods of a window class, these functions require a window's name to be passed as an argument. Since this interface is not object-oriented, it is inconsistent with OpenCV's general style. Also, it is unlikely to be compatible with other window or event handling interfaces that we might eventually want to use instead of OpenCV's.
For the sake of object orientation and adaptability, we abstract this functionality into a WindowManager class with the createWindow(), destroyWindow(), show(), and processEvents() methods. As a property, a WindowManager class has a function object called keypressCallback, which (if not None) is called from processEvents() in response to any key press. The keypressCallback object must take a single argument, an ASCII keycode.
Let's add the following implementation of WindowManager to managers.py:
class WindowManager(object):

    def __init__(self, windowName, keypressCallback = None):
        self.keypressCallback = keypressCallback

        self._windowName = windowName
        self._isWindowCreated = False

    @property
    def isWindowCreated(self):
        return self._isWindowCreated

    def createWindow(self):
        cv2.namedWindow(self._windowName)
        self._isWindowCreated = True

    def show(self, frame):
        cv2.imshow(self._windowName, frame)

    def destroyWindow(self):
        cv2.destroyWindow(self._windowName)
        self._isWindowCreated = False

    def processEvents(self):
        keycode = cv2.waitKey(1)
        if self.keypressCallback is not None and keycode != -1:
            # Discard any non-ASCII info encoded by GTK.
            keycode &= 0xFF
            self.keypressCallback(keycode)
Our current implementation only supports keyboard events, which will be sufficient for Cameo. However, we could modify WindowManager to support mouse events too. For example, the class's interface could be expanded to include a mouseCallback property (and an optional constructor argument) but could otherwise remain the same; a sketch of this extension follows. With some event framework other than OpenCV's, we could support additional event types in the same way, by adding callback properties.
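Here is a minimal sketch of such a subclass, under the assumption that the mouse callback keeps OpenCV's five-argument signature:

class MouseAwareWindowManager(WindowManager):
    """A hypothetical WindowManager subclass with mouse support."""

    def __init__(self, windowName, keypressCallback = None,
                 mouseCallback = None):
        WindowManager.__init__(self, windowName, keypressCallback)
        self.mouseCallback = mouseCallback

    def createWindow(self):
        WindowManager.createWindow(self)
        if self.mouseCallback is not None:
            # Forward OpenCV's five-argument mouse events to the callback.
            cv2.setMouseCallback(self._windowName, self.mouseCallback)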
Appendix A, Integrating with Pygame, shows a WindowManager subclass that is implemented with Pygame's window handling and event framework instead of OpenCV's. This implementation improves on the base WindowManager class by properly handling quit events, for example, when the user clicks on the window's close button. Potentially, many other event types can be handled via Pygame too.
Applying everything – cameo.Cameo
Our application is represented by a class, Cameo, with two methods: run() and onKeypress(). On initialization, a Cameo object creates a WindowManager object with onKeypress() as a callback, as well as a CaptureManager object using a camera and the WindowManager object. When run() is called, the application executes a main loop in which frames and events are processed. As a result of event processing, onKeypress() may be called. The Space bar causes a screenshot to be taken, Tab causes a screencast (a video recording) to start/stop, and Esc causes the application to quit.
In the same directory as managers.py, let's create a file called cameo.py containing the following implementation of Cameo:
import cv2
from managers import WindowManager, CaptureManager

class Cameo(object):

    def __init__(self):
        self._windowManager = WindowManager('Cameo',
                                            self.onKeypress)
        self._captureManager = CaptureManager(
            cv2.VideoCapture(0), self._windowManager, True)

    def run(self):
        """Run the main loop."""
        self._windowManager.createWindow()
        while self._windowManager.isWindowCreated:
            self._captureManager.enterFrame()
            frame = self._captureManager.frame

            # TODO: Filter the frame (Chapter 3).

            self._captureManager.exitFrame()
            self._windowManager.processEvents()

    def onKeypress(self, keycode):
        """Handle a keypress.

        space  -> Take a screenshot.
        tab    -> Start/stop recording a screencast.
        escape -> Quit.

        """
        if keycode == 32: # space
            self._captureManager.writeImage('screenshot.png')
        elif keycode == 9: # tab
            if not self._captureManager.isWritingVideo:
                self._captureManager.startWritingVideo(
                    'screencast.avi')
            else:
                self._captureManager.stopWritingVideo()
        elif keycode == 27: # escape
            self._windowManager.destroyWindow()

if __name__ == "__main__":
    Cameo().run()
When running the application, note that the live camera feed is mirrored, while screenshots and screencasts are not. This is the intended behavior, as we pass True for shouldMirrorPreview when initializing the CaptureManager class.
So far, we do not manipulate the frames in any way except to mirror them for preview. We will start to add more interesting effects in Chapter 3, Filtering Images.
Summary
By now, we should have an application that displays a camera feed, listens for keyboard input, and (on command) records a screenshot or screencast. We are ready to extend the application by inserting some image-filtering code (Chapter 3, Filtering Images) between the start and end of each frame. Optionally, we are also ready to integrate other camera drivers or other application frameworks (Appendix A, Integrating with Pygame), besides the ones supported by OpenCV.