- Mastering Concurrency in Python
- Quan Nguyen
- 2021-06-10 19:24:05
Starting from managing files
As an experienced Python user, you have probably seen the with statement being used to open and read external files inside Python programs. Looking at this problem at a lower level, the operation of opening an external file in Python will consume a resource—in this case, a file descriptor—and your operating system will set a limit on this resource. This means that there is an upper limit on how many files a single process running on your system can open simultaneously.
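As a minimal sketch of how this limit can be inspected from inside Python, the snippet below queries the per-process limit on open file descriptors. It assumes a UNIX-like system, because the standard-library resource module is not available on Windows; the helper name open_file_limits is hypothetical and used only for illustration.

```python
import resource  # standard library, available on UNIX-like systems only


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    # RLIMIT_NOFILE is the resource that caps how many descriptors a process may hold.
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# The soft limit is the one enforced in practice; the values differ between systems.
print(f"soft limit: {soft}, hard limit: {hard}")
```

The soft limit is the value that a running program actually hits, and it can be raised up to the hard limit without special privileges.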
Let's consider a quick example to illustrate this point, starting with the Chapter04/example1.py file, as shown in the following code:
# Chapter04/example1.py
n_files = 10
files = []
for i in range(n_files):
    files.append(open('output1/sample%i.txt' % i, 'w'))
This quick program simply creates 10 text files inside the output1 folder (which must already exist): sample0.txt, sample1.txt, ..., sample9.txt. What is of more interest to us is that the files are opened inside the for loop but never closed. This is a bad practice in programming that we will discuss later. Now, let's say we reassign the n_files variable to a large number, say 10,000, as shown in the following code:
# Chapter04/example1.py
n_files = 10000
files = []
# method 1
for i in range(n_files):
    files.append(open('output1/sample%i.txt' % i, 'w'))
We would get an error similar to the following:
> python example1.py
Traceback (most recent call last):
File "example1.py", line 7, in <module>
OSError: [Errno 24] Too many open files: 'output1/sample253.txt'
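Looking closely at the error message, we can see that the program failed while trying to open sample253.txt, its 254th file, which means that my laptop could only handle 253 of these files being open simultaneously. As a side note, if you are working on a UNIX-like system, running ulimit -n will show the maximum number of file descriptors that a single process may hold open. That count also covers descriptors the process already holds, such as standard input, output, and error, so it can be slightly higher than the number of files a program manages to open. More generally, this situation arose from what is known as file descriptor leakage. When Python opens a file inside a program, that opened file is essentially represented by an integer, its file descriptor. This integer acts as a reference point that the program can use in order to have access to that file, while not giving the program complete control over the underlying file itself.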
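To make the idea of a file being represented by an integer concrete, the following sketch opens a few files in a temporary folder and prints the descriptor number behind each one via the fileno() method of the file object. The helper name open_and_report and the use of a temporary folder are assumptions made for illustration; the exact integers printed will vary from system to system.

```python
import os
import tempfile


def open_and_report(directory, count=3):
    """Open `count` files for writing and return them with their descriptor numbers."""
    files = [
        open(os.path.join(directory, 'sample%i.txt' % i), 'w')
        for i in range(count)
    ]
    # fileno() returns the integer the operating system uses to identify the open file.
    return files, [f.fileno() for f in files]


with tempfile.TemporaryDirectory() as tmp_dir:
    files, descriptors = open_and_report(tmp_dir)
    print(descriptors)  # distinct non-negative integers; exact values vary
    for f in files:
        f.close()  # release each descriptor back to the operating system
```

Each file that stays open holds on to one such integer, which is why leaving files unclosed eventually exhausts the limit.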
By opening too many files at the same time, our program used up all of the file descriptors available to it, hence the error message. File descriptor leakage can lead to a number of difficult problems, especially in concurrent and parallel programming, namely unauthorized I/O operations on open files. The solution to this is simply to close opened files in a coordinated manner. Let's look at the second method in our Chapter04/example1.py file, in which the for loop closes each file right after opening it:
# Chapter04/example1.py
n_files = 1000
files = []
# method 2
for i in range(n_files):
    f = open('output1/sample%i.txt' % i, 'w')
    files.append(f)
    f.close()
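The same open-then-close pattern can also be written with the with statement mentioned at the start of this section, which closes the file automatically when its block ends, even if an exception is raised inside it. The following is only a sketch under a few assumptions: it writes into a temporary folder rather than output1 so that it runs anywhere, and the helper name write_samples is hypothetical.

```python
import os
import tempfile


def write_samples(directory, n_files):
    """Create n_files text files while holding only one descriptor open at a time."""
    paths = []
    for i in range(n_files):
        path = os.path.join(directory, 'sample%i.txt' % i)
        # The file is closed as soon as the with block exits.
        with open(path, 'w') as f:
            f.write('sample %i\n' % i)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp_dir:
    created = write_samples(tmp_dir, 1000)
    n_created = len(created)
    print(n_created)
```

Because at most one of these files is open at any moment, the loop stays well below the descriptor limit regardless of how large n_files is.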