- Hands-On Artificial Intelligence for IoT
- Amita Kapoor
- 606字
- 2021-07-02 14:01:59
Working with CSV files with the csv module
In Python, the csv module provides classes and methods for reading and writing CSV files. The csv.reader method creates a reader object from which rows can be read iteratively. Each time a row is read from the file, the reader object returns a list of fields. For example, the following code demonstrates reading the data file and printing rows:
import csv
import os
with open(os.path.join(data_folder,data_file),newline='') as csvfile:
csvreader = csv.reader(csvfile)
for row in csvreader:
print(row)
The rows are printed as a list of field values:
['date', 'time', 'global_active_power', 'global_reactive_power', 'voltage', 'global_intensity', 'sub_metering_1', 'sub_metering_2', 'sub_metering_3'] ['0007-01-01', '00:00:00', '2.58', '0.136', '241.97', '10.6', '0', '0', '0'] ['0007-01-01', '00:01:00', '2.552', '0.1', '241.75', '10.4', '0', '0', '0'] ['0007-01-01', '00:02:00', '2.55', '0.1', '241.64', '10.4', '0', '0', '0']
The csv.writer method returns an object that can be used to write rows to a file. As an example, the following code writes the first 10 rows of the file to a temporary file and then prints it:
# read the file and write first ten rows
with open(os.path.join(data_folder, data_file), newline='') as csvfile, \
open(os.path.join(data_folder, 'temp.csv'), 'w', newline='') as tempfile:
csvreader = csv.reader(csvfile)
csvwriter = csv.writer(tempfile)
for row, i in zip(csvreader, range(10)):
csvwriter.writerow(row)
# read and print the newly written file
with open(os.path.join(data_folder, 'temp.csv'), newline='') as tempfile:
csvreader = csv.reader(tempfile)
for row in csvreader:
print(row)
The delimiter field and the quoting field characters are important attributes that you can set while creating reader and writer objects.
By default, the delimiter field is , and the other delimiters are specified with the delimiter argument to the reader or writer functions. For example, the following code saves the file with | as delimiter:
# read the file and write first ten rows with '|' delimiter
with open(os.path.join(data_folder, data_file), newline='') as csvfile, \
open(os.path.join(data_folder, 'temp.csv'), 'w', newline='') as tempfile:
csvreader = csv.reader(csvfile)
csvwriter = csv.writer(tempfile, delimiter='|')
for row, i in zip(csvreader, range(10)):
csvwriter.writerow(row)
# read and print the newly written file
with open(os.path.join(data_folder, 'temp.csv'), newline='') as tempfile:
csvreader = csv.reader(tempfile, delimiter='|')
for row in csvreader:
print(row)
If you do not specify a delimiter character when the file is read, the rows will be read as one field and printed as follows:
['0007-01-01|00:00:00|2.58|0.136|241.97|10.6|0|0|0']
quotechar specifies a character with which to surround fields. The quoting argument specifies what kind of fields can be surrounded with quotechar. The quoting argument can have one of the following values:
- csv.QUOTE_ALL: All the fields are quoted
- csv.QUOTE_MINIMAL: Only fields containing special characters are quoted
- csv.QUOTE_NONNUMERIC: All non-numeric fields are quoted
- csv.QUOTE_NONE: None of the fields are quoted
As an example, let's print the temp file first:
0007-01-01|00:00:00|2.58|0.136|241.97|10.6|0|0|0 0007-01-01|00:01:00|2.552|0.1|241.75|10.4|0|0|0 0007-01-01|00:02:00|2.55|0.1|241.64|10.4|0|0|0 0007-01-01|00:03:00|2.55|0.1|241.71|10.4|0|0|0 0007-01-01|00:04:00|2.554|0.1|241.98|10.4|0|0|0 0007-01-01|00:05:00|2.55|0.1|241.83|10.4|0|0|0 0007-01-01|00:06:00|2.534|0.096|241.07|10.4|0|0|0 0007-01-01|00:07:00|2.484|0|241.29|10.2|0|0|0 0007-01-01|00:08:00|2.468|0|241.23|10.2|0|0|0
Now let's save it with all fields quoted:
# read the file and write first ten rows with '|' delimiter, all quoting and * as a quote charachetr.
with open(os.path.join(data_folder, data_file), newline='') as csvfile, \
open('temp.csv', 'w', newline='') as tempfile:
csvreader = csv.reader(csvfile)
csvwriter = csv.writer(tempfile, delimiter='|', quotechar='*',quoting=csv.QUOTE_ALL)
for row, i in zip(csvreader, range(10)):
csvwriter.writerow(row)
The file gets saved with the specified quote character:
*0007-01-01*|*00:00:00*|*2.58*|*0.136*|*241.97*|*10.6*|*0*|*0*|*0* *0007-01-01*|*00:01:00*|*2.552*|*0.1*|*241.75*|*10.4*|*0*|*0*|*0* *0007-01-01*|*00:02:00*|*2.55*|*0.1*|*241.64*|*10.4*|*0*|*0*|*0* *0007-01-01*|*00:03:00*|*2.55*|*0.1*|*241.71*|*10.4*|*0*|*0*|*0* *0007-01-01*|*00:04:00*|*2.554*|*0.1*|*241.98*|*10.4*|*0*|*0*|*0* *0007-01-01*|*00:05:00*|*2.55*|*0.1*|*241.83*|*10.4*|*0*|*0*|*0* *0007-01-01*|*00:06:00*|*2.534*|*0.096*|*241.07*|*10.4*|*0*|*0*|*0* *0007-01-01*|*00:07:00*|*2.484*|*0*|*241.29*|*10.2*|*0*|*0*|*0* *0007-01-01*|*00:08:00*|*2.468*|*0*|*241.23*|*10.2*|*0*|*0*|*0*
Remember to read the file with the same arguments; otherwise, the * quote character will be treated as part of the field values and printed as follows:
['*0007-01-01*', '*00:00:00*', '*2.58*', '*0.136*', '*241.97*', '*10.6*', '*0*', '*0*', '*0*']
Using the correct arguments with the reader object prints the following:
['0007-01-01', '00:00:00', '2.58', '0.136', '241.97', '10.6', '0', '0', '0']
Now let's see how we can read CSV files with pandas, another popular Python library.
- 嵌入式系統(tǒng)及其開發(fā)應(yīng)用
- Getting Started with Clickteam Fusion
- Mobile DevOps
- 計(jì)算機(jī)網(wǎng)絡(luò)應(yīng)用基礎(chǔ)
- 永磁同步電動(dòng)機(jī)變頻調(diào)速系統(tǒng)及其控制(第2版)
- 分?jǐn)?shù)階系統(tǒng)分析與控制研究
- MCGS嵌入版組態(tài)軟件應(yīng)用教程
- 基于神經(jīng)網(wǎng)絡(luò)的監(jiān)督和半監(jiān)督學(xué)習(xí)方法與遙感圖像智能解譯
- Linux系統(tǒng)下C程序開發(fā)詳解
- Mastering Ansible(Second Edition)
- 大數(shù)據(jù)素質(zhì)讀本
- Java求職寶典
- Keras Reinforcement Learning Projects
- 中老年人學(xué)數(shù)碼照片后期處理
- iLike職場(chǎng)大學(xué)生就業(yè)指導(dǎo):C和C++方向