- Bash Cookbook
- Ron Brash Ganesh Naik
- 2021-07-23 19:17:40
Reading delimited data and altered output format
Every day, we open many files in many different formats. However, when working with large amounts of data, it is good practice to use standard formats. One of these is Comma-Separated Values (CSV), which uses a comma (,) to delimit the elements on each row. This is particularly useful when you have a large number of records that will be processed by a script. For example, every school semester, Bob, the system administrator, needs to create a series of new users and set their information. Bob receives a standardized CSV (like the following snippet) from the people in charge of attendance:
Rbrash,Ron,Brash,01/31/88,+11234567890,rbrash@acme.com,FakePassword9000
...
If Bob only needs to read this information into an array and create users, it is relatively trivial for him to parse the CSV and create each record in a single scripted action. This lets Bob focus his time and effort on other important issues, such as troubleshooting end-user WiFi problems.
While this is a trivial example, similar files may use different delimiters (a , or a $ sign, for example), hold different data, and have different structures. However, each file works on the premise that each line is a record that needs to be read into some structure (whatever it may be), such as SQL tables, Bash arrays, and so on:
Line1Itself: Header (optional and might not be present)
Line2ItselfIsOneRec:RecordDataWithDelimiters:endline (Windows \r\n, Linux \n)
....
In the preceding pseudo-CSV example, there is a header, which is optional and may not be present, followed by several lines, each being a record. Bob has many ways to parse the CSV, but he may use specialized functions that apply a strategy such as:
# Loop through each item until done
for each line in CSV:
    # Do something with the data, such as create a user
    # Move on to the next item if it exists
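The strategy above can be sketched in Bash with a `while read` loop that splits each record on commas. This is a minimal sketch, not a definitive implementation: the function name `process_users` and the field names are assumptions inferred from the sample record, and no user is actually created (the loop only prints what it would do). It also assumes that no field contains an embedded comma or quoted text.

```shell
#!/usr/bin/env bash
# Sketch: walk a CSV one record at a time and split each record on commas.
# Field names are assumptions inferred from the sample record.
process_users() {
    local username first last dob phone email password
    while IFS=',' read -r username first last dob phone email password; do
        [[ -z "${username}" ]] && continue      # skip blank lines
        # Placeholder for the real work, such as creating the user
        printf 'Would create user %s (%s %s) <%s>\n' \
            "${username}" "${first}" "${last}" "${email}"
    done
}

# Usage: feed the sample record on standard input
process_users <<'EOF'
Rbrash,Ron,Brash,01/31/88,+11234567890,rbrash@acme.com,FakePassword9000
EOF
```

Setting `IFS=','` only for the `read` command keeps the delimiter change local to that one call, so the rest of the script still splits on whitespace as usual.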
To read in the data, Bob (or you) may resort to using:
- For loops and arrays
- A form of iterator
- Manually walking through each line (not efficient)
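The first option in the list, for loops and arrays, might look like the following. It is a sketch under stated assumptions: `load_records`, `records`, and `fields` are hypothetical names chosen for illustration, `mapfile` requires Bash 4 or later, and fields are assumed to contain no embedded commas.

```shell
#!/usr/bin/env bash
# Sketch: load every record into an array, then iterate with a for loop.
# load_records is a hypothetical helper; mapfile requires Bash 4 or later.
load_records() {
    mapfile -t records      # one element per input line, stored in the global array
}

load_records <<'EOF'
Rbrash,Ron,Brash,01/31/88,+11234567890,rbrash@acme.com,FakePassword9000
EOF

for i in "${!records[@]}"; do
    record=${records[i]%$'\r'}                  # drop a Windows carriage return, if any
    IFS=',' read -r -a fields <<< "${record}"   # split the record into fields
    printf 'record %d: %d fields, username=%s\n' "${i}" "${#fields[@]}" "${fields[0]}"
done
```

Reading everything into an array first is convenient when records need to be revisited or counted, at the cost of holding the whole file in memory.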
Once the input data has been read in, the next step is to do something with it. Is it to be transformed? Is it to be used immediately? Sanitized? Stored? Or converted to another format? As in Bob's case, many things can be done with the data read in by the script.
When it comes to outputting the data, we can convert it to XML or JSON, or even insert it into a database using SQL. However, this process requires knowing at least two things: the format of the input data and the format of the output data.
This recipe walks you through reading a trivial CSV and outputting the data in a few arbitrary formats.
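As an illustration of knowing both formats, the following sketch reads comma-delimited records (the input format) and prints one JSON object per record (the output format). The function name `csv_to_json` and the JSON key names are assumptions, not part of the original data, and the values are assumed to contain no quotes, backslashes, or embedded commas, since no escaping is performed.

```shell
#!/usr/bin/env bash
# Sketch: turn each comma-delimited record into a JSON object.
# Key names are assumptions; values are not escaped, so they must not
# contain quotes, backslashes, or commas.
csv_to_json() {
    local username first last dob phone email password
    while IFS=',' read -r username first last dob phone email password; do
        printf '{"username":"%s","first":"%s","last":"%s","email":"%s"}\n' \
            "${username}" "${first}" "${last}" "${email}"
    done
}

# Usage: convert the sample record
echo 'Rbrash,Ron,Brash,01/31/88,+11234567890,rbrash@acme.com,FakePassword9000' | csv_to_json
```

For data that may contain special characters, a dedicated CSV or JSON tool is likely a safer choice than hand-built `printf` templates.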