Bash Cookbook
Ron Brash, Ganesh Naik
Calculating statistics and reducing duplicates based on file contents
At first glance, calculating statistics based on the contents of a file might not seem like one of the more interesting tasks you can accomplish with Bash scripting; however, it can be useful in several circumstances. Imagine that our program takes user input from several commands. We could calculate the length of that input to determine whether it is too short or too long. Alternatively, we could measure the size of a string to determine buffer sizes for a program written in another programming language (such as C/C++):
$ wc -c <<< "1234567890"
11 # Note there are 10 chars plus the trailing newline (\n) that the here-string adds
$ echo -n "1234567890" | wc -c
10
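In a script, we would typically capture this length in a variable and compare it against some limits. The following lines are a minimal sketch of that idea; the variable names (user_input, len) and the limit of 32 bytes are purely illustrative:
$ user_input="1234567890"
$ len=$(printf '%s' "$user_input" | wc -c)  # printf '%s' adds no trailing newline, so only the characters are counted
$ echo "$len"
10
$ if [ "$len" -gt 32 ]; then echo "Input too long"; fi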
Better yet, what if we used a command called strings to output all of the printable ASCII strings found in a file? The strings program prints every occurrence of a string, even if there are duplicates. Using other programs such as sort and uniq (or a combination of the two), we can also sort the output and reduce duplicates, for example to calculate the number of unique lines within a file:
$ strings /bin/ls > unalteredoutput.txt
$ ls -lah unalteredoutput.txt
-rw-rw-r-- 1 rbrash rbrash 22K Nov 24 11:17 unalteredoutput.txt
$ strings /bin/ls | sort -u > sortedoutput.txt
$ ls -lah sortedoutput.txt
-rw-rw-r-- 1 rbrash rbrash 19K Nov 24 11:17 sortedoutput.txt
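To turn the reduced output into an actual number, we can pipe the sorted, de-duplicated strings into wc -l, or use uniq -c to see how often each string occurs. These commands are a small sketch of that idea; the resulting counts will vary depending on the /bin/ls binary on your system:
$ strings /bin/ls | sort -u | wc -l                      # number of unique printable strings
$ strings /bin/ls | sort | uniq -c | sort -rn | head -5  # five most frequent strings with their counts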
Now that we have seen a few reasons why we might need to perform some basic statistics, let's carry on with the recipe.