
How to do it...

Let's get started:

  1. Open a terminal and an editor of your choice, and create a new script.
  2. Inside your script, add the following:
#!/bin/bash
STR1='123 is a number, ABC is alphabetic & aBC123 is alphanumeric.'

echo "-------------------------------------------------"
# Want to find all of the files beginning with an uppercase character and ending with .pdf?
# Quote the pattern so the shell does not expand it before grep sees it.
ls | grep -E '^[[:upper:]].*\.pdf$'

echo "-------------------------------------------------"
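# Just the entries in your current directory whose names begin with an uppercase character?
# -d lists a matching directory itself rather than its contents.
ls -ld [[:upper:]]*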

echo "-------------------------------------------------"
# How about all of the files we created with an expansion using the { } brackets?
# A character class must sit inside a bracket expression: [[:lower:]], not [:lower:].
ls [[:lower:]].test

echo "-------------------------------------------------"
# Files with a specific extension OR two?
echo "${STR1}" > test.txt
ls *.{test,txt}

echo "-------------------------------------------------"
# How about looking for specific punctuation and output on the same line
echo "${STR1}" | grep -o '[[:punct:]]' | xargs echo

echo "-------------------------------------------------"
# How about using groups and single character wildcards (only 5 results)
ls | grep -E "([[:upper:]])([[:digit:]])?.test?" | tail -n 5

exit 0
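The script mixes two different pattern languages: shell globs, which the shell expands before the command runs, and regexes, which grep applies to each line of input. The following is a minimal sketch of that difference, run in a throwaway directory. The file names are hypothetical and are not the ones created earlier in the recipe.

```shell
#!/bin/bash
# Hypothetical sandbox: these file names are for illustration only.
workdir=$(mktemp -d)
cd "${workdir}" || exit 1
touch Report.pdf notes.pdf Z9.test a.test b.txt

# Glob: the shell expands the pattern into file names before ls runs.
# [[:upper:]] matches exactly one uppercase character; * matches the rest.
glob_hits=$(ls [[:upper:]]*.pdf)

# Regex: grep receives the quoted pattern and tests each line of ls output.
# ".*" means "any run of characters"; "\." is a literal dot.
regex_hits=$(ls | grep -E '^[[:upper:]].*\.pdf$')

# One lowercase character followed by .test (note the doubled brackets).
lower_hits=$(ls [[:lower:]].test)

echo "glob:  ${glob_hits}"
echo "regex: ${regex_hits}"
echo "lower: ${lower_hits}"

# Clean up the sandbox.
cd - > /dev/null && rm -rf "${workdir}"
```

In a glob, `*` stands for any run of characters on its own; in a regex, `*` only repeats the item before it, which is why the regex version needs `.*`.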
  3. Now, execute the script; your console should be flooded with output. Most importantly, look at the last five results. Notice the Z9(,) and Z9.test(3) among them? This is the power of a regex at work! Okay, so we can now create and search for a bunch of folders or files using variables, but can we use regexes to find things like variable parameters? Absolutely! See the next step.
  4. In the console, try the following:
$ grep -oP 'name="\K.*?(?=")' www.packtpub.com/index.html
  5. Again, in the console, try the following:
$ grep -P 'name=' www.packtpub.com/index.html
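To see what the two commands above do differently, here is a small sketch that runs the same patterns against an inline sample instead of the downloaded page. The HTML snippet is hypothetical, and the sketch assumes GNU grep built with PCRE support, since `-P` is not available in every grep.

```shell
#!/bin/bash
# Hypothetical sample standing in for www.packtpub.com/index.html.
html='<input type="text" name="search" value="">
<meta name="description" content="demo">
<p>no attribute here</p>'

# -P enables Perl-compatible regexes; -o prints only the matched text.
# \K drops everything matched so far, so name=" is required but not printed.
# .*? is non-greedy, and (?=") is a lookahead that stops before the closing quote.
values=$(printf '%s\n' "${html}" | grep -oP 'name="\K.*?(?=")')

# Without -o, grep selects whole lines containing the pattern;
# -c counts those lines instead of printing them.
lines=$(printf '%s\n' "${html}" | grep -cP 'name=')

echo "${values}"
echo "matching lines: ${lines}"
```

The first command extracts only the attribute values, one per line, while the second selects every complete line in which `name=` appears.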
  6. Can we do better by using a command such as tr to remove newlines, so that we can find elements such as <title> that may span multiple lines?
$ tr '\n' ' ' < www.packtpub.com/index.html | grep -o '<title>.*</title>' 
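The reason tr helps is that grep matches one line at a time. A minimal sketch of the effect, using a hypothetical snippet in which the title element is split across three lines:

```shell
#!/bin/bash
# Hypothetical sample: a <title> element split across three lines.
html='<head>
<title>
Example page
</title>
</head>'

# grep works line by line, so no single line contains both tags.
# grep exits non-zero when nothing matches; "|| true" keeps that harmless.
per_line=$(printf '%s\n' "${html}" | grep -c '<title>.*</title>' || true)

# tr replaces every newline with a space, producing one long line.
joined=$(printf '%s\n' "${html}" | tr '\n' ' ' | grep -o '<title>.*</title>')

echo "matches before tr: ${per_line}"
echo "match after tr:    ${joined}"
```

Because `.*` is greedy, a page containing several title elements would yield one match running from the first opening tag to the last closing tag.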
  7. Now, let's remove a bit more gunk from the screen using cut as a finale. A console is usually 80 characters wide, so let's add line numbers and trim each line of grep output to 80 characters:
$ grep -nP 'name=' www.packtpub.com/index.html | cut -c -80
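As a rough illustration of the trimming step, the sketch below feeds grep a hypothetical two-line sample, one line of which is far wider than a terminal. It uses plain `grep -n`, because a literal pattern such as `name=` does not need `-P`.

```shell
#!/bin/bash
# Hypothetical sample: one short line and one line 200+ characters wide.
long_value=$(printf 'x%.0s' {1..200})
sample="<a name=\"top\">
<input name=\"${long_value}\">"

# -n prefixes each matching line with its line number and a colon.
# cut -c -80 keeps only characters 1 through 80 of every line.
trimmed=$(printf '%s\n' "${sample}" | grep -n 'name=' | cut -c -80)

echo "${trimmed}"
```

The line-number prefix counts toward the 80 characters, so slightly less of the original line survives than the figure suggests.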
Entire books have been dedicated to parsing data with regexes, but the key thing to note is that regexes are not always the best option, either for performance or for markup languages like HTML. For example, when parsing HTML, it is best to use a parser that is aware of the language itself and any language-specific nuances.