
Using wildcards and regexes

The previous section introduced recursive functions and wildcards. This section builds on those same fundamental primitives to create more advanced searches using regexes and globbing.

It also combines them with a number of built-in Bash features and some one-liners (nifty tricks) to enhance our searches. In short:

  • A wildcard can be: *, {*.ooh,*.ahh}, /home/*/path/*.txt, [0-9], [!a], ?, [a,p] m
  • A regex can be: $, ^, *, [], [^], | (be careful to quote or escape this so the shell does not treat it as a pipe)
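The wildcard forms listed above can be tried out with a minimal sketch such as the following. The filenames and the temporary directory are hypothetical and exist only for illustration; the script assumes Bash, since brace expansion is not available in every shell.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.ooh b.ahh c.txt file1.log file2.log fileA.log

star=$(echo *.log)            # * matches any run of characters
braces=$(echo {*.ooh,*.ahh})  # brace expansion yields two patterns, each globbed
single=$(echo file?.log)      # ? matches exactly one character
digit=$(echo file[0-9].log)   # [0-9] matches one character in the range 0 to 9
negated=$(echo [!ab].*)       # [!ab] matches one character that is neither a nor b

printf '%s\n' "star:    $star" "braces:  $braces" "single:  $single" \
              "digit:   $digit" "negated: $negated"

# Clean up the temporary directory.
cd - >/dev/null && rm -rf "$workdir"
```

Note that the shell expands each pattern before echo runs, so echo only ever sees the resulting filenames.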

Globbing is a rather computer-centric term that, in layman's terms, simply means extended pattern matching on filenames. Wildcards are the symbols used to describe those patterns. Regex is short for regular expression, a notation for describing a pattern that is to be matched against a series of data.
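To make the distinction concrete, here is a small sketch that selects the same files twice: once with a glob, which the shell expands itself, and once with a regex, which grep applies to each line of input. The filenames are hypothetical and the temporary directory is only there to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch notes.txt report.txt image.png

# Globbing: the shell expands *.txt into filenames before printf runs.
glob_matches=$(printf '%s\n' *.txt)

# Regex: grep receives the pattern and tests every input line against it.
# The dot is escaped because an unescaped . matches any character in a regex.
regex_matches=$(printf '%s\n' * | grep '\.txt$')

echo "glob:"
echo "$glob_matches"
echo "regex:"
echo "$regex_matches"

# Clean up the temporary directory.
cd - >/dev/null && rm -rf "$workdir"
```

The two approaches agree here, but the syntax differs: in a glob, * stands for any run of characters on its own, whereas in a regex, * repeats the preceding element.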

Globbing in Bash is powerful, but it is likely not the best place to perform more advanced or intricate pattern matching. In those cases, Python or another language or tool might be more appropriate.

As we can imagine, globbing and pattern matching are really useful, but not every utility or application supports them. Usually, though, patterns can be used at the command line with utilities such as grep, which accepts regexes. For example:

$ ls -l | grep '[[:lower:]][[:digit:]]' # Notice no result
$ touch z0.test
$ touch a1.test
$ touch A2.test
$ ls -l | grep '[[:lower:]][[:digit:]]'
-rw-rw-r-- 1 rbrash rbrash 0 Nov 15 11:31 a1.test
-rw-rw-r-- 1 rbrash rbrash 0 Nov 15 11:31 z0.test

Here, the output of ls is piped into grep with a regex. The first run returns nothing. After we touch three files and re-run the command, the regex filters the output down to the files whose names contain a lowercase character followed by a single digit. A2.test is excluded because its digit follows an uppercase character.
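The regex in this example is not anchored, so it matches a lowercase character followed by a digit anywhere on the line. If the intent is to match only names that start that way, a ^ anchor can be added. The following is a minimal sketch, assuming plain ls output (one name per line when piped) and one extra hypothetical file, xa3.test, to show the difference:

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch z0.test a1.test A2.test xa3.test

# Unanchored: a lowercase letter followed by a digit anywhere in the name.
unanchored=$(ls | grep '[[:lower:]][[:digit:]]')

# Anchored with ^: the name must begin with a lowercase letter, then a digit.
anchored=$(ls | grep '^[[:lower:]][[:digit:]]')

echo "unanchored:"
echo "$unanchored"
echo "anchored:"
echo "$anchored"

# Clean up the temporary directory.
cd - >/dev/null && rm -rf "$workdir"
```

With these files, the unanchored pattern also reports xa3.test (because of the a3 in the middle), while the anchored one reports only a1.test and z0.test.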

If we wanted to refine a pattern for grep (or another command) further, we could use any of the following character classes inside a bracket expression:

  • [:alpha:]: Alphabetic characters (both uppercase and lowercase)
  • [:lower:]: Lowercase letters
  • [:upper:]: Uppercase letters
  • [:digit:]: Decimal digits 0 to 9
  • [:alnum:]: Alphanumeric (all digits and alphabetic characters)
  • [:space:]: White space meaning spaces, tabs, and newlines
  • [:graph:]: Printable characters excluding spaces
  • [:print:]: Printable characters including spaces
  • [:punct:]: Punctuation (for example, a period)
  • [:cntrl:]: Control characters (non-printable characters, such as the one sent when you press Ctrl + C)
  • [:xdigit:]: Hexadecimal digits (0 to 9, A to F, and a to f)
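As a rough illustration of a few of these classes, the sketch below pulls pieces out of a sample string. The string is made up for this example, and grep -E -o (extended regexes, print only the matching part) is assumed to be available, as it is in GNU and BSD grep.

```shell
#!/usr/bin/env bash
# Hypothetical sample text used only for this illustration.
sample='Error 42: disk /dev/sda1 is FULL!'

# First run of decimal digits in the string.
digits=$(printf '%s\n' "$sample" | grep -E -o '[[:digit:]]+' | head -n 1)

# Runs of two or more uppercase letters.
upper_words=$(printf '%s\n' "$sample" | grep -E -o '[[:upper:]]{2,}')

# Keep only punctuation: tr -cd deletes everything NOT in the given class.
punct_only=$(printf '%s' "$sample" | tr -cd '[:punct:]')

echo "digits:      $digits"       # 42
echo "upper_words: $upper_words"  # FULL
echo "punct_only:  $punct_only"   # ://!
```

Note the doubled brackets in the grep patterns: [:digit:] is only the class name, and it must sit inside a bracket expression, as in [[:digit:]]. The tr utility takes the class name in a single pair of brackets.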