- Rust Standard Library Cookbook
- Jan Nils Ferner Daniel Durante
- 2021-08-27 19:45:06
How it works...
You can construct a regex object by calling Regex::new() with a valid regex string [7]. Most of the time, you will want to pass a raw string in the form of r"...". Raw means that every character in the string is taken literally, with no escape sequences processed. This matters because of the backslash (\) character, which regex uses to represent a couple of important concepts, such as digits (\d) or whitespace (\s). However, Rust already uses the backslash in normal strings to escape special non-printable symbols, such as the newline (\n) or the tab (\t) [23]. If we wanted to use a backslash in a normal string, we would have to escape it by repeating it (\\). The regex on line [14] would then have to be rewritten as:
"(\\d{2}).(\\d{2}).(\\d{4})"
Worse yet, if we wanted to match the backslash itself, we would have to escape it a second time, because regex also treats it as a special character. In a normal string, a single literal backslash would therefore have to be written four times (\\\\)!
We can save ourselves the lost readability and the confusion by using raw strings and writing our regex normally. In fact, it is considered good style to use raw strings for every regex, even when it doesn't contain any backslashes [33]. This helps your future self if you notice down the line that you would actually like to use a feature that requires a backslash.
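The escaping rules described above can be checked with plain string comparisons, without involving the regex crate at all. The following is a minimal sketch; the function names are hypothetical and only serve the illustration.

```rust
// Hypothetical helpers: each pair returns the same text, written once as a
// raw string and once as a normal string with escaped backslashes.

/// The date pattern as a raw string: backslashes are kept literally.
fn date_pattern_raw() -> &'static str {
    r"(\d{2}).(\d{2}).(\d{4})"
}

/// The same pattern as a normal string: every backslash must be doubled.
fn date_pattern_escaped() -> &'static str {
    "(\\d{2}).(\\d{2}).(\\d{4})"
}

/// A regex that matches one literal backslash, as a raw string.
fn backslash_pattern_raw() -> &'static str {
    r"\\"
}

/// The same regex as a normal string: four backslashes in the source code.
fn backslash_pattern_escaped() -> &'static str {
    "\\\\"
}

fn main() {
    // Both spellings produce identical string contents.
    assert_eq!(date_pattern_raw(), date_pattern_escaped());
    assert_eq!(backslash_pattern_raw(), backslash_pattern_escaped());

    // The pattern for a literal backslash is only two characters long.
    assert_eq!(backslash_pattern_raw().len(), 2);

    println!("{}", date_pattern_raw());
}
```

The raw versions read exactly like the regex you would write on paper, which is the main argument for using them everywhere.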
We can iterate over the results of our regex [18]. The object we get on every match is a collection of our capture groups. Keep in mind that the zeroth index is always the entire match [19]. The first index is then the string from our first capture group, the second index is the string of the second capture group, and so on [20]. Unfortunately, we do not get a compile-time check on our index, so if we accessed &cap[4] with only three capture groups, our program would compile but then panic at runtime.
When replacing, we follow the same concept: $0 is the entire match, $1 is the result of the first capture group, and so on. To make our lives easier, we can give the capture groups names by starting them with ?P<somename> [29] and then use these names when replacing [31].
There are many flags that you can specify, in the form of (?flag), for fine-tuning, such as i, which makes the match case insensitive [33], or x, which ignores whitespace in the regex string. If you want to read up on them, see the regex crate's documentation (https://doc.rust-lang.org/regex/regex/index.html). Most of the time, though, you can get the same result by using the RegexBuilder that is also part of the regex crate [36]. Both of the rust_regex objects we generate on lines [33] and [36] are equivalent. While the second version is definitely more verbose, it is also much easier to understand at first glance.