- Perl 6 Deep Dive
- Andrew Shitov
- 2021-07-03 00:05:45
Whitespaces and unspaces
As we just saw, a Perl 6 program can make extensive use of Unicode characters outside the conventional ASCII set. This also applies to whitespace. Whitespace characters are the gaps between the elements of a program, traditionally represented by spaces (ASCII code 0x20), tabs (0x09), and newlines (a single line feed character, 0x0A, in Unix, and a sequence of two characters, carriage return 0x0D followed by line feed 0x0A, in Windows). Perl 6 extends the concept of whitespace and accepts Unicode whitespace in every place in the code where a regular space is allowed. Be careful when you work with existing code that, for some reason, is filled with such Unicode characters.
The whitespace character set in Perl 6 includes characters that belong to one of the following Unicode general categories:
- Zs: Separator, Space
- Zl: Separator, Line
- Zp: Separator, Paragraph
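You can find a complete list of whitespace characters at https://en.wikipedia.org/wiki/Whitespace_character. Among them are the regular space, the non-breaking space, and the thin space. The horizontal and vertical tabs and the newline characters are also treated as whitespace, although Unicode classifies them as control characters (Cc) rather than as separators.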
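To see these categories in practice, here is a minimal sketch that prints the Unicode name and general category of a few whitespace characters. The choice of sample characters is only an illustration; the built-in uniname and uniprop methods do the work, and uniprop called without arguments reports the general category.

```raku
# A few sample whitespace characters, chosen for illustration only:
# space, no-break space, thin space, line separator, paragraph separator.
my @samples = ' ', "\x[00A0]", "\x[2009]", "\x[2028]", "\x[2029]";

for @samples -> $char {
    # uniprop without arguments returns the general category (Zs, Zl, or Zp here)
    say $char.uniname, ' => ', $char.uniprop;
}
```

The first three characters should be reported as Zs, the line separator as Zl, and the paragraph separator as Zp.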
In general, Perl 6 lets you format a program however you want. There are, however, a few rules about where spaces may occur, and you have to follow them when writing a Perl 6 program.
If the language forbids whitespace at a particular place in the code, but you want to format the program more spaciously, you can add a so-called unspace. This is a sequence that starts with a backslash placed immediately after the previous piece of code, followed by one or more whitespace characters. It resembles the backslash at the end of a Unix command line that continues on the next line.
Let's take a look at the most important cases where the language rules regarding spaces are strict and may conflict with your habits.
The first example is a function call. In Perl 6, parentheses are not required around the arguments of a function, but as soon as you use them, you cannot have a space between the name of the function and the opening parenthesis. Examine the following three calls:
say add 4, 5;    # OK, no parentheses
say add(4, 5);   # OK, no space
say add (6, 7);  # Error
The first two lines are correct, while the last one produces an error, as shown here:
Too few positionals passed; expected 2 arguments but got 1
The error message may look misleading, but remember that in Perl 6 you can pass lists and arrays to a function. Because of the space, the parentheses no longer delimit the argument list: the add (6, 7) construction is interpreted as calling the function with a single argument, the two-element list (6, 7).
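The effect of the space can be observed directly. The following sketch uses two hypothetical helpers that are not part of the text, describe and count-args, to show what a function actually receives in each form of the call.

```raku
# Hypothetical helper, not from the text: describes its single argument.
sub describe($arg) {
    $arg.^name ~ ' with ' ~ $arg.elems ~ ' element(s)'
}

# Hypothetical helper: counts the positional arguments it received.
sub count-args(|args) {
    args.elems
}

say describe (6, 7);    # the whole list arrives as one argument
say count-args(6, 7);   # parenthesis attached to the name: two arguments
say count-args (6, 7);  # space before the parenthesis: one argument
```

With the space, the function sees one List; without it, the function sees two separate integers.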
If you still prefer visual separation of the argument list and the function name, place an unspace between them as follows:
say add\ (6, 7);
Now the code compiles without complaints. Newlines are also allowed inside an unspace; consider the following example:
say add\
    (6, 7);
It is also possible to format the code differently, leaving the opening parenthesis on the same line as the function name, as follows:
say add(
    6, 7
);
This approach may be handy when you need to pass many arguments and, for example, comment on each of them:
say add(
    6, # first argument
    7  # second argument
);
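The valid call styles discussed in this section can be collected into one runnable sketch. The definition of add below is an assumption made for illustration, since the snippets in this section call the function without showing it.

```raku
# Assumed definition of add, for illustration only.
sub add($a, $b) { $a + $b }

say add 4, 5;      # no parentheses
say add(4, 5);     # parenthesis directly after the name
say add\ (6, 7);   # unspace on a single line

# An unspace that spans a newline
say add\
    (6, 7);

# Opening parenthesis kept next to the name, one argument per line
say add(
    6, # first argument
    7  # second argument
);
```

The first two calls print 9 and the remaining three print 13.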
We will talk more about functions in Chapter 6, Subroutines. However, for now, let's return to the methods of organizing the source code.