Book: Learning Boost C++ Libraries
Author: Arindam Mukherjee
Updated: 2021-07-16 20:49:02

Handling command-line arguments

Command-line arguments, like API parameters, are the remote control buttons that help you tune the behavior of commands to your advantage. A well-designed set of command-line options is behind much of the power of a command. In this section, we will see how the Boost.Program_Options library helps you add support for a rich and standardized set of command-line options to your own programs.

Designing command-line options

C and C++ provide only a primitive abstraction for the command line of your program. Using the two arguments passed to the main function, the number of arguments (argc) and the list of arguments (argv), you can find out about each and every argument passed to the program and their relative ordering. The following program prints argv[0], which is the name or path with which the program was invoked. When run with a set of command-line arguments, the program also prints each argument on a separate line:

#include <iostream>

int main(int argc, char *argv[])
{
  // argv[0] is the name or path used to invoke the program
  std::cout << "Program name: " << argv[0] << '\n';

  // argv[1] .. argv[argc - 1] are the arguments, in order
  for (int i = 1; i < argc; ++i) {
    std::cout << "argv[" << i << "]: " << argv[i] << '\n';
  }
}

Most programs need additional logic to validate and interpret their command-line arguments, so a more elaborate framework is needed to handle them.

The diff command – a case study

Programs usually document a set of command-line options and switches that modify their behavior. Let us take a look at the example of the diff command in Unix. The diff command is run like this:

$ diff file1 file2

It prints the difference between the content of the two files. There are several ways in which you can choose to print the differences. For each different chunk found, you may choose to print a few additional lines surrounding the difference to get a better understanding of the context in which the differing part appears. These surrounding lines or "context" do not differ between the two files. To do this, you can use one of the following alternatives:

$ diff -U 5 file1 file2
$ diff --unified=5 file1 file2

Here, you choose to print five additional lines of context. You can also choose the default of three by specifying:

$ diff --unified file1 file2

In the preceding examples, -U or --unified are examples of command-line options. The former is a short option consisting of a single leading hyphen and a single letter (-U). The latter is a long option with two leading hyphens and a multi-character option name (--unified).

The number 5 is an option value, that is, an argument to the option (-U or --unified) preceding it. The option value is separated from a preceding short option by a space, but from a preceding long option by an equals sign (=).

If you are "diffing" two C or C++ source files, you can get more useful information using the command-line switch or flag -p. A switch is an option that does not take an option value as an argument. Using this switch, you can print the name of the C or C++ function in whose context a particular difference is detected. In this section, we treat -p as a short-only switch with no corresponding long option, although GNU diff also accepts --show-c-function.

The diff command is a powerful tool that can also compare the contents of entire directories. When diffing two directories, if a file exists in one but not the other, diff by default only reports that the file is present in one directory and does not show its contents. However, you may want to see the contents of the new file instead. To do this, you use the -N or --new-file switch. To run diff on two directories of C++ source code and identify changes, we can do it in this way:

$ diff -pN --unified=5 old_source_dir new_source_dir

You don't have to be eagle-eyed to notice that we used an option called -pN. This is actually not a single option but two switches, -p and -N, collapsed together.

Certain patterns or conventions should be evident from this case study:

Starting short options with a single hyphen
Starting long options with a double hyphen
Separating short options and option values with a space
Separating long options and option values with an equals sign
Collapsing short switches together
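The conventions listed above can be sketched with the standard library alone. The following is a minimal illustration, not how any real parser is implemented: the names TokenKind, Token, and classify are hypothetical, and the sketch deliberately does not know which short options take values, so it treats every letter after a single hyphen as a switch.

```cpp
#include <string>
#include <vector>

// Hypothetical token categories for this sketch.
enum class TokenKind { ShortOption, LongOption, Positional };

struct Token {
    TokenKind kind;
    std::string name;   // option name without hyphens, or the raw operand
    std::string value;  // value attached with '=', empty if none
};

// Classify one command-line token according to the conventions above.
// Collapsed short switches are expanded, so "-pN" yields "p" and "N".
std::vector<Token> classify(const std::string& arg)
{
    std::vector<Token> out;
    if (arg.size() > 2 && arg.compare(0, 2, "--") == 0) {
        // Long option: --name or --name=value
        const std::string body = arg.substr(2);
        const std::string::size_type eq = body.find('=');
        if (eq == std::string::npos) {
            out.push_back({TokenKind::LongOption, body, ""});
        } else {
            out.push_back({TokenKind::LongOption,
                           body.substr(0, eq), body.substr(eq + 1)});
        }
    } else if (arg.size() > 1 && arg[0] == '-' && arg[1] != '-') {
        // One or more short switches behind a single hyphen
        for (std::string::size_type i = 1; i < arg.size(); ++i) {
            out.push_back({TokenKind::ShortOption, std::string(1, arg[i]), ""});
        }
    } else {
        // Anything else is an operand such as a file name
        out.push_back({TokenKind::Positional, arg, ""});
    }
    return out;
}
```

Even this small sketch has to special-case each convention by hand, which is the kind of work a declarative library is meant to remove.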

These are de facto standard conventions on highly POSIX-compliant systems, such as Linux. They are, however, by no means the only conventions followed. Windows command lines often use a leading forward slash (/) in place of a hyphen. They often do not distinguish between short and long options, and sometimes use a colon (:) in place of an equals sign to separate an option and its option value. Java commands, as well as commands in several older Unix systems, use a single leading hyphen for both short and long options. Some of them use a space to separate an option and its option value, irrespective of whether it is a short option or a long one. How can you take care of so many complex rules that vary from platform to platform while parsing your command line? This is where the Boost Program Options library makes a big difference.

Using Boost.Program_Options

The Boost Program Options library provides you with a declarative way of parsing command lines. You can specify the set of options and switches and the type of option-values for each option that your program supports. You can also specify which set of conventions you want to support for your command line. You can then feed all of this information to the library functions that parse and validate the command line and extract all the command-line data into a dictionary-like structure from which you can access individual bits of data. We will now write some code to model the previously mentioned options for the diff command:

Listing 2.12a: Using Boost Program Options

 1 #include <boost/program_options.hpp>
 2 #include <iostream>
 3 namespace po = boost::program_options;
 4 namespace postyle = boost::program_options::command_line_style;
 5 
 6 int main(int argc, char *argv[])
 7 {
 8   po::options_description desc("Options");
 9   desc.add_options()
10      ("unified,U", po::value<unsigned int>()->default_value(3),
11             "Print in unified form with specified number of "
12             "lines from the surrounding context")
13      (",p", "Print names of C functions"
14             " containing the difference")
15      (",N", "When comparing two directories, if a file exists in"
16             " only one directory, assume it to be present but"
17             " blank in the other directory")
18      ("help,h", "Print this help message");

In the preceding code snippet, we declare the structure of the command line using an options_description object. Successive options are declared using the overloaded function call operator, operator(), of the object returned by add_options. You can cascade calls to this operator in the same way that you can print multiple values by cascading calls to the insertion operator (<<) on std::cout. This makes for a highly readable specification of the options.

We declare the --unified or -U option specifying both the long and short options in a single string, separated by a comma (line 10). The second argument indicates that we expect a numeric argument, and the default value will be taken as 3 if the argument is not specified on the command line. The third field is the description of the option and will be used to generate a documentation string.

We declare -p and -N as short-only options (lines 13 and 15). Because we give them no long names here, each is introduced with a comma followed by the short option (",p" and ",N"). They also do not take an option value, so we just provide their descriptions.

So far so good. We will now complete the code example by parsing the command line and fetching the values. First, we will specify the styles to follow in Windows and Unix:

Listing 2.12b: Using Boost Program Options

19   int unix_style    = postyle::unix_style
20                      |postyle::short_allow_next;
21
22   int windows_style = postyle::allow_long
23                      |postyle::allow_short
24                      |postyle::allow_slash_for_short
25                      |postyle::allow_long_disguise
26                      |postyle::case_insensitive
27                      |postyle::short_allow_next
28                      |postyle::long_allow_next;

The preceding code highlights some important differences between Windows and Unix conventions:

A more or less standardized Unix style is available ready-made as unix_style. However, we have to build the Windows style ourselves.
The short_allow_next flag allows you to separate a short option and its option value with a space; this is used on both Windows and Unix.
The allow_slash_for_short flag allows short options to be preceded by a forward slash, a common practice on Windows. The library has no separate slash flag for long options, so we add allow_long_disguise, which lets a long option be written with the single leading character used for short options.
The case_insensitive flag is appropriate for Windows, where the usual practice is to have case-insensitive commands and options.
The long_allow_next flag on Windows allows long options and option values to be separated by a space instead of an equals sign.

Now, let us see how we can parse a conforming command line using all of this information. To do this, we will declare an object of type variables_map to read all the data and then parse the command line:

Listing 2.12c: Using Boost Program Options

29   po::variables_map vm;
30   try {
31     po::store(
32       po::command_line_parser(argc, argv)
33          .options(desc)
34          .style(unix_style)  // or windows_style
35          .run(), vm);
36
37     po::notify(vm); 
38
39     if (argc == 1 || vm.count("help")) {
40       std::cout << "USAGE: " << argv[0] << '\n'
41                 << desc << '\n';
42       return 0;
43     }
44   } catch (po::error& poe) {
45     std::cerr << poe.what() << '\n'
46               << "USAGE: " << argv[0] << '\n' << desc << '\n';
47     return EXIT_FAILURE;
48   }

We create a command-line parser using the command_line_parser function (line 32). We call the options member function on the returned parser to specify the parsing rules encoded in desc (line 33). We then chain a call to the style member function to specify the expected style (line 34), and a call to the run member function to actually perform the parsing (line 35). The call to run returns a data structure containing the data parsed from the command line. The call to boost::program_options::store stores this parsed data in the variables_map object vm (lines 31-35). Finally, we check whether the program was invoked without arguments or with the help option, and print the help string (line 39). Streaming the options_description instance desc to an ostream prints a help string that is automatically generated from the command-line rules encoded in desc (line 41). All of this is enclosed in a try-catch block to trap any command-line parsing errors thrown by the call to run (line 35). In the event of such an error, the error details are printed (line 45) along with the usage details (line 46).

Notice the call to notify(…) on line 37. In more advanced uses, we may choose to use values that are read from the command line to set variables or object members, or to perform other post-processing actions. Such actions can be specified for each option while declaring option descriptions, but they are only triggered by the call to notify. For consistency, do not drop the call to notify.

We can now extract the values passed via the command line:

Listing 2.12d: Using Boost Program Options

49   unsigned int context = 0;
50   if (vm.count("unified")) {
51     context = vm["unified"].as<unsigned int>();
52   }
53
54   bool print_cfunc = (vm.count("-p") > 0);  // short-only options are keyed by their dashed name

Parsing positional parameters

If you were observant, you would have noticed that we did nothing to read the two file names, the two main operands of the diff command. We left them out for simplicity, and we will fix this now. We run the diff command like this:

$ diff -pN --unified=5 old_source_dir new_source_dir

The old_source_dir and new_source_dir arguments are called positional parameters. They are not options or switches, nor are they arguments to any options. In order to handle them, we will have to use a couple of new tricks. First of all, we must tell the parser the number and type of these parameters that we expect. Second, we must tell the parser that these are positional parameters. Here is the code snippet:

 1 std::string file1, file2;
 2 po::options_description posparams("Positional params");
 3 posparams.add_options()
 4         ("file1", po::value<std::string>(&file1)->required(), "")
 5         ("file2", po::value<std::string>(&file2)->required(), "");
 6 desc.add(posparams);
 7
 8
 9 po::positional_options_description posOpts;
10 posOpts.add("file1", 1);  // second param == 1 indicates that
11 posOpts.add("file2", 1);  //  we expect only one arg each
12
13 po::store(po::command_line_parser(argc, argv)
14                 .options(desc)
15                 .positional(posOpts)
16                 .style(windows_style)
17                 .run(), vm);
18 po::notify(vm);  // checks required() and fills file1 and file2

In the preceding code, we set up a second options description object called posparams to identify the positional parameters. We add options with the names "file1" and "file2", and indicate that these parameters are mandatory using the required() member function of the value parameter (lines 4 and 5). We also specify two string variables, file1 and file2, to store the positional parameters. All of this is added to the main options description object desc (line 6). To keep the parser from looking only for actual options called "--file1" and "--file2", we must tell it that these are positional parameters. This is done by defining a positional_options_description object (line 9) and adding the options that should be treated as positional options (lines 10 and 11). The second parameter in the call to add(…) specifies how many positional parameters should be consumed for that option. Since we want exactly one file name each for file1 and file2, we specify 1 in both calls. Positional parameters on the command line are interpreted in the order in which they are added to the positional options description. Thus, in this case, the first positional parameter will be treated as file1, and the second will be treated as file2.

Multiple option values

In some cases, a single option may take multiple option values. For example, during compilation, you will use the -I option multiple times to specify multiple directories. To parse such options and their option values, you can specify the target type as a vector, as shown in the following snippet:

po::options_description desc("Options");
desc.add_options()
     ("include,I", po::value<std::vector<std::string> >(),
      "Include files.")
     (…);

This will work on an invocation like this:

$ c++ source.cpp -o target -I path1 -I path2 -I path3

In some cases, however, you might want to specify multiple option values, but you specify the option itself only once. Let us say that you are running a command to discover assets (local storage, NICs, HBAs, and so on) connected to each of a set of servers. You can have a command like this:

$ discover_assets --servers svr1 svr2 svr3 --uid user

In this case, to model the --servers option, you would need to use the multitoken() directive as shown here:

po::options_description desc("Options");
desc.add_options()
     ("servers,S",
      po::value<std::vector<std::string> >()->multitoken(),
      "List of hosts or IPs.")
     ("uid,U", po::value<std::string>(), "User name");

You can retrieve vector-valued parameters through the variable map like this:

std::vector<std::string> servers =
    vm["servers"].as<std::vector<std::string> >();

Alternatively, you can use variable hooks at the time of option definition like this:

std::vector<std::string> servers;
desc.add_options()
     ("servers,S",
      po::value<std::vector<std::string> >(&servers)
         ->multitoken(),
      "List of hosts or IPs.")…;

Make sure that you don't forget to call notify after parsing the command line.

Tip

Trying to support positional parameters and multitoken options together in the same command can confuse the parser and should generally be avoided.

The Program Options library uses Boost Any for its implementation. For the Program Options library to work correctly, you must not disable the generation of RTTI for your programs.