Title: Mastering the C++17 STL
Author: Arthur O'Dwyer
Last updated: 2021-07-08 10:20:23

Shunting data with std::copy

We've just seen our first few two-range algorithms. The <algorithm> header is full of two-range algorithms and their siblings, the one-and-a-half-range algorithms. What's the simplest possible such algorithm?

A reasonable answer would be: "Copy each data element from the first range into the second range." Indeed, the STL provides that algorithm, under the name std::copy:

    template<class InIt, class OutIt>
    OutIt copy(InIt first1, InIt last1, OutIt destination)
    {
      while (first1 != last1) {
        *destination = *first1;
        ++first1;
        ++destination;
      }
      return destination;
    }

Notice that this is a one-and-a-half-range algorithm. The standard library does not provide a two-range version of std::copy; the assumption is that if you are writing into a buffer, you must already have checked its size, so checking "are we at the end of the buffer yet?" inside the loop would be both redundant and inefficient.
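To make the contrast concrete, here is a rough sketch of what a two-range copy might look like if you wrote it yourself. The name bounded_copy and its return type are assumptions made for this illustration; nothing with this name or signature exists in the standard library.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of the standard library: a "two-range" copy
// that stops as soon as either the input or the output range is exhausted.
template<class InIt, class OutIt>
std::pair<InIt, OutIt> bounded_copy(InIt first1, InIt last1,
                                    OutIt first2, OutIt last2)
{
    while (first1 != last1 && first2 != last2) {
        *first2 = *first1;
        ++first1;
        ++first2;
    }
    // Report how far we got in both ranges.
    return {first1, first2};
}

// Copies five elements into a three-element buffer and returns the number
// of elements actually written.
inline int bounded_copy_demo()
{
    std::vector<int> src = {1, 2, 3, 4, 5};
    std::array<int, 3> dst = {0, 0, 0};
    auto [in_end, out_end] =
        bounded_copy(src.begin(), src.end(), dst.begin(), dst.end());
    assert(in_end == src.begin() + 3);  // two input elements were left unread
    return static_cast<int>(out_end - dst.begin());
}
```

The extra comparison on every iteration is exactly the cost that std::copy avoids by trusting the caller to have sized the destination.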

Now I can practically hear you exclaiming: "Horrors! This is the same crude logic that brought us strcpy, sprintf, and gets! This is an invitation to buffer overflows!" Well, if you were to exclaim thusly, you'd be right about the bad behavior of gets--in fact, the gets function has been officially removed from the C++17 standard library. And you'd be right about sprintf--anyone who needs that functionality is better off using the range-checked version snprintf, which is analogous to a "two-range algorithm" in this context. But about strcpy I'd disagree. With gets it is impossible to know the correct size for the output buffer; with sprintf it is difficult; but with strcpy it is trivial: you just measure the strlen of the input buffer (plus one for the terminating null character) and that's your answer. Likewise with std::copy, the relationship between "input elements consumed" and "output elements produced" is exactly one-to-one, so sizing the output buffer doesn't present a technical challenge.
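As a small illustration of that sizing argument, the following sketch measures the input first and only then copies. The helper names copy_c_string and copy_to_sized_vector are made up for this example, and the second helper assumes its iterators are at least forward iterators, so that the range can be traversed twice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: copies a C string into a freshly sized buffer.
// The buffer needs strlen(src) characters plus one for the trailing '\0'.
inline std::vector<char> copy_c_string(const char *src)
{
    std::vector<char> buf(std::strlen(src) + 1);
    std::strcpy(buf.data(), src);
    return buf;
}

// Hypothetical helper: std::copy produces exactly one output element per
// input element, so std::distance tells us how big the destination must be.
// Assumes InIt is at least a forward iterator (the range is walked twice).
template<class InIt>
std::vector<int> copy_to_sized_vector(InIt first, InIt last)
{
    std::vector<int> out(std::distance(first, last));
    std::copy(first, last, out.begin());
    return out;
}
```

With a std::list<int> holding 1, 2, 3 as input, copy_to_sized_vector allocates three elements up front and std::copy fills them without any bounds check inside the loop.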

Notice that the parameter we called destination is an output iterator. This means that we can use std::copy, not merely to shunt data around in memory, but even to feed data to an arbitrary "sink" function. For example:

    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <boost/iterator/iterator_facade.hpp>

    class putc_iterator : public boost::iterator_facade<
      putc_iterator, // Derived: the iterator class itself
      const putc_iterator, // Value: what dereferencing yields
      std::output_iterator_tag
      >
    {
      friend class boost::iterator_core_access;

      auto& dereference() const { return *this; }
      void increment() {}
      bool equal(const putc_iterator&) const { return false; }
    public:
      // This iterator is its own proxy object!
      void operator= (char ch) const { putc(ch, stdout); }
    };

    void test()
    {
      std::string s = "hello";
      std::copy(s.begin(), s.end(), putc_iterator{});
    }

You may find it instructive to compare this version of our putc_iterator to the version from Chapter 2, Iterators and Ranges. This version uses boost::iterator_facade, which was introduced at the end of that chapter, and it also uses a common trick: dereferencing returns *this instead of a new proxy object.
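If Boost is not available, the same "iterator as its own proxy" trick can be sketched by hand. The class below is only an illustration: the name sink_iterator, the choice to store a pointer to the callable, and the helper collect_via_sink are all assumptions of this example rather than anything from the standard library or from Boost.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical output iterator that forwards every assigned character to a
// callable "sink". It stores a pointer to the sink, so it stays cheap to copy.
template<class Sink>
class sink_iterator {
    Sink *sink_;
public:
    using iterator_category = std::output_iterator_tag;
    using difference_type = std::ptrdiff_t;
    using value_type = void;
    using pointer = void;
    using reference = void;

    explicit sink_iterator(Sink& sink) : sink_(&sink) {}

    // Dereferencing and incrementing do nothing; the iterator acts as its
    // own proxy object, and all the work happens in operator=(char).
    sink_iterator& operator*() { return *this; }
    sink_iterator& operator++() { return *this; }
    sink_iterator& operator++(int) { return *this; }

    sink_iterator& operator=(char ch) { (*sink_)(ch); return *this; }
};

// Hypothetical helper: feeds each character of s to a lambda that appends
// it to a string, then returns what the lambda collected.
inline std::string collect_via_sink(const std::string& s)
{
    std::string collected;
    auto sink = [&collected](char ch) { collected += ch; };
    std::copy(s.begin(), s.end(), sink_iterator<decltype(sink)>(sink));
    return collected;
}
```

Swapping the lambda for one that calls putc(ch, stdout) would reproduce the behavior of putc_iterator; collecting into a string here simply makes the effect easy to check.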

Now we can use the flexibility of destination to solve our concerns about buffer overflow! Suppose that, instead of writing into a fixed-size array, we were to write into a resizable std::vector (see Chapter 4, The Container Zoo). Then "writing an element" corresponds to "pushing an element back" on the vector. So we could write an output iterator very similar to putc_iterator, that would push_back instead of putc, and then we'd have an overflow-proof way of filling up a vector. Indeed, the standard library provides just such an output iterator, in the <iterator> header:

    namespace std {
      template<class Container>
      class back_insert_iterator {
        using CtrValueType = typename Container::value_type;
        Container *c;
      public:
        using iterator_category = output_iterator_tag;
        using difference_type = void;
        using value_type = void;
        using pointer = void;
        using reference = void;

        explicit back_insert_iterator(Container& ctr) : c(&ctr) {}

        auto& operator*() { return *this; }
        auto& operator++() { return *this; }
        auto& operator++(int) { return *this; }

        auto& operator= (const CtrValueType& v) {
            c->push_back(v);
            return *this;
        }
        auto& operator= (CtrValueType&& v) {
            c->push_back(std::move(v));
            return *this;
        }
      };
  
      template<class Container>
      auto back_inserter(Container& c)
      {
         return back_insert_iterator<Container>(c);
      }
    }

    void test()
    {
      std::string s = "hello";
      std::vector<char> dest;
      std::copy(s.begin(), s.end(), std::back_inserter(dest));
      assert(dest.size() == 5);
    }

The function call std::back_inserter(dest) simply returns a back_insert_iterator object. In C++17, we could rely on class template argument deduction (template type deduction for constructors) and write the body of that function as simply return std::back_insert_iterator(dest); or dispense with the function entirely and just write std::back_insert_iterator(dest) directly in our code--where C++14 code would have to "make do" with std::back_inserter(dest). However, why would we want all that extra typing? The name back_inserter was deliberately chosen to be easy to remember, since it's the one that we were expected to use most often. Although C++17 allows us to write std::pair in place of std::make_pair, and std::tuple in place of std::make_tuple, it would be silly to write the cumbersome std::back_insert_iterator in place of std::back_inserter. You should prefer std::back_inserter(dest) even in C++17.
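The two spellings can be put side by side in a short sketch. It assumes a C++17 compiler, since the second function relies on deducing the container type from the constructor argument; the function names copy_with_helper and copy_with_deduction are invented for this comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// The recommended spelling: the std::back_inserter helper function.
inline std::vector<char> copy_with_helper(const std::string& s)
{
    std::vector<char> dest;
    std::copy(s.begin(), s.end(), std::back_inserter(dest));
    return dest;
}

// The C++17-only spelling: name the class template directly and let the
// compiler deduce Container = std::vector<char> from the argument.
inline std::vector<char> copy_with_deduction(const std::string& s)
{
    std::vector<char> dest;
    std::copy(s.begin(), s.end(), std::back_insert_iterator(dest));
    return dest;
}
```

Both functions build the same vector; the only difference is how much you have to type.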
