
Shunting data with std::copy

We've just seen our first few two-range algorithms. The <algorithm> header is full of two-range algorithms and their siblings, the one-and-a-half-range algorithms. What's the simplest possible such algorithm?

A reasonable answer would be: "Copy each data element from the first range into the second range." Indeed, the STL provides that algorithm, under the name std::copy:

    template<class InIt, class OutIt>
    OutIt copy(InIt first1, InIt last1, OutIt destination)
    {
        while (first1 != last1) {
            *destination = *first1;
            ++first1;
            ++destination;
        }
        return destination;
    }

Notice that this is a one-and-a-half-range algorithm: it takes both ends of the input range but only the start of the output range. The standard library does not provide a two-range version of std::copy; the assumption is that if you are writing into a buffer, then you must have checked its size already, so checking "are we at the end of the buffer yet" inside the loop would be both redundant and inefficient.

Now I can practically hear you exclaiming: "Horrors! This is the same crude logic that brought us strcpy, sprintf, and gets! This is an invitation to buffer overflows!" Well, if you were to exclaim thusly, you'd be right about the bad behavior of gets; in fact, the gets function was officially removed from the C++ standard library as of C++14. And you'd be right about sprintf: anyone who needs that functionality is better off using the range-checked version snprintf, which is analogous to a "two-range algorithm" in this context. But about strcpy I'd disagree. With gets it is impossible to know the correct size for the output buffer; with sprintf it is difficult; but with strcpy it is trivial: you just measure the strlen of the input buffer, add one for the terminating null byte, and that's your answer. Likewise with std::copy, the relationship between "input elements consumed" and "output elements produced" is exactly one-to-one, so sizing the output buffer doesn't present a technical challenge.
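The one-to-one relationship can be sketched as follows. This is only an illustration: the helper name copy_into_presized is made up for this example and is not part of the standard library. The destination is sized from the input before the copy starts, so no bounds check is needed inside the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not a standard function): copies src into a
// destination vector whose size is computed up front from the input.
std::vector<int> copy_into_presized(const std::vector<int>& src)
{
    // One element consumed means exactly one element produced,
    // so src.size() is precisely the room the output needs.
    std::vector<int> dest(src.size());

    // std::copy returns the iterator one past the last element written.
    auto end_of_output = std::copy(src.begin(), src.end(), dest.begin());

    // Because dest was sized to match, the copy stops exactly at dest.end().
    assert(end_of_output == dest.end());
    return dest;
}
```

Calling copy_into_presized on a vector holding 1, 2, 3 returns a vector holding the same three elements; an empty input yields an empty output.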

Notice that the parameter we called destination is an output iterator. This means that we can use std::copy, not merely to shunt data around in memory, but even to feed data to an arbitrary "sink" function. For example:

    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <boost/iterator/iterator_facade.hpp>

    class putc_iterator : public boost::iterator_facade<
        putc_iterator,            // Derived (the CRTP parameter)
        const putc_iterator,      // Value
        std::output_iterator_tag  // iterator category
        >
    {
        friend class boost::iterator_core_access;

        // Dereferencing yields the iterator itself, which acts as the proxy.
        auto& dereference() const { return *this; }
        void increment() {}
        bool equal(const putc_iterator&) const { return false; }
    public:
        // This iterator is its own proxy object!
        void operator= (char ch) const { putc(ch, stdout); }
    };

    void test()
    {
        std::string s = "hello";
        std::copy(s.begin(), s.end(), putc_iterator{});
    }

You may find it instructive to compare this version of our putc_iterator to the version from Chapter 2, Iterators and Ranges. This version uses boost::iterator_facade, as introduced at the end of that chapter, and also uses a common trick: operator* returns *this instead of a new proxy object.
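The same "sink" idea can be tried without Boost, since the standard library ships its own stream-writing output iterator, std::ostream_iterator, in the <iterator> header. The sketch below is only an illustration; the helper name copy_to_stream is an assumption made for this example. It copies a string into an in-memory stream so that the result can be inspected.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper (not a standard function): feeds each character of s
// to an output stream via std::copy and returns what the stream received.
std::string copy_to_stream(const std::string& s)
{
    std::ostringstream sink;

    // Each assignment through the iterator performs `sink << ch`.
    std::copy(s.begin(), s.end(), std::ostream_iterator<char>(sink));

    // The sink received exactly one character per input element.
    assert(sink.str().size() == s.size());
    return sink.str();
}
```

Replacing sink with std::cout would send the characters to standard output, much as putc_iterator does with putc.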

Now we can use the flexibility of destination to solve our concerns about buffer overflow! Suppose that, instead of writing into a fixed-size array, we were to write into a resizable std::vector (see Chapter 4, The Container Zoo). Then "writing an element" corresponds to "pushing an element back" on the vector. So we could write an output iterator very similar to putc_iterator, that would push_back instead of putc, and then we'd have an overflow-proof way of filling up a vector. Indeed, the standard library provides just such an output iterator, in the <iterator> header:

    namespace std {
        template<class Container>
        class back_insert_iterator {
            using CtrValueType = typename Container::value_type;
            Container *c;
        public:
            using iterator_category = output_iterator_tag;
            using difference_type = void;
            using value_type = void;
            using pointer = void;
            using reference = void;

            explicit back_insert_iterator(Container& ctr) : c(&ctr) {}

            auto& operator*() { return *this; }
            auto& operator++() { return *this; }
            auto& operator++(int) { return *this; }

            auto& operator= (const CtrValueType& v) {
                c->push_back(v);
                return *this;
            }
            auto& operator= (CtrValueType&& v) {
                c->push_back(std::move(v));
                return *this;
            }
        };

        template<class Container>
        auto back_inserter(Container& c)
        {
            return back_insert_iterator<Container>(c);
        }
    }

    void test()
    {
        std::string s = "hello";
        std::vector<char> dest;
        std::copy(s.begin(), s.end(), std::back_inserter(dest));
        assert(dest.size() == 5);
    }

The function call std::back_inserter(dest) simply returns a back_insert_iterator object. In C++17, we could rely on class template argument deduction and write the body of that function as simply return std::back_insert_iterator(c); or dispense with the function entirely and just write std::back_insert_iterator(dest) directly in our code, where C++14 code would have to "make do" with std::back_inserter(dest). However, why would we want all that extra typing? The name back_inserter was deliberately chosen to be easy to remember, since it's the one that we were expected to use most often. Although C++17 allows us to write std::pair in place of std::make_pair, and std::tuple in place of std::make_tuple, it would be silly to write the cumbersome std::back_insert_iterator in place of std::back_inserter. You should prefer std::back_inserter(dest) even in C++17.
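The two spellings can be compared side by side in a short sketch. It assumes a C++17 compiler, because the second call relies on deducing the template argument from the constructor; the helper name append_both_ways is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not a standard function): appends the characters
// of s to a vector twice, once with each spelling of the iterator.
std::vector<char> append_both_ways(const std::string& s)
{
    std::vector<char> dest;

    // The helper function: works in C++14 and C++17 alike.
    std::copy(s.begin(), s.end(), std::back_inserter(dest));

    // C++17 only: the class template argument is deduced from dest.
    std::copy(s.begin(), s.end(), std::back_insert_iterator(dest));

    // Both calls pushed back one element per input character.
    assert(dest.size() == 2 * s.size());
    return dest;
}
```

Both calls grow the vector as needed, so neither can overflow; the only difference is the amount of typing.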
