Shunting data with std::copy

We've just seen our first few two-range algorithms. The <algorithm> header is full of two-range algorithms and their siblings, the one-and-a-half-range algorithms. What's the simplest possible such algorithm?

A reasonable answer would be: "Copy each data element from the first range into the second range." Indeed, the STL provides that algorithm, under the name std::copy:

    template<class InIt, class OutIt>
    OutIt copy(InIt first1, InIt last1, OutIt destination)
    {
        while (first1 != last1) {
            *destination = *first1;
            ++first1;
            ++destination;
        }
        return destination;
    }

Notice that this is a one-and-a-half-range algorithm. The standard library does not provide a two-range version of std::copy; the assumption is that if you are writing into a buffer, then you must have checked its size already, so checking "are we at the end of the buffer yet" inside the loop would be both redundant and inefficient.
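To make the contrast concrete, here is a rough sketch of what a two-range copy might look like. The name copy_checked and its return type are hypothetical; nothing like it exists in <algorithm>. It is shown only to illustrate the extra comparison that the one-and-a-half-range std::copy avoids on every iteration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of the standard library.
// Stops as soon as either the input or the output range is exhausted,
// and reports how far it got in each.
template<class InIt, class OutIt>
std::pair<InIt, OutIt> copy_checked(InIt first1, InIt last1,
                                    OutIt first2, OutIt last2)
{
    while (first1 != last1 && first2 != last2) {  // the extra check
        *first2 = *first1;
        ++first1;
        ++first2;
    }
    return {first1, first2};
}

bool copy_checked_demo()
{
    std::vector<int> src = {1, 2, 3, 4, 5};
    std::vector<int> dst(3);  // deliberately too small
    auto result = copy_checked(src.begin(), src.end(),
                               dst.begin(), dst.end());
    // Only three elements fit; the input iterator stops at the fourth.
    return result.first == src.begin() + 3
        && result.second == dst.end()
        && dst == std::vector<int>{1, 2, 3};
}
```

Truncating silently is only one possible design; a checked copy could just as reasonably throw or assert when the destination runs out.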

Now I can practically hear you exclaiming: "Horrors! This is the same crude logic that brought us strcpy, sprintf, and gets! This is an invitation to buffer overflows!" Well, if you were to exclaim thus, you'd be right about the bad behavior of gets--in fact, the gets function was deprecated in C++11 and removed from the C++ standard library in C++14. And you'd be right about sprintf--anyone who needs that functionality is better off using the range-checked version snprintf, which is analogous to a "two-range algorithm" in this context. But about strcpy I'd disagree. With gets it is impossible to know the correct size for the output buffer; with sprintf it is difficult; but with strcpy it is trivial: you just measure the strlen of the input buffer (plus one for the terminating null byte) and that's your answer. Likewise with std::copy, the relationship between "input elements consumed" and "output elements produced" is exactly one-to-one, so sizing the output buffer doesn't present a technical challenge.
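As a minimal sketch of that one-to-one relationship: the destination below is sized from the source before the copy begins, so no check is needed inside the loop. The function name copy_presized is made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper for illustration only.
std::vector<char> copy_presized(const std::string& src)
{
    // std::copy writes exactly one output element per input element,
    // so src.size() is precisely the room we need.
    std::vector<char> dest(src.size());
    auto end = std::copy(src.begin(), src.end(), dest.begin());
    assert(end == dest.end());  // the returned iterator is one past the last write
    return dest;
}
```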

Notice that the parameter we called destination is an output iterator. This means that we can use std::copy, not merely to shunt data around in memory, but even to feed data to an arbitrary "sink" function. For example:

    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <boost/iterator/iterator_facade.hpp>

    class putc_iterator : public boost::iterator_facade<
        putc_iterator,        // Derived
        const putc_iterator,  // Value
        std::output_iterator_tag
    >
    {
        friend class boost::iterator_core_access;

        auto& dereference() const { return *this; }
        void increment() {}
        bool equal(const putc_iterator&) const { return false; }
    public:
        // This iterator is its own proxy object!
        void operator= (char ch) const { putc(ch, stdout); }
    };

    void test()
    {
        std::string s = "hello";
        std::copy(s.begin(), s.end(), putc_iterator{});
    }

You may find it instructive to compare this version of our putc_iterator to the version from Chapter 2, Iterators and Ranges. This version uses boost::iterator_facade, introduced at the end of that chapter, and also uses a common trick: dereferencing returns *this instead of a new proxy object.
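If Boost is not available, the same "sink" idea can be tried with an output iterator that the standard library already ships: std::ostream_iterator, from the <iterator> header. The sketch below is a swapped-in alternative to putc_iterator, not a replacement for it; the function names are invented for the example, and it writes into a std::ostringstream only so that the result is easy to inspect.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copies each character of s into a stream "sink".
std::string copy_to_stream(const std::string& s)
{
    std::ostringstream out;
    // Each assignment through the iterator performs `out << ch`.
    std::copy(s.begin(), s.end(), std::ostream_iterator<char>(out));
    return out.str();
}

void test_copy_to_stream()
{
    assert(copy_to_stream("hello") == "hello");
}
```

Passing std::cout in place of the string stream would send the characters to standard output, much as putc_iterator does.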

Now we can use the flexibility of destination to solve our concerns about buffer overflow! Suppose that, instead of writing into a fixed-size array, we were to write into a resizable std::vector (see Chapter 4, The Container Zoo). Then "writing an element" corresponds to "pushing an element back" on the vector. So we could write an output iterator very similar to putc_iterator that would call push_back instead of putc, and then we'd have an overflow-proof way of filling up a vector. Indeed, the standard library provides just such an output iterator, in the <iterator> header:

    namespace std {
        template<class Container>
        class back_insert_iterator {
            using CtrValueType = typename Container::value_type;
            Container *c;
        public:
            using iterator_category = output_iterator_tag;
            using difference_type = void;
            using value_type = void;
            using pointer = void;
            using reference = void;

            explicit back_insert_iterator(Container& ctr) : c(&ctr) {}

            auto& operator*() { return *this; }
            auto& operator++() { return *this; }
            auto& operator++(int) { return *this; }

            auto& operator= (const CtrValueType& v) {
                c->push_back(v);
                return *this;
            }
            auto& operator= (CtrValueType&& v) {
                c->push_back(std::move(v));
                return *this;
            }
        };

        template<class Container>
        auto back_inserter(Container& c)
        {
            return back_insert_iterator<Container>(c);
        }
    }

    void test()
    {
        std::string s = "hello";
        std::vector<char> dest;
        std::copy(s.begin(), s.end(), std::back_inserter(dest));
        assert(dest.size() == 5);
    }

The function call std::back_inserter(dest) simply returns a back_insert_iterator object. In C++17, we could rely on class template argument deduction and write the body of that function as simply return std::back_insert_iterator(dest); or we could dispense with the function entirely and write std::back_insert_iterator(dest) directly in our code, where C++14 code would have to "make do" with std::back_inserter(dest). But why would we want all that extra typing? The name back_inserter was deliberately chosen to be easy to remember, since it's the one we were expected to use most often. Although C++17 allows us to write std::pair in place of std::make_pair, and std::tuple in place of std::make_tuple, it would be silly to write the cumbersome std::back_insert_iterator in place of std::back_inserter. You should prefer std::back_inserter(dest) even in C++17.
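The two spellings can be compared side by side. The following sketch assumes a C++17 compiler (it relies on class template argument deduction and std::is_same_v); the function name is invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical demo function: returns true if both spellings fill
// their vectors identically.
bool both_spellings_agree()
{
    std::string s = "hello";
    std::vector<char> a, b;

    // The helper function, available since C++98.
    std::copy(s.begin(), s.end(), std::back_inserter(a));

    // C++17 only: the class template's argument is deduced from b.
    std::copy(s.begin(), s.end(), std::back_insert_iterator(b));

    // Both expressions have exactly the same type.
    static_assert(std::is_same_v<
        decltype(std::back_inserter(a)),
        decltype(std::back_insert_iterator(b))>);

    return a == b && a.size() == 5;
}
```

The results are the same; the longer spelling buys nothing except extra keystrokes.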
