STL From Scratch - Utilities and Integer Sequences

update on Sept. 29: some small changes to code organization

If this is your first time reading the series you should first read the introduction. Otherwise, enjoy!

Looking at the plan of action for the first post, which was originally supposed to be titled "pairs, tuples, and integer sequences", I decided that it would be best to start of fast with an implementation of the <utility> header, sans pairs, that way we can also see how move, forward, swap, and exchange are implemented. This is mostly an excuse to publish the first post as quickly as possible, but it also has a few benefits. For one, we'll already have index_sequence implemented for next time when we'll need it for our implementation of tuples. It also has the added benefit of allowing me to get the testing infrastructure up and running. If you take a look at the repository you will see that I have added integration for Travis-CI builds, a Makefile, and some other useful files for testing our implementation.

Alright, now on to the content. So as not to conflict with namespace std, we will be placing all of our code in namespace stl. Just remember that we'll be using code from the <type_traits> header. First up is move/move_if_noexcept and forward.

The move operation converts lvalues to rvalues, which is to say it merely casts its argument to an rvalue reference (technically speaking it converts its argument to an xvalue expression, see the value categories article at cppreference for details on what an xvalue expression is). Why would we want to do this? One common reason is to avoid unnecessary (and potentially expensive) copies of data structures by moving the resources from one owner to another. The upshot is that by using move we can signal to a constructor or method that they are free to use the resources however they would like, which has the potential to be more efficient. On the other hand, as the callers we have no guarantee about the ultimate state of our resource after the call.

(As a side note, prior to guaranteed return-value optimization (RVO), moving out of functions was a potential optimization, but with RVO this is mostly unnecessary.)

The quintessential example is of std::vector. Let's say we have a class that holds a std::vector <int> data mebmer. Since we would like to own this resource, in construction of our class we need to take hold of the contents of a std::vector <int> in one way or another. The first way to do so may be by copy.

class resource
{
    // ...
    std::vector <int> v;

public:
    // Here the contents of v_copy is copied into v, which
    // may be expensive if v_copy is large.
    resource (std::vector <int> const & v_copy)
        : v (v_copy)
    {}

    //...
};

Alternatively, since resource owns a vector anyways, consumers of our class might wish to release ownership of a std::vector <int>, transfering its contents, as part of a resource object's construction. The way we can make this possible is by providing a constructor for resource that takes an rvalue-reference to a std::vector <int>. Now, from a consumer's perspective.

std::vector <int> copy_arg;
std::vector <int> move_arg;
// ... do things with both
// now construct resources

// this copies the contents as the first constructor
// is selected by overload resolution
resource r_from_copy (copy_arg);

// by casting to an rvalue reference using move, this
// moves the contents as the second constructor is selected
// by overload resolution
resource r_from_move (std::move (move_arg));

Strictly speaking this is not enough. In our constructor below the parameter v_move can bind to another rvalue reference (as used above by the consumer of resource) or to a temporary object of std::vector <int>, but it itself is a named value with ownership of its contents. Therefore, we must move again in order to transfer the contents to our data member v.

class resource
{
    // ...
    std::vector <int> v;

public:
    // Here v_copy is again copied into v
    resource (std::vector <int> const & v_copy)
        : v (v_copy)
    {}

    // The alternative is to provide a constructor taking
    // an rvalue-reference to a std::vector <int>.
    //
    // In this case the construction is inexpensive (just
    // a few pointer swaps). But notice the call to move.
    // v_move owns its contents, so we have to move again.
    resource (std::vector <int> && v_move)
        : v (std::move (v_move))
    {}

    //...
};

That's more or less it for move, as most use cases follow the above pattern. Now here's the code for move itself.

template <class T>
constexpr typename std::remove_reference <T>::type &&
    move (T && t) noexcept
{
    return static_cast <
        typename std::remove_reference <T>::type &&
    > (t);
}

The move_if_noexcept operation converts lvalues to rvalues, if it is the case that T is nothrow move constructible, and otherwise converts lvalues to lvalue references to const. There is one technicality, however, we convert to an lvalue reference to const only in the case that T is also copy constructible.

template <class T>
constexpr typename std::conditional <
    !std::is_nothrow_move_constructible <T>::value &&
    std::is_copy_constructible <T>::value,
    T const &,
    T &&
>::type move_if_noexcept (T & t) noexcept
{
    using forward_type = typename std::conditional <
        !std::is_nothrow_move_constructible <T>::value &&
        std::is_copy_constructible <T>::value,
        T const &,
        T &&
    >::type;

    return static_cast <forward_type> (t);
}

forward is similar to move, except that it conditionally converts lvalues to rvalues, depending on the type of the provided template parameter T. To understand how forward works we'll also have to know what reference collapsing and forwarding references are. For in-depth explanations of these you should read this article by Scott Meyers and perhaps even take a look at this document. The important thing to know is that forward always returns a value in the same value category as its argument.

template <class T>
constexpr T && forward (typename std::remove_reference <T>::type & t)
    noexcept
{
    /*
     * What to do with an lvalue? We want to return in the same value
     * category, so forward `t` as an lvalue. Notice the difference between
     * this and the implementation of move. By writting `T &&` explicitly
     * the expression undergoes reference collapsing.
     */
    return static_cast <T &&> (t);
}

template <class T>
constexpr
typename std::enable_if <!std::is_lvalue_reference <T>::value, T &&>::type
    forward (typename std::remove_reference <T>::type && t) noexcept
{
    /*
     * What to do with an rvalue? We want to return in the same value
     * category, so forward `t` as an rvalue. Notice the difference between
     * this and the implementation of move. By writting `T &&` explicitly
     * the expression undergoes reference collapsing.
     *
     * There is one caveat: we cannot forward an rvalue as an lvalue,
     * hence the enable_if in the declaration. If we try to forward an
     * rvalue as an lvalue this will fail to compile.
     */
    return static_cast <T &&> (t);
}

There are two generic flavors of swap that we'll implement (many more will be implemented later as specializations). The first swaps single values, and the second swaps arrays of values. Both implementations are staightforward -- we simply switch the values in our two arguments using move's. You will also notice the use of stl::declval, the definition of which you can find in the repository.

update on Sept. 29: previously, swap on arrays incorrectly used stl:: qualified swap implementation; now changed to use unqualified version allowing for ADL.

update on Sept. 30: thank you to /u/cpp_learner in the reddit discussion for bringing up the issue of the noexcept specification of the array-overload of swap. You can read about the issues and resolution (accepted into C++17) here and here.

template <class T>
void swap (T & a, T & b)
    noexcept (std::is_nothrow_move_constructible <T>::value &&
              std::is_nothrow_move_assignable <T>::value)
{
    T temp {stl::move (a)};
    a = stl::move (b);
    b = stl::move (temp);
}

template <class T, std::size_t N>
void swap (T (&a_array) [N], T (&b_array) [N])
    noexcept (noexcept (
        swap (stl::declval <T &> (), stl::declval <T &> ())
    ))
{
    for (std::size_t i = 0; i < N; ++i) {
        swap (a_array [i], b_array [i]);
    }
}

Our last function is exchange. This one is similar to swap, except that it returns the value held previously in its first argument. Again, its implementation is straightforward.

template <class T, class U = T>
T exchange (T & a, U && val)
{
    T ret {stl::move (a)};
    a = stl::forward <U> (val);
    return ret;
}

Alright, we're alomst done with this post. All we have left to implement is integer_sequence and index_sequence. The later is a special case of the former where the value type is std::size_t. Later on we will see that index_sequences are invaluable in template programming when we wish to use parameter packs in a structured way. If you're interested in seeing a use-case right now, this cppreference page contains a good motivating example at the bottom.

For clarity, our implementation of the sequence generation is going to use the naive O(N) recursive generation technique, but in practice many implementations use a split-and-merge technique that has an O(log N) runtime. We'll place the sequence generation code in the file stl/bits/sequence_generator.hpp and have it wrapped in namespace bits (contained within namespace stl) so that we don't pollute the public API. You can see the full organization of the code repository.

namespace stl
{
    /* forward declaration of integer sequence */
    template <class T, T ... Is>
    class integer_sequence;

namespace bits
{
    template <class T, std::size_t N, T ... Is>
    struct sequence_generator
    {
        using type = typename sequence_generator <
            T, N - 1, T {N - 1}, Is...
        >::type;
    };

    template <class T, T ... Is>
    struct sequence_generator <T, 0, Is...>
        : integer_sequence <T, Is...>
    {};

    template <class T, T N>
    struct sequence_generator_helper
    {
        static_assert (
            N >= 0,
            "cannot produce sequence of negative length"
        );

        using type = typename sequence_generator <
            T, std::size_t {N}
        >::type;
    };
}   // namespace bits
}   // namespace stl

And now back in the file stl/utility.hpp we'll write the implementations of integer_sequence and index_sequence.

template <class T, T ... Is>
class integer_sequence
{
public:
    using type = integer_sequence;
    using value_type = T;

    static constexpr std::size_t size (void) noexcept
    {
        return sizeof... (Is);
    }
};

template <std::size_t ... Is>
using index_sequence = integer_sequence <std::size_t, Is...>;

template <typename T, T N>
using make_integer_sequence =
    typename bits::sequence_generator_helper <T, N>::type;

template <std::size_t N>
using make_index_sequence = make_integer_sequence <std::size_t, N>;

That's it! We've implemented the entire <utility> header (except for pairs, which we'll implement next time along with tuples). You can find all the code from this post in the repo here.