Baduit

A young French developer who really likes (modern) C++

12 August 2023

Improve the manipulation of a string at compile time with C++20

by Baduit

Article::Article

In my last article I showed you how to generate a string during compilation. In my code I used a std::array<char, Size> to return the string, but honestly I must admit that a std::array is not as convenient as a real string class.

That’s why I will show you how to create a simple compile-time string class.

The layout

The only thing I need to store is an array of char with a constant size, so let’s write it:

#include <cstddef>

template <std::size_t ArraySize>
struct CompileTimeString
{
    char data[ArraySize];
};

I could have used a std::array, but honestly in this case it wouldn’t really have changed anything.

Construct it easily

Writing this is not practical:

// 4 because the \0 is stored
CompileTimeString<4> str = {"lol"}; // braces are needed: the struct is still a plain aggregate

It’s possible to write something a bit better like this:

constexpr CompileTimeString str {"lol"};
constexpr CompileTimeString str2  = {"lol"};

What if I want to write this:

constexpr CompileTimeString str = "lol";

Then I must add this constructor:

// It takes an array of chars in parameter
constexpr CompileTimeString(const char(&literal)[ArraySize]) noexcept
{
    // Then it copies the content
    std::ranges::copy(literal, data);
}

It’s possible to do even better thanks to user-defined literals.

If you don’t know what they are, I suggest reading one of these articles: this one in French or this one in English.

To explain it in one sentence, I will be able to write:

// hi's type is CompileTimeString<14>
constexpr auto hi = "Hello, World!"_cts;

How?
Like this:

template<CompileTimeString Str>
constexpr auto operator""_cts()
{
    return Str;
}

With C++20 it is possible to define a user-defined literal operator whose template parameter is a class type that can be constructed from an array of char or a const char*, which is exactly what the constructor we wrote just before provides. Amazing, isn’t it?

Really basic iterator stuff

Now that we can construct it, it would be nice to be able to use some algorithms on it, or to use it as a range like any normal container.

So we need to implement member functions like begin() and end(). Don’t worry, it will be easy: because the standard library works well with built-in arrays, it is possible to just do this:

constexpr auto begin() noexcept
{
    return std::begin(data); 
}

constexpr auto end() noexcept
{
    return std::end(data);
}

It can also be done for const iterators and reverse iterators.

Conversion to std::string_view

A lot of functions in existing code use std::string_view to handle non-owning views of strings. It is simple to use and can be constructed from a std::string or a const char*, for example, which makes it really practical.

Creating a function to be able to construct a std::string_view from a CompileTimeString is really easy:

constexpr std::string_view as_string_view() const
{
    // Remove 1 because the \0 is not counted in the size of a std::string_view
    return std::string_view(data, ArraySize - 1);
}

There’s just one issue: it’s really easy to shoot yourself in the foot with this. Let me explain. std::string_view is a non-owning view, meaning that it points to something, and the thing it points to must still be valid whenever the view is used. If the view is created from a temporary, it’s easy to use the view after the lifetime of the original string has ended, creating bugs and undefined behavior.

// This one is OK: the temporary std::string is only destroyed at the end of the full expression, after it has been printed
std::cout << std::string_view(std::string("hi")) << std::endl;

// This one is not ok
std::string_view str = std::string("not good");
// Here the temporary std::string is already destroyed
std::cout << str << std::endl;

I don’t really see a valid use case for creating a std::string_view from a temporary CompileTimeString, so I will forbid it.

In C++, you can have different overloads of a member function depending on whether the object it is called on is an rvalue (a temporary, for example, or a value passed through std::move) or an lvalue.

It’s used in the standard library for std::string::substr() since C++23: called on an lvalue, it creates a new std::string; called on an rvalue, it can reuse the storage of the current string instead of allocating a new one.

With an lvalue, substr is equivalent to:

return basic_string(*this, pos, count);

And with an rvalue it is equivalent to:

return basic_string(std::move(*this), pos, count);

To do it, you just need to add a ref-qualifier after the parameter list: & for an lvalue and && for an rvalue.
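
To make the selection rule concrete, here is a tiny sketch with a hypothetical `Probe` type (not part of the string class) whose two overloads report which one was picked:

```cpp
#include <utility>

// Hypothetical type, used only to show which overload is selected
struct Probe
{
    constexpr int which() const & { return 1; } // chosen for lvalues
    constexpr int which() && { return 2; }      // chosen for non-const rvalues
};

constexpr Probe probe{};

// A named object is an lvalue
static_assert(probe.which() == 1);

// A temporary is an rvalue
static_assert(Probe{}.which() == 2);

// std::move turns a named object into an rvalue
constexpr int which_after_move()
{
    Probe p{};
    return std::move(p).which();
}
static_assert(which_after_move() == 2);
```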

In my case, I will just write this:

constexpr std::string_view as_string_view() const &
{
	return std::string_view(data, ArraySize - 1);
}

constexpr std::string_view as_string_view() const && = delete;

Now that’s done, I will write a conversion operator:

constexpr operator std::string_view() const &
{
    return as_string_view();
}

constexpr operator std::string_view() const && = delete;

Yes, I know, the syntax feels a bit weird because there is no separate return type: the name of the function is the return type.

I could have marked it explicit, and I think that most of the time an explicit conversion is better than an implicit one. In this case, though, the implicit conversion of strings to string views is pretty common and expected, and it is already the behavior of std::string, so at least it is consistent.

Add some basic member functions

String classes have a lot of utility member functions, and recoding them all would be redundant. As we can create a std::string_view from our class, let’s use it to make these member functions really simple to implement.

constexpr bool starts_with(std::string_view sv) const noexcept
{
	return as_string_view().starts_with(sv);
}

constexpr bool starts_with(char ch) const noexcept
{
	return as_string_view().starts_with(ch);
}

constexpr bool starts_with(const char* s) const
{
	return as_string_view().starts_with(s);
}

It’s not the most exciting task but at least there’s little room for error.

Concatenate compile time string with operator+

Concatenating strings is really practical and we do it a lot. std::string uses operator+ and I can do the same here; it is just a simple operator overload:

template <std::size_t OtherSize>
constexpr auto operator+(const CompileTimeString<OtherSize>& other) const noexcept
{
	// Combine both sizes to get the size of the array of the new string
	// (this line requires CompileTimeString to be default constructible)
	CompileTimeString<ArraySize + OtherSize - 1> result; // -1 because one of the two \0 will be overwritten
	std::ranges::copy(data, result.data); // Copy the first string, including its \0
	std::ranges::copy(other.data, result.data + ArraySize - 1); // Copy the second one over that \0
	return result;
}

// The comparison below relies on an operator== that is not shown in this article
constexpr auto concat = "Hello, "_cts + "World!"_cts;
static_assert(concat == "Hello, World!"_cts);

You can find the full code on Compiler Explorer here.

Article::~Article

This string class is not totally complete. I could add a lot of features to it, but it would only be grinding out features and I don’t plan to make a library of it. There’s a gist here if you want to use or complete the code. With this you have a strong base to add member functions like substr, or to template the char type like std::basic_string and std::basic_string_view do.


tags: cpp - cpp20