Fixing optional’s Constructors and Comparison Operators
What should happen when you construct an optional<T> from a value of type U?
This is more tricky than it may seem. The standard got it wrong initially, and
their fix is still slightly wrong! Barry Revzin provides an excellent rundown
on this topic in his post
“Getting in trouble with mixed construction”.
I highly recommend reading it before continuing on.
Desired Behavior
If both T and U are not optionals, the result is probably what you would
expect:
uint16_t value = 42;
optional<uint32_t> opt = value;
Mapping each source value to the corresponding target value gives this diagram:
block-beta
columns 6
o1["u16:"]:1
block:group2:5
space
o1v0["0"]
o1v1["1"]
o1v3["..."]
o1v5["65535"]
space
space
space
end
o2["opt<u32>:"]:1
block:group3:5
o2n00["∅"]
o2v00["0"]
o2v01["1"]
o2v03["..."]
o2v05["65535"]
o2v06["65536"]
o2v07["..."]
o2v08["2<sup>32</sup>-1"]
end
o1v0-->o2v00
o1v1-->o2v01
o1v3-->o2v03
o1v5-->o2v05
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v3 fill:#fff0,stroke:#fff0
style o2v03 fill:#fff0,stroke:#fff0
style o2v07 fill:#fff0,stroke:#fff0
This conversion can be done implicitly (i.e. with the = sign), as it is
lossless and nothing unexpected happens.
One key observation is that a conversion of this shape can always be done if the
number of “optional layers” of the target type is bigger than that of the
source type.
For example:
optional<uint16_t> value = 42;
optional<optional<optional<uint32_t>>> opt = value;
Visualized:
block-beta
columns 6
o1["opt<u16>:"]:1
block:group2:5
space
space
o1n0["∅"]
o1v0["0"]
o1v3["..."]
o1v5["65535"]
space
space
end
o2["opt<opt<opt<u32>>>:"]:1
block:group3:5
o2n02["∅<sub>2</sub>"]
o2n01["∅<sub>1</sub>"]
o2n00["∅<sub>0</sub>"]
o2v00["0"]
o2v03["..."]
o2v05["65535"]
o2v07["..."]
o2v08["2<sup>32</sup>-1"]
end
o1n0-->o2n00
o1v0-->o2v00
o1v3-->o2v03
o1v5-->o2v05
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v3 fill:#fff0,stroke:#fff0
style o2v03 fill:#fff0,stroke:#fff0
style o2v07 fill:#fff0,stroke:#fff0
Viewing it like this, making the “base type” wider (in this example using a
uint64_t instead of a uint32_t) adds new possible values on the right,
while adding another “optional layer” adds one additional “none value” on the
left (naming them “∅0”, “∅1”, “∅2”, and so
on).
There should be no surprise when extending this mental model for the case when
the number of optional layers is the same:
optional<uint16_t> value = 42;
optional<uint32_t> opt = value;
block-beta
columns 6
o1["opt<u16>:"]:1
block:group2:5
o1n0["∅"]
o1v0["0"]
o1v1["1"]
o1v3["..."]
o1v5["65535"]
space
space
space
end
o2["opt<u32>:"]:1
block:group3:5
o2n00["∅"]
o2v00["0"]
o2v01["1"]
o2v03["..."]
o2v05["65535"]
o2v06["65536"]
o2v07["..."]
o2v08["2<sup>32</sup>-1"]
end
o1n0-->o2n00
o1v0-->o2v00
o1v1-->o2v01
o1v3-->o2v03
o1v5-->o2v05
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v3 fill:#fff0,stroke:#fff0
style o2v03 fill:#fff0,stroke:#fff0
style o2v07 fill:#fff0,stroke:#fff0
It should even work when the target base type is “narrower” than the source base type:
optional<uint16_t> value = 42;
optional<bool> opt{value};
We used braces to make it clear to the reader that something “unusual” happened
here: the narrowing conversion from uint16_t to bool. Still, the “none”
value of the source type should definitely map to the “none” value of the
target!
block-beta
columns 6
o1["opt<u16>:"]:1
block:group3:5
o1n00["∅"]
o1v00["0"]
o1v01["1"]
o1v02["2"]
o1v03["3"]
o1v04["..."]
o1v05["65534"]
o1v06["65535"]
end
o2["opt<bool>:"]:1
block:group2:5
o2n0["∅"]
o2v0["false"]
o2v1["true"]
space
space
space
space
space
end
o1n00-->o2n0
o1v00-->o2v0
o1v01-->o2v1
o1v02-->o2v1
o1v03-->o2v1
o1v04-->o2v1
o1v05-->o2v1
o1v06-->o2v1
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v04 fill:#fff0,stroke:#fff0
What the Standard Gets Wrong
The above should also work if the target base type is std::any. std::any is
a value type, containing (std::type_info, “value of type T”) pairs for any
copyable type T as its values 1.
block-beta
columns 6
o1["opt<u16>:"]:1
block:group3:5
o1n00["∅"]
o1v00["0"]
o1v01["1"]
o1v02["2"]
o1v03["..."]
o1v04["65534"]
o1v05["65535"]
space
end
o2["opt<any>:"]:1
block:group2:5
o2n0["∅"]
o2v0["u16,\n0"]
o2v1["u16,\n1"]
o2v2["u16,\n2"]
o2v3["..."]
o2v4["u16,\n65534"]
o2v5["u16,\n65535"]
o2v6["..."]
end
o1n00-->o2n0
o1v00-->o2v0
o1v01-->o2v1
o1v02-->o2v2
o1v04-->o2v4
o1v05-->o2v5
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v03 fill:#fff0,stroke:#fff0
style o2v3 fill:#fff0,stroke:#fff0
style o2v6 fill:#fff0,stroke:#fff0
Sadly, this is not what happens in the standard. std::optional maps the values
like the following, which is inconsistent compared to the behavior of the
optional<uint16_t> → optional<uint32_t> conversion above:
block-beta
columns 6
o1["opt<u16>:"]:1
block:group3:5
space
o1n00["∅"]
o1v00["0"]
o1v01["1"]
o1v03["..."]
o1v04["65534"]
o1v05["65535"]
space
end
o2["opt<any>:"]:1
block:group2:5
o2n0["∅"]
o2v0["opt<u16>,\n∅"]
o2v1["opt<u16>,\n0"]
o2v2["opt<u16>,\n1"]
o2v4[".."]
o2v5["opt<u16>,\n65534"]
o2v6["opt<u16>,\n65535"]
o2v7["..."]
end
o1n00-->o2v0
o1v00-->o2v1
o1v01-->o2v2
o1v04-->o2v5
o1v05-->o2v6
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v03 fill:#fff0,stroke:#fff0
style o2v4 fill:#fff0,stroke:#fff0
style o2v7 fill:#fff0,stroke:#fff0
This happens because std::optional has two “kinds” of constructors, and the
logic that chooses between them is slightly broken. Those two “kinds” are:
-
Its “value” constructor, constructing an always engaged
optionalfrom a value. Simplified:template<class U> optional(const U& u) : t_(u), is_engaged_{true} { } -
Its “converting” 2 constructor, taking another
optionaland setting its own “engaged” flag accordingly. Simplified:template<class U> optional(const optional<U>& u) : is_engaged_{u.has_value()} { if (is_engaged_) { new (&t_) T(*u); } }
The default C++ overload resolution rules will prefer the second (converting)
constructor for arguments of some optional type, because it is more
specialized. This is not always what we want, though. So the standard added
a
bunch
of
rules on top, trying to nudge the compiler in the “right” direction. Sadly,
those fixes are incomplete and have weird behavior on their own.
A Possible Fix
To really get hold of the problem we can try listing all possible cases when
trying to convert a U to an optional<T>. Doing a pattern match on both U
and T leads to four cases which I list in the table below. Why four? Both T
and U can either be another optional type or not. So we
have
possible combinations.
U/T: types that can be anything, even anotheroptionalUV/TV: types that are notoptional- →: “converts to”
| case | U → opt<T> |
constructor used |
|---|---|---|
| 1 | UV → opt<TV> |
“value” |
| 2 | UV → opt<opt<T>> |
“value” |
| 3 | opt<U> → opt<TV> |
“converting” |
| 4a | opt<U> → opt<opt<T>> |
“value” when U → opt<T> is “value” |
| 4b | opt<U> → opt<opt<T>> |
“converting” when U → opt<T> is “converting” |
- case 1: This is the simplest case, where we have a non-
optionaltype trying to convert to aoptionalof a non-optionaltype. Example:int→optional<long>. This will work ifintconverts tolong(which in this example it will). - case 2: This is like case 1, just that at least one additional
“
optionallayer” is involved in the target type. This recurses to case 1 or 2. - case 3: Here, we are trying to convert from an
optional<U>to anoptional<TV>, whereTVis not anoptional, butUmight be. This is, again, (hopefully) uncontroversial: It is the “converting” case. Note that we don’t want to allow any weird conversions tobool, for exampleoptional<vector<int>>→optional<bool>. This should not be allowed to compile, just like trying to convert avector<int>to abooldoes not compile. - case 4: This is an interesting case. First, we can strip away one layer of
optionalfrom each side, giving us againU→opt<T>. We can ask the table for an answer recursively:- case 4a: If the answer is “value”, we know that the inner
Uconverts to the innerT. But we have anopt<U>, not just aU. We have two possibilities to map the “none” value of theopt<U>toopt<opt<T>>: The “outer” none value or the “inner” none value. To me, mapping it to the “inner” none value feels like the more natural choice. Taken together, we map the wholeopt<U>to the inneropt<T>, resulting in a “value” case. Example:optional<int>→optional<optional<long>>is “value”- …as
int→optional<long>gives “value” - …so we can convert
optional<int>tooptional<long>, mapping our none value to the target none value - …so
optional<int>converts to an always engagedoptional<optional<long>>
- case 4b: If the answer is “converting”, we know that the inner
Umaps toopt<T>. Thus, to map our “none” value, there is just one possibility left: The “outer” none value of theopt<opt<T>>. This results in a “converting” case. Example:optional<optional<int>>→optional<optional<long>>is “converting”- …as
optional<int>→optional<long>is “converting” - …so we can add an
optionallayer to both sides, and the answer stays “converting”
- case 4a: If the answer is “value”, we know that the inner
Looking at the table, we can even recognize our earlier intuition that the
“value” case is equivalent to the case where the number of “optional layers”
of the target is greater than the source:
| case | U → opt<T> |
constructor used | number of optional layers (target vs. source) |
|---|---|---|---|
| 1 | UV → opt<TV> |
“value” | > |
| 2 | UV → opt<opt<T>> |
“value” | > |
| 3 | opt<U> → opt<TV> |
“converting” | ≤ |
| 4a | opt<U> → opt<opt<T>> |
“value” | > |
| 4b | opt<U> → opt<opt<T>> |
“converting” | ≤ |
Cases 1, 2 and 3 are the base cases, and both 4a and 4b just add an optional
layer to each side, so the relation between the number of optional layers
stays the same.
So we can simplify the whole thing and just ask: “Is the number of optional
layers of the target type greater than that of the source type?” If yes, choose
the “value” constructor.
I think this is a very simple and intuitive rule, but it also means that we
cannot simply use C++’s normal overload resolution rules for our constructors.
We have to take the number of optional layers of our T into account as
well.
Inside our optional<T>, the check for the “value”/”converting” constructor
could look like this:
template <class U>
static constexpr bool use_value_constructor =
count_optional_layers<T>() >= count_optional_layers<U>();
This corresponds to cases 1, 2 and 4a of the table. I use >= as this logic
is inside a optional<T>, so one optional layer is already taken into
account.
All together, our two constructors look like this (simplified a bit). Both constructors take a forwarding reference as their argument, taking the problematic overload resolution rule out of the picture, and letting us “program” the rules ourselves:
//
// "value" constructor
//
template <class U>
requires(use_value_constructor<U> && std::constructible_from<T, U>)
constexpr optional(U&& u)
: /* Construct engaged `optional` from `u`. */;
//
// "converting" constructor
//
template <class U, class UV = optional_value_type_t<U>>
requires(!use_value_constructor<U> && std::constructible_from<T, UV>)
constexpr optional(U&& u) //
: /* ... */
if (!u.has_value()) {
/* Construct disengaged `optional`. */
} else {
/* Construct engaged `optional` from `*u`. */
}
/* ... */;
Those two constructors will always do the (in my opinion) sensible thing. Even
if this might invoke optional’s explicit operator bool() in a case where
the number of “optional layers” is reduced:
optional<optional<uint32_t>> value = 42;
optional<bool> opt{value}; // note the braces, signalling an explicit conversion!
Visualized:
block-beta
columns 6
o1["opt<opt<u32>>:"]:1
block:group3:5
o1n01["∅<sub>1</sub>"]
o1n00["∅<sub>0</sub>"]
o1v00["0"]
o1v01["1"]
o1v02["2"]
o1v03["..."]
o1v07["2<sup>32</sup>-2"]
o1v08["2<sup>32</sup>-1"]
end
o2["opt<bool>:"]:1
block:group2:5
o2n0["∅"]
o2v0["false"]
o2v1["true"]
space
space
space
space
space
end
o1n01-->o2n0
o1n00-->o2v0
o1v00-->o2v1
o1v01-->o2v1
o1v02-->o2v1
o1v03-->o2v1
o1v07-->o2v1
o1v08-->o2v1
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v03 fill:#fff0,stroke:#fff0
This might seem weird at first, but it is just a consequence of optional
having an explicit conversion to bool, thus falling into case 3 of the table.
Mixed Comparisons
Closely related to initializing optionals there is the problem of comparing
them. Why is this related to initialization? One useful mental model of
comparing two different types in C++ is that this is valid if and only if one
type implicitly converts into the other. As implicit conversions should be
lossless, this always does what you expect.
In practice, doing the work of initializing a temporary optional would be wasteful, though. So we might want to define our own comparison operators. This works, but we must ensure that the behavior is identical to the idealized mental model above. Otherwise, unexpected things can happen.
Barry Revzin gave some very good examples in his post “Getting in trouble with mixed comparisons”.
Let’s visualize what should happen. The first non-obvious example is f5():
bool f5(optional<int> a, long b) {
return a == b;
}
Here are the types involved:
block-beta
columns 6
o1["long:"]:1
block:group2:5
space
o1v0["0"]
o1v1["1"]
o1v2["..."]
o1v4["INT_-\nMAX"]
o1v5["INT_-\nMAX+1"]
o1v6["..."]
o1v7["LONG_-\nMAX"]
end
o2["opt<int>:"]:1
block:group3:5
o2n00["∅"]
o2v00["0"]
o2v01["1"]
o2v02["..."]
o2v03["INT_-\nMAX"]
space
space
space
end
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v2 fill:#fff0,stroke:#fff0
style o1v6 fill:#fff0,stroke:#fff0
style o2v02 fill:#fff0,stroke:#fff0
Due to the extra “none” value of the optional<int>, neither type is a
superset of the other 3, therefore neither
should implicitly convert into the other! This comparison should conservatively
fail to compile. I guess one could try to implement a comparison operator that
does the equivalent of converting both types to optional<long>, but this
probably isn’t worth the trouble.
The next example is f6():
bool f6(optional<optional<int>> a, optional<int> b) {
return a == b;
}
What should f6(nullopt, nullopt) return? The result must be consistent with
the behavior of the constructors. In the following visualization, the arrows
symbolize the behavior of the constructor (optional<optional<int>>’s “value”
constructor), and the red boxes mark the nullopt state of both types.
block-beta
columns 6
o1["opt<int>:"]:1
block:group2:5
space
o1n0["∅"]
o1v0["0"]
o1v1["1"]
o1v2["2"]
o1v3["..."]
o1v4["INT_-\nMAX-1"]
o1v5["INT_-\nMAX"]
end
o2["opt<opt<int>>:"]:1
block:group3:5
o2n01["∅<sub>1</sub>"]
o2n00["∅<sub>0</sub>"]
o2v00["0"]
o2v01["1"]
o2v02["2"]
o2v03["..."]
o2v04["INT_-\nMAX-1"]
o2v05["INT_-\nMAX"]
end
style o1n0 fill:#FAA0A0
style o2n01 fill:#FAA0A0
o1n0-->o2n00
o1v0-->o2v00
o1v1-->o2v01
o1v2-->o2v02
o1v3-->o2v03
o1v4-->o2v04
o1v5-->o2v05
style o1 fill:#fff0,stroke:#fff0
style o2 fill:#fff0,stroke:#fff0
style o1v3 fill:#fff0,stroke:#fff0
style o2v03 fill:#fff0,stroke:#fff0
So the result of the comparison must be false! Only if we had designed our
constructors some other way, the result may be different.
The standard again does the wrong thing here – by trying to optimize the
comparisons, they introduced two overloads of operator==, one for
optional<U> and one for other Us. C++’s overload resolution rules kick in,
doing the wrong thing. Here is an example of a very bad thing happening:
std::optional<int> b = {};
std::optional<std::optional<int>> a = b;
assert(a == b); // This will fire :(
For a regular type like
std::optional<std::optional<int>>, this is really not acceptable. It violates
axiom (1): T a = b; assert(a==b);.
Conclusion
- There are clear and consistent rules of how to best initialize and compare
optional-like types. - Sometimes, C++’s built-in overload resolution rules are not your friend.
- When designing value types, draw diagrams to visualize the desired conversion rules, and “program” them yourself if needed.
I’m curious what you think about those rules. Do they make sense to you? Can you still find cases that behave “unexpectedly”?