ISO/IEC JTC1 SC22 WG21 TBD = TBD - 2017-02-06
Lorand Szollosi, szollosi.lorand@gmail.com or lorro@lorro.hu
Statement expressions were introduced in GCC 3 and quickly implemented, to varying extents, by
Clang, IBM, Intel, Sun, Open64 and other compilers. A statement expression is a sequence of
(zero or more) statements followed by an expression, placed between ({ and }), which yields the
same type as the last expression and may appear anywhere an expression is allowed. Furthermore,
parametric statement expressions are proposed to replace the usual macro-based usage pattern.
Part of the proposal is to allow using the control flow / continuation management functions of the
evaluating context. This means allowing return from a SE that's inside a function; allowing break
inside loops and switch statements; continue inside loops; and also the proposed coroutine keywords
once accepted. Furthermore, an optional part is to allow for named / parametric statement
expressions (NSE/PSE), which can be thought of as function-like and lambda-like SEs, respectively.
A simplified syntax is proposed that makes it convenient to pass a locally described PSE to another
NSE/PSE along with other parameter(s). This moves proposals like for...else, for..if(break),
for..if(!break), for..if constexpr (first), if(auto&& x : opt) et al. from language scope to
library scope.
While not strictly a part of the proposal, the idea of inline exception handling is also described
(throw inline, catch inline). This is to contrast inline handling with capturing the control /
continuation switch in a statement expression.
Originally, statement expressions were implemented to provide a way of defining variables within macros.
Had this been the only use case, one would recommend using template functions instead. However, several
other use cases were discovered, mainly because flow control statements are available inside
statement expressions. This allows for safe custom control structures in a library (rather than in the
language), e.g. (note that a is not necessarily bool; we only assume it is convertible to bool):
#define return_if_false(a) ({ auto tmp = (a); if (!tmp) return std::move(tmp); std::move(tmp); })
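As a rough sketch of how such a macro reads at the call site, the following should build today with the GNU extension in GCC or Clang (it is not ISO C++). The helper names parse_digit and sum_two_digits are hypothetical, and the macro is spelled in upper case only to set it apart from the original.

```cpp
#include <cassert>
#include <optional>

// GNU statement expression: evaluates (a) once; if the result is falsy, it
// returns that result from the *enclosing* function, otherwise it yields it.
#define RETURN_IF_FALSE(a) ({ auto tmp_ = (a); if (!tmp_) return tmp_; tmp_; })

// Hypothetical helper: converts one decimal digit, empty optional otherwise.
std::optional<int> parse_digit(char c) {
    if (c < '0' || c > '9') return std::nullopt;
    return c - '0';
}

// Each RETURN_IF_FALSE may leave sum_two_digits early with an empty optional.
std::optional<int> sum_two_digits(char a, char b) {
    int x = *RETURN_IF_FALSE(parse_digit(a));
    int y = *RETURN_IF_FALSE(parse_digit(b));
    return x + y;
}
```

Calling sum_two_digits('1', '2') takes the normal path, while an invalid first character leaves the function before the second digit is even parsed.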
Another use case is complex variable initialization. This is now usually done with lambdas:
std::vector<Handler> handlers = [&]{
std::map<HandlerEnum, Handler::CPtr> e2handler{ { EBlueHandler, new BlueHandler (...) },
{ EGreenHandler, new GreenHandler(...) },
{ ERedHandler, new RedHandler (...) },
{ EPinkHandler, new PinkHandler (...) } };
assert(e2handler.size() == NHandlers);
return e2handler | map_values;
};
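The problem with this is two-fold; one part is that the lambda has to be invoked explicitly, which is easy to forget (did you notice the missing () above?). With a statement expression, the same initialization reads:
std::vector<Handler> handlers = ({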
std::map<HandlerEnum, Handler::CPtr> e2handler{ { EBlueHandler, new BlueHandler (...) },
{ EGreenHandler, new GreenHandler(...) },
{ ERedHandler, new RedHandler (...) },
{ EPinkHandler, new PinkHandler (...) } };
assert(e2handler.size() == NHandlers);
produce e2handler | map_values;
});
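For reference, the lambda-based pattern can be sketched in standard C++ as below, this time with the trailing () that actually invokes the lambda. The function handler_names and its contents are illustrative assumptions, not part of the proposal.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Complex initialization through an immediately invoked lambda.
std::vector<std::string> handler_names() {
    const std::vector<std::string> names = [&] {
        std::map<int, std::string> id2name{ {2, "green"}, {1, "blue"}, {3, "red"} };
        assert(id2name.size() == 3);
        std::vector<std::string> out;
        for (const auto& kv : id2name)   // std::map iterates in key order
            out.push_back(kv.second);
        return out;
    }();  // without this (), names would be initialized from the closure itself
    return names;
}
```

Note that a return inside the lambda only leaves the lambda, which is one of the differences a statement expression is meant to remove.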
Statement expressions have an easy learning curve in the beginning, up to nested expressions, which require careful analysis w.r.t. destruction order and flow control.
Since statement expressions are already widespread, this document serves rather as a classification of supported feature sets and a set of proposed macros to detect those. In the second part, parametric statement expressions are proposed to overcome the limitations of macros.
Statement expression support is a new language feature from the language's p.o.v. Accepting it means changing the parser to accept one more expression variant:
({ statement-seq(opt) expression; })
Statement expression proposal should be viewed in relation to other, related proposals, including, but not limited to:
Several implementations of various subsets of the proposed functionality already exist. Therefore, the common subset was taken as a basis, while compiler-specific and proposed (i.e., currently not existing) features are listed as options. This allows for staged acceptance: a possible outcome is to accept a subset of the features proposed and a subset of the feature discovery macros for optional features, which allows compiler implementors to converge on the latter. Features are discussed in the next chapter.
To better understand what SEs / PSEs / NSEs do, we first need to discuss the currently
available and proposed control / continuation switching constructs. The table below summarizes
these.
Construct | Classification | Where | Argument |
---|---|---|---|
return | stmt | in function / lambda | one, defining return type if auto, matching it otherwise |
break | stmt | inside for, while, switch; in SE / PSE inside the above | none |
continue | stmt | inside for, while; in SE / PSE inside the above | none |
throw | expr | anywhere | none or one, defining the type of exception to be thrown |
produce | stmt | proposed: in SE / PSE / NSE | one, defining the evaluation type of SE |
co_yield | stmt | proposed: in coroutine, in SE / PSE inside coroutine | one, defining return type if auto, matching it otherwise |
co_return | stmt | proposed: in coroutine, in SE / PSE inside coroutine | one, defining return type if auto, matching it otherwise |
throw inline | expr | proposed: in locations where the corresponding catch exists and is trivially deducible at compile time (i.e., the calls from the try block to the throw inline are all inline and reside in the same compilation unit) | none or one, defining the type of exception to be thrown |
Furthermore, some compilers support exotic features that are not proposed for inclusion in the
standard. These include goto and switch into, out of and across multiple different statement
expressions.
A note on produce: current implementations do not have a keyword for this, but instead
evaluate to the last expression. This allows evaluation at only one point, which is inconvenient.
A keyword is suggested because it allows for multiple evaluation points. When code uses produce
only in the last expression, older compilers might use an empty #define produce to remain
compatible. Optionally, the last-expression-as-result rule might be kept to remain compatible
with old code. Note that this is only necessary when there's no produce in the SE.
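The compatibility trick described above can be sketched as follows, assuming a compiler that implements the GNU extension (GCC or Clang) and therefore only knows the last-expression rule. The function name product_of is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Compatibility shim: with an empty definition, `produce x;` expands to plain
// `x;`, which an existing implementation treats as the value of the SE.
#define produce

int product_of(const std::vector<int>& v) {
    return ({
        int j = 1;
        for (int i : v)
            j *= i;
        produce j;   // must be the last statement for the shim to work
    });
}
```

The shim only works when produce appears once, as the final statement; multiple evaluation points would need real language support.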
expression should allow for ({ statement-seq(opt) expression; }). An implementation might place
further requirements on statement-seq; these are discussed below.
template-parameter should allow for break, continue, return (and, from the continuations
proposal, co_return and co_yield).
A feature detection macro (prefixed with __cplusplus) should be defined for each optional feature
that is supported. This allows feature detection and workarounds if a given feature is not
supported in a given compiler.
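A minimal sketch of how such a detection macro might be used is shown below. __cplusplus_se_ret is one of the macros proposed in this document and, as far as is known, no compiler defines it, so the #else branch is the one that actually builds; the function name first_negative_or_zero is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Returns the first negative element, or 0 if there is none.
int first_negative_or_zero(const std::vector<int>& v) {
#ifdef __cplusplus_se_ret
    // Assumed form if return inside a SE is supported (last-expression rule).
    for (int x : v)
        (void)({ if (x < 0) return x; 0; });
    return 0;
#else
    // Portable fallback without statement expressions.
    for (int x : v)
        if (x < 0) return x;
    return 0;
#endif
}
```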
In common subset, no macro proposed
Support for statement expressions that don't require any of the additional features listed in the table above. The statement sequence is evaluated before the last expression; then the value of the statement expression is that of the last expression. Example: (for illustration purposes only, not the recommended approach to product calculation)
int product = ({
int j = 1;
for (auto i : v)
j *= i;
produce j;
});
return, break and continue in SE
Proposed macro: __cplusplus_se_ret
Support for basic control transfer that leaves the statement expression. Example:
#define return_if_false(a) ({ auto tmp = (a); if (!tmp) return tmp; produce tmp; })
#define accumulate_range_se(range, init, op) \
({ \
auto result = (init); \
for (auto elem : (range)) \
result = op(result, elem); \
produce result; \
})
#define product_op(lhs, rhs) \
({ \
if (rhs == 0) return 0; \
produce lhs * rhs; \
})
int product_of_vector(const std::vector<int>& v)
{
return accumulate_range_se(v, 1, product_op);
}
Proposed macro: __cplusplus_se_param
Support for parametric statement expressions. At the time of writing, no compiler supports this
feature. Parametric statement expression support tries to solve the problems that arise due to
the heavy use of macros when implementing patterns via SE. A parametric statement expression is proposed
to be similar to a lambda in the capture and arguments parts, but has the body as a statement
expression. Such an expression can use the flow control statements of the creating context. It is
probably desirable to fix the capture list of the parametric SE as [&]. When calling the
parametric statement expression, the calling code can offer its control / continuation switching
constructs in the template parameter list (e.g. int life = return_if_false<return>(...)).
Passing a construct this way does not count as a (numbered) template argument.
It is a compile-time error if the PSE needs a construct from the caller that is not offered; it is allowed
to offer constructs that are not needed in the PSE. Note that for a given set of argument types,
both the return and produce type of a PSE are fixed at definition, not at the call site.
This is to simplify return type deduction of the caller of the PSE. It is a compile-time error to call
a PSE that needs return from the caller if the return types of the PSE and the caller are
incompatible. PSEs must be defined in the compilation unit where they are called and are inline candidates.
Example:
template<typename R>
int product_of_range(const R& r)
{
return std::accumulate(begin(r), end(r), 1, [&](int lhs, int rhs) ({
if (rhs == 0) return 0;
produce lhs * rhs;
}));
}
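For comparison, here is a rough standard C++ approximation of the same early exit. A lambda passed to std::accumulate cannot return from product_of_range, so this sketch encodes the exit in the callback's return value. The helper fold_until and its optional-based protocol are assumptions made for illustration only.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper: left fold that stops as soon as op yields no value.
template <typename R, typename T, typename Op>
std::optional<T> fold_until(const R& r, T init, Op op) {
    for (const auto& elem : r) {
        std::optional<T> next = op(init, elem);
        if (!next) return std::nullopt;   // the callback asked to stop
        init = *next;
    }
    return init;
}

template <typename R>
int product_of_range(const R& r) {
    auto result = fold_until(r, 1, [](int lhs, int rhs) -> std::optional<int> {
        if (rhs == 0) return std::nullopt;  // stands in for `return 0` in the PSE
        return lhs * rhs;
    });
    return result.value_or(0);
}
```

The extra protocol (optional in, optional out, value_or at the end) is the boilerplate that a PSE with access to the caller's return would make unnecessary.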
Proposed macro: __cplusplus_se_named
Support for named statement expressions. At the time of writing, no compiler supports this feature.
Named SE is similar to a function in terms of (optional) template arguments, return type and arguments; but it has the
body as a statement expression. Predeclaration of a named SE, if allowed, would be identical to the corresponding
function / template function / member function / member template function, but the predeclaration must include the
control / continuation switching construct(s) needed from the caller. The latter is not necessary in the definition
(or if the definition is the declaration). The named SE must be defined in the same compilation unit where
called. The caller must offer the control / continuation switching statements that the NSE can use. When evaluated,
the named SE might use the flow control statements of the caller; this means in particular that it returns to the
caller via produce, not via a return statement. The latter returns to the caller's caller.
Example:
template<return, typename T>
T return_if_zero(T t);
template<typename T>
T return_if_zero(T t)
({
auto tmp = std::move(t);
if (!tmp) return 0;
produce tmp;
})
template<typename R>
int product_of_range(const R& range)
{
int result = 1;
for (int elem : range)
result *= return_if_zero<return>(elem);
return result;
}
No macro proposed. This feature is described here for completeness; it is a separately proposed feature.
It is proposed to allow explicitly stating that a throw should be an inline candidate. Similarly,
a catch might be specified to be inline, thus only catching inline candidate exceptions. Even
template catch can be allowed for inline exceptions. Inline candidate here is similar to inline
candidate functions: the complete call path from the try-block to the throw must be in the same
compilation unit, with no virtual calls and no function pointer calls. If these are satisfied, an
optimizing compiler should be able to produce code which, from a performance perspective, is
similar to a goto (and manual destructors) on the exception path.
The most common use case is to throw inline within the same function (as a replacement for
catch(break)). A function that might be left via an inline exception is said to leak the
exception. It is suggested to disallow taking the address of such a function (otherwise a shadow
type system would be introduced). A SE / PSE / NSE might leak exceptions.
Example:
struct Break { elem_t elem_; };
try {
for (auto&& elem : range) {
if (fn(elem)) throw inline Break{ elem };
process(elem);
}
} catch inline (const Break& brk) {
return brk.elem_;
}
postprocess();
return std::nullopt;
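The same control flow can be written today with ordinary exceptions, which is the cost that throw inline aims to bring close to that of a goto. In the sketch below, elem_t is taken to be int, the predicate is supplied by the caller, and process / postprocess are omitted; all of these are assumptions for illustration.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

struct Break { int elem_; };

// Returns the first element for which stop(elem) holds, or an empty optional
// if the loop runs to completion.
std::optional<int> first_match(const std::vector<int>& range,
                               const std::function<bool(int)>& stop) {
    try {
        for (int elem : range) {
            if (stop(elem)) throw Break{ elem };  // `throw inline` in the proposal
        }
    } catch (const Break& brk) {                  // `catch inline` in the proposal
        return brk.elem_;
    }
    return std::nullopt;
}
```

With regular exceptions this pays for full unwinding machinery even though the throw and the catch sit in the same function.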
One possible way to implement NSE / PSE support is via regular SE and inline exceptions.
Consider a function call that offers all the control flow statements discussed. It can be
rewritten as a SE as shown in the table below.
NSE | inline exceptions and SE |
---|---|
|
|
One can observe that, while rewriting as inline exceptions is possible (and might be the underlying implementation), it's definitely verbose. Named SE provides a convenient syntax.
No macro proposed. This feature is described here for completeness; it is a separately proposed feature.
Many proposals, including for...else, for...break and if (auto&& x : opt), rely on a syntax similar
to range-based for loops. These proposals could be moved from language to library if we extended
the grammar to support in-place PSE declarations. Consider a for (auto&& x : r) { ... } loop: it
can be viewed as an in-place void PSE that takes x as the parameter (of type deduced from
*begin_expr(r)) and a repeated evaluation over the range by the for statement. Allowing this syntax
for a user-defined NSE (of arbitrary return type) in place of for is sufficient to provide a close
approximation of the above-defined features. Thus the proposed feature is to make
nse(auto&& x : v) { /* ... */ } equivalent to
nse<return, break..., continue...>([&](auto&& x) ({ /* ... */ }), v) (see table).
Current workaround of missing feature | With PSE / NSE, without syntax sugar | With syntax sugar |
---|---|---|
|
| |
|
|
|
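As an illustration of the kind of workaround needed without PSE / NSE support, a for...else construct can be approximated in standard C++ with callbacks, with break encoded as a boolean return value. The names for_else and first_even_index are hypothetical and not taken from any of the referenced proposals.

```cpp
#include <cassert>
#include <vector>

// Hypothetical library emulation of for...else: body returns true to "break";
// otherwise runs only if the loop finished without a break.
template <typename Range, typename Body, typename Else>
void for_else(Range&& range, Body body, Else otherwise) {
    for (auto&& x : range) {
        if (body(x)) return;
    }
    otherwise();
}

// Index of the first even element, or -1 if there is none.
int first_even_index(const std::vector<int>& v) {
    int index = 0;
    int found = -2;   // placeholder, overwritten on both paths
    for_else(v,
             [&](int x) {
                 if (x % 2 == 0) { found = index; return true; }  // "break"
                 ++index;
                 return false;
             },
             [&] { found = -1; });                                // "else"
    return found;
}
```

With the proposed syntax sugar, the body could use a real break and return instead of threading the result through captured variables.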
In common subset, no macro proposed.
Support for POD class definitions inside statement expressions.
Example: (given key_t and value_t are PODs)
#define try_find(haystack, needle, key, value) \
({ \
using key_t = std::remove_const_t<decltype(haystack.find(needle)->first)>; \
using value_t = decltype(haystack.find(needle)->second); \
auto it = haystack.find(needle); \
struct result_t { \
key_t key; \
value_t value; \
}; \
std::optional<result_t> result; \
if (it != haystack.end()) { \
result = result_t{ it->first, it->second }; \
} \
produce result; \
})
Proposed macro: __cplusplus_se_move
The last expression is not copied when the expression is evaluated.
Without copy elision | With copy elision |
---|---|
|
|
Proposed macro: __cplusplus_se_destructor
Support for automatic storage duration variables with non-trivial destructor inside statement expressions. It is suggested to define destructor execution order exactly as if the statement expression's body were a function, with the last expression being a return statement. This feature also means that proper move support is provided. Example:
#define get_or_create(T, ...) \
({ \
std::unique_ptr<T> p(try_get<T>()); \
if (!p) \
p.reset(new T(__VA_ARGS__)); \
produce p; \
})
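A variant of this macro that should build today with the GNU extension (GCC or Clang) is sketched below. Existing implementations yield the last expression by value, so the move-only result is moved out explicitly. The try_get stub that never finds anything and the function make_greeting are assumptions for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical lookup that never finds an existing object; it stands in for
// the document's try_get<T>().
template <typename T>
T* try_get() { return nullptr; }

#define GET_OR_CREATE(T, ...)                          \
    ({                                                 \
        std::unique_ptr<T> p_(try_get<T>());           \
        if (!p_)                                       \
            p_ = std::make_unique<T>(__VA_ARGS__);     \
        std::move(p_); /* value of the SE */           \
    })

std::unique_ptr<std::string> make_greeting() {
    return GET_OR_CREATE(std::string, "hello");
}
```

The local p_ is destroyed when the statement expression ends, after its contents have been moved into the result.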
Proposed macro: __cplusplus_se_static
Support for dynamic initializers of local static variables in statement expressions. Example:
#define global_countdown() \
({ \
static int countdown = get_countdown_max_from_config(); \
produce --countdown; \
})
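With the GNU extension (GCC or Clang assumed), the same idea might look like the sketch below. The configuration function is stubbed to return 3 and the wrapper tick is hypothetical.

```cpp
#include <cassert>

// Hypothetical configuration source; only the name comes from the document.
inline int get_countdown_max_from_config() { return 3; }

#define GLOBAL_COUNTDOWN()                                          \
    ({                                                              \
        static int countdown_ = get_countdown_max_from_config();    \
        --countdown_; /* value of the SE */                         \
    })

// The static is initialized on the first call and persists afterwards.
int tick() { return GLOBAL_COUNTDOWN(); }
```

Each textual expansion of the macro gets its own static variable, so the countdown is only shared when every caller goes through a single wrapper such as tick.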
Proposed macro: __cplusplus_se_class
Support for non-POD class definitions in statement expressions. Example: same as POD class definition in SE, but with non-PODs.
throw and catch inside SE
Proposed macro: __cplusplus_se_except
Support for exception handling inside statement expressions, including exceptions leaving the SE and exceptions caught inside SE that are thrown either inside or come from a called source. Example:
#define call_or_default(expr, def) \
({ \
decltype(expr) result; \
try { result = expr; } \
catch(...) { result = def; } \
produce result; \
})
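A version that should be usable today with the GNU extension (GCC or Clang assumed) is sketched below, applied to std::stoi, which throws std::invalid_argument on non-numeric input. The wrapper parse_or_minus_one is hypothetical.

```cpp
#include <cassert>
#include <string>

#define CALL_OR_DEFAULT(expr, def)             \
    ({                                         \
        decltype(expr) result_;                \
        try { result_ = (expr); }              \
        catch (...) { result_ = (def); }       \
        result_; /* value of the SE */         \
    })

// Falls back to -1 when std::stoi throws.
int parse_or_minus_one(const std::string& s) {
    return CALL_OR_DEFAULT(std::stoi(s), -1);
}
```

The exception is caught inside the statement expression, so the surrounding function never observes it.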
Thanks for the support from ISO C++ Standard - Future Proposals Group:
Thanks for the support from co-workers on earlier, related proposal versions:
TBD
TBD
Older version(s) of this document: