Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Variadics #2240

Open
wants to merge 59 commits into
base: trunk
Choose a base branch
from
Open

Conversation

geoffromer
Copy link
Contributor

@geoffromer geoffromer commented Sep 30, 2022

Proposes a set of core features for declaring and implementing generic variadic
functions.

A "pack expansion" is a syntactic unit beginning with ..., which is a kind of
compile-time loop over sequences called "packs". Packs are initialized and
referred to using "pack bindings", which are marked with the each keyword at
the point of declaration and the point of use.

The syntax and behavior of a pack expansion depends on its context, and in some
cases by a keyword following the ...:

  • In a tuple literal expression (such as a function call argument list), ...
    iteratively evaluates its operand expression, and treats the values as
    successive elements of the tuple.
  • ...and and ...or iteratively evaluate a boolean expression, combining
    the values using and and or, and ending the loop early if the underlying
    operator short-circuits.
  • In a statement context, ... iteratively executes a statement.
  • In a tuple literal pattern (such as a function parameter list), ...
    iteratively matches the elements of the scrutinee tuple. In conjunction with
    pack bindings, this enables functions to take an arbitrary number of
    arguments.

proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made another pass without diving into the type checking bits, since those seem like something that could be handled in a separate proposal once the other issues are handled.

proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
(Including comments from the 2023-08-31 open discussion)
@josh11b
Copy link
Contributor

josh11b commented Sep 13, 2023

Proposes a set of core features for declaring and implementing generic variadic functions. The central concept is a "pack", which is a value that consists of a sequence of other values. Almost all operations that can apply to single values are generalized to apply to packs as well, by applying the operation to each element of the pack. Packs are created from tuples or statically-sized arrays using the [:] operator, and can be converted back to tuples with the ..., operator. Variadic blocks, delimited with ...{ }, enable iterative computation on packs. [:] and ..., can also be used in patterns, which enables functions to take an arbitrary number of arguments.

Remember to update the PR description.

Copy link
Contributor Author

@geoffromer geoffromer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to update the PR description.

Done. Thanks!

during typechecking. For example, in this code:

```carbon
fn F[each T:! type](x: (..., each i32), ..., each y: Optional(each T)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... is also used in a lot of non-pattern contexts where deduction doesn't happen, unlike auto and ;], which dilutes its value as a marker of deduction. Even in patterns it's not a fully reliable signal -- e.g. there's no arity deduction in a pattern like x: (... expand (i32, i32)) (which is silly but valid).

Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still really think that variadics.md should be inlined into p2240.md, but that can wait until people are done reviewing to avoid churn.

fn Zip[... each ElementType:! type]
(... each vector: Vector(each ElementType))
-> Vector((... each ElementType)) {
... var each iter: auto = each vector.Begin();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit weird that it is ... each in the function signature, but ... var each in the function body. Seems like var introduces a pattern context and so var ... each would be the way of declaring a pack variable. I guess that we need the ... before the var since we are also expanding the initializer (and could be expanding the variable type)? I guess that means var ... each foo: i32 = something; would never be legal?

Might be worth a comment pointing out the difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit weird that it is ... each in the function signature, but ... var each in the function body. Seems like var introduces a pattern context and so var ... each would be the way of declaring a pack variable. I guess that we need the ... before the var since we are also expanding the initializer (and could be expanding the variable type)?

Right, we don't want to repeat just the pattern, we want to repeat the entire statement.

I guess that means var ... each foo: i32 = something; would never be legal?

Yes, but for a different reason: it doesn't correspond to any of the valid pack expansion syntaxes listed below. I think something like this would be legal, though:

var (... each foo: i32) = (... each vector.Begin());

But that seems like just a more complicated version of what I have here.

Might be worth a comment pointing out the difference.

I think it would get unwieldy to try to comment this example to point out all the concepts it's illustrating, because this example is intentionally dense. The example is unavoidably going to be hard to follow on a first read, because I haven't explained how the features in the example work yet. I could put the example afterward, but I think the prose explanation would seem really abstract and hard to follow without an example to refer to.

In the longer run I think the way out of the dilemma is your suggestion of building up the exposition more gradually from a series of simpler examples, but we're deferring that for now.

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
geoffromer and others added 3 commits September 13, 2023 16:01
Co-authored-by: josh11b <josh11b@users.noreply.github.com>
Copy link
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only done a fairly high-level skim of the updated examples ond proposal. I can do a more detailed review if useful, but wanted to get a high level sense and feedback first.

Generally, I'm pretty excited about the direction here. I don't have any really major concerns.

One example raised a somewhat down-in-the-weeds question that I've left inline, but it's neither a blocking concern nor terribly important in the larger picture I think.

Comment on lines 160 to 162
// Note that this implementation is not recursive. We split the parameters into
// first and rest in order to forbid calling `Min` with no arguments.
fn Min[T:! Comparable & Value](first: T, ... each next: T) -> T {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should be a follow-up, but when reading this it felt somewhat unfortunate.

Both that you needed to factor out the first element and then needed to explain why you factored it out because it is somewhat surprising.

I wonder if there is a reasonable syntactic way to represent 0-or-more vs. 1-or-more?

(This isn't a blocking comment, just didn't want to forget about it. If no one has an immediate idea, maybe just worth recording it in a possible future work section?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's somewhat unfortunate, but as points in mitigation:

  • The comment is speaking more to the reader of the proposal (flagging a difference from C++), rather than the reader of the code in a real codebase.
  • If we wrote the function signature without a separate first, the first line would need to be something like var result: T = (... each arg)[0];, which doesn't express the intent nearly so clearly.

I don't have any good ideas for a syntax (maybe somehow base it on the + vs * distinction in regexes?), but that's not the only problem: We would also need to extend the typesystem to model and propagate minimum-arity constraints on segments. That seems doable, and could be desirable for other reasons as well, but it seems substantial enough to belong in a follow-up proposal.

Copy link
Contributor Author

@geoffromer geoffromer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chandlerc:

I've only done a fairly high-level skim of the updated examples ond proposal. I can do a more detailed review if useful, but wanted to get a high level sense and feedback first.

It's your call how in-depth to review this, but we should at least settle the terminology issue (especially whether I should go ahead with my suggested terminology changes).

Comment on lines 160 to 162
// Note that this implementation is not recursive. We split the parameters into
// first and rest in order to forbid calling `Min` with no arguments.
fn Min[T:! Comparable & Value](first: T, ... each next: T) -> T {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's somewhat unfortunate, but as points in mitigation:

  • The comment is speaking more to the reader of the proposal (flagging a difference from C++), rather than the reader of the code in a real codebase.
  • If we wrote the function signature without a separate first, the first line would need to be something like var result: T = (... each arg)[0];, which doesn't express the intent nearly so clearly.

I don't have any good ideas for a syntax (maybe somehow base it on the + vs * distinction in regexes?), but that's not the only problem: We would also need to extend the typesystem to model and propagate minimum-arity constraints on segments. That seems doable, and could be desirable for other reasons as well, but it seems substantial enough to belong in a follow-up proposal.

Comment on lines 364 to 370
scrutinee, and the M elements of the pattern after the `...,` expansion are
matched with the last M elements of the scrutinee. If the scrutinee does not
have at least N + M elements, the pattern does not match.

The remaining elements of the scrutinee are iteratively matched against
_operand_, in order. In each iteration, `$I` is equal to the index of the
scrutinee element being matched, minus N.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is phrased sounds like we first match the elements before the ...,, then the ones after it, then the ones in the middle. I find that a bit surprising -- if there are T total elements, I'd expect we would first match the first N elements in order, then the T - N - M elements in the middle in order, then the last M elements in order.

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
contain at least one expansion site. All sites of a given expansion must have
the same arity (which we will also refer to as the arity of the expansion).

A pack expansion cannot occur within another pack expansion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for this? I don't think nested pack expansions have proven problematic in C++, but then again C++ doesn't do full type-checking prior to pack expansion, and doesn't have any "packs out of nowhere" syntax like expand tuple. (In contrast, packs-of-packs might present some implementation difficulties, and expand each blah would result in a pack-of-packs.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the "Allow nested pack expansions" section in p2240.md.

Also, nested pack expansions may be unproblematic as of C++20, but they seem to have been problematic for some pending C++ proposals. For example, the most recent (unpublished) draft of the expansion statements proposal removes support for expanding over packs, because it would lead to ambiguity in cases like this example (credited to one Richard Smith):

template<typename ...T>
void f(T ...v) {
  g([&](auto y) {
    template for (auto x : v) { /*...*/ } // Pack expansion or not?
  }(v)...);
}

Here an expansion statement is nested within an expansion expression, and it's unclear whether the "expansion site" v on the commented line belongs to the outer or inner expansion.

Comment on lines +109 to +112
`expand` _expression_", with the same precedence as `,`. However, that syntax is
not considered a pack expansion, and has its own semantics: _expression_ must
have a tuple type, and "`...` `expand` _expression_" evaluates _expression_ and
treats its elements as elements of the enclosing tuple literal. This is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't necessarily quite the same thing as saying that this is restricted syntax that forms and immediately expands a pack. In particular:

fn F[...template each Tuple:! type](...each tuple: Tuple) {
  ... G(...expand each tuple);
}

... would mean F((a, b), (c, d, e)) succeeds and calls G(a, b) then G(c, d, e), whereas if ...expand formed and then expanded a pack, we'd reject due to the nested pack expansion.

This isn't an objection to treating ...expand as not being a pack thing, just an observation that there's something slightly unusual happening here, that may go a little beyond what we'd allow if it were a pack thing.

If we're OK with that example, I think we should be clearer whether G(... ...expand each tuple) is allowed, as well as things like H(...expand ...expand multi_level_tuple). The description of ...expand as having a particular precedence suggests this might be OK, but the above rule about these things appearing as tuple elements suggests that it might not be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'd prefer say that it's not a pack expansion because that makes it easier to explain the behavior of pack expansions (e.g. expansion sites always begin with each). But I think we should disallow those forms of nesting anyway, in order to preserve our ability to generalize expand later, so I've added that restriction.

docs/design/variadics.md Outdated Show resolved Hide resolved
would deduce inconsistent values for any deduced parameter. However, in general
this is intractable, because in the worst case the number of distinct ways to
map symbolic arguments to parameters is ${2n \choose n}$ for n variadic
arguments, which is only a factor of $\sqrt{n}$ away from exponential.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
arguments, which is only a factor of $\sqrt{n}$ away from exponential.
arguments, which is exponential.

a^n / sqrt(n) = O(b^n), for any 1 < b < a.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, cool! In that case I think I'd also drop the discussion of combinatorics. This is moot due to the overhaul of this whole section, but see commit 14b97b3 for how it will look if the bigger overhaul is reverted.

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
arguments). Extending the type system to support deduction that splits into
multiple cases would add a fearsome amount of complexity to the type system.

#### Identifying potential matchings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a lot of complication, and I wonder whether it's justified. I'd be inclined to reject any case where we can't map each argument to exactly one parameter based on shape alone -- so singular parameters must have singular arguments, fixed-arity pack expansions must have matching fixed-arity arguments, and at most one variable-arity pack parameter can remain, which gets the same shape as the rest of the arguments.

Are there important use cases that don't work in that kind of approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming you would allow multiple arguments to map to the single variadic parameter? This is needed to call a variadic with a non-variadic argument list.

One of the motivating use cases for the current approach was to support the case where functions requiring at least one argument, but the caller puts the extra argument at the end while the callee puts the extra parameter at the beginning (or the other way around).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on Discord and elsewhere, Min(... each x, y) is an example of something that wouldn't work with that approach, but your question led me to realize that the rules I had didn't support that case either. However, your approach can be made to work if we extend the type system and syntax to allow Min to be written with a single variadic parameter that's required to have at least one element, so that's roughly what I've done here (commit b16a5d9 if you want to see it in isolation from the other comment responses). I'll follow up with changes to p2240.md to discuss the tradeoffs of this approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now updated p2240.md, so this should now be fully ready for review.

Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry, I haven't searched for the right place to put these issue threads.)

ever be library types, they would certainly need to be defined variadically, so
in that sense this proposal may actually support that principle.

## Alternatives considered
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: restrictive matching of symbolic packs to parameter lists.

The more I think about this option, the more I dislike it:

  • It creates an unnecessary difference between the template and symbolic cases at the call boundary, opposite the direction of Checked generics calling templates #2153 .
  • It is an additional rule to explain to developers, and the rule needs to be pretty subtle in order to allow things like forwarding and destructuring.
  • I'm pretty worried that a consequence of this subtle rule is that we are going to reject things that developers will expect to work and it will be really hard to give an error explaining what they did wrong.
  • It means there is additional state associated with a function declaration around how the arguments are declared that needs to be modeled somehow. I'm still working out how much of an issue this is, but for example the things I'm thinking about are how this interacts with Functions, function types, and function calls #2875 and functions actually being implementations of the Call interface for a type.

Furthermore, I feel like the alternative is pretty good: there are cases that we have to reject, but in those cases there is a pretty clear error we can give ("no single return type could be deduced", or whatever).

in that sense this proposal may actually support that principle.

## Alternatives considered

Copy link
Contributor

@josh11b josh11b Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: each(>=1) syntax

This issue I'm more neutral on, but I think it is extra complexity we don't need if we don't have restrictive matching. It means that there is additional syntax to learn, more ways to write function signatures, and those ways are subtly different in ways that are hard to explain. It also has concerns around which each needs to be marked with (>=1) and extra round trips if the user gets it wrong.

computation adds some major challenges here as well. However, if tuples could
ever be library types, they would certainly need to be defined variadically, so
in that sense this proposal may actually support that principle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: performance of generated code.

The more I've thought about this, the more I think we should take this consideration off the table unless we have some evidence that one approach would be more efficient. My suspicion is that we can make a ABI/calling convention that is efficient for all of the different alternatives:

  • If the arguments do not have the same type, monomorphize.
  • Otherwise pass all the arguments in order contiguously on the stack along with the total number. This should allow a single implementation to work with a variety of arities, and doesn't care how you have written the parameter signature.

the form "`each` _identifier_".

By default, a pack binding pattern can match any number of times, but `each` can
optionally be followed by "`(>=` _integer-literal_ `)`", which constrains the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was reading your nice proposal, I was thinking about (>= lakes the left operand. Is it possible to put >= next to ...?

...>=1 each param

In this way, it doesn't require parentheses, and the left operand of >= is .... Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's possible, and arguably it would be more accurate, because it actually does constrain the whole expansion. The main reason I attached it to each is that in that position, there's no ambiguity: each can only be followed by an identifier or an arity constraint, which are easy to distinguish. With ... it's more subtle, because ... can be followed by many things, including some operators. It can't otherwise be followed by >=, so it's not formally ambiguous, but it seems like it could easily lead to confusion for human readers, who might expect ...>= to be related to >= in the same way that ...and is related to and. Also, I'm already kind of uncomfortable with how we've given ... so many syntactic variants (like ...and, ...or, and ...expand), so I'm particularly reluctant to create yet another.

If we did attach it to the ..., I think I'd want to keep the parentheses, in order to emphasize that this is nothing like ...and and that the 1 is not part of the expansion body.

@msadeqhe
Copy link

msadeqhe commented Dec 14, 2023

What's your opinion if we could have generalized binary operator expansion in addition to and and or?

... and each item
... or each item
... + each item
... * each item
... << each item

...>=1 and each item
...>=1 or each item
...>=1 + each item
...>=1 * each item
...>=1 << each item

If >= has lower precedence than other operators, maybe it could be removed for consistency:

...1 and each item
...1 or each item
...1 + each item
...1 * each item
...1 << each item

Also the expand keyword puts comma between them, and , could mean expansion:

... , each item
...1 , each item

What's your opinion?

EDIT: Now I've read your alternative section about fold expressions. Thanks for the information.

Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please comment or remove the inactive label.

This PR is labeled inactive because the last activity was over 90 days ago. This PR will be closed and archived after 14 additional days without activity.

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label Mar 14, 2024
@josh11b josh11b added long term Issues expected to take over 90 days to resolve. and removed inactive Issues and PRs which have been inactive for at least 90 days. labels Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
long term Issues expected to take over 90 days to resolve. proposal rfc Proposal with request-for-comment sent out proposal A proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants