Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The types, their constructors and pattern matching, as a grammar #4152

Open
uben0 opened this issue May 16, 2024 · 2 comments
Open

The types, their constructors and pattern matching, as a grammar #4152

uben0 opened this issue May 16, 2024 · 2 comments
Labels
proposal You think that something is a good idea or should work differently. scripting About Typst's coding capabilities

Comments

@uben0
Copy link

uben0 commented May 16, 2024

Description

My suggestion is not a concrete feature request, but a reflection that -- I hope -- will help in designing the upcoming type system.

Let list a bit what we have so far:

  • builtin types and their constructors
#heading(level: 2)[Hello World]
  • structured values from builtin types
#cetz.canvas({
  import cetz.draw: *

  rect((x: 1, y: 2), (rel: (x: -1, y: -1)), name: "a")

  content((rel: "a.north", to: (y: 0.2))[Hello World]
})
  • assignment with de-structuring
#let (a, b, c) = my-list

The beautiful expressiveness of grammars

What if primitive types (numbers, booleans, strings, arrays, ...) are terminal symbols and declared types are non-terminal symbols of a grammar?

heading -> number content   { level, body }
content -> number
content -> string
content -> heading

When a new type is declared, new rules (constructors) are added to the grammar:

position -> number number          { x, y }
position -> number number number   { x, y, z }
position -> string                 { name }
position -> string position        { name, relative }

Why?

If custom types are available, if they can have constructors, if pattern matching is available like in Rust, I am just stating the obvious here. So what does viewing types as grammar symbols bring to the table?

Use Case

Implicit type cast

In many cases, we could use a type's method without calling its constructor. This is possible if their are no ambiguities and if the type can be deduced from the context.

If refutable pattern matching becomes a thing, similar to what we have in Rust, it could become quite annoying to deal with combinations of typed and untyped structured values. An attempt to cast structured values when matching them against a typed pattern would solve this.

Good error messages

When parsing inputs against a grammar, it is obvious where it fails, though many compilers are so bad at explaining it.

When failing to match a structured value against a type, the error would be emitted by the compiler, and all the annoying pattern recognition and error message production, required to be implemented in order to have nice packages, with a nice API and a nice error reporting, would not be required any more. Types would rules, while maintaining the language nicely dynamically typed.

Algebraic data types

Having this grammar based representations of types would allow values to be super flexible and to be like multi-typed. For instance, an array should be an arguments and a dictionary should be as well. But the down cast should be postponed as much as possible, while being available as soon as it can.

Ambiguities related to dynamic typing

The way ambiguities emerges and are dealt with in grammars, is well known and properly documented. It is possible to come up with a rigorous and systematic way to solve (choose) or reject ambiguities. Again, resulting in possibly super nice error messages (because, in a grammar, it is clear what is in conflict).

Functions API and documentations

Currently, what functions accept as input is checked in an imperative manner by the callee side. The documentation about what is a valid input, and what isn't, is fully on the responsibility of the package author. It seams to be the inevitable downside (but flexibility) of dynamic typing. Having pattern matching, as a first class citizen, augmented with the type grammar as I explained, could solve this.

@uben0 uben0 added the feature request New feature or request label May 16, 2024
@uben0 uben0 changed the title Types and their constructors, plus pattern matching, as grammars The types, their constructors and pattern matching, as a grammar May 16, 2024
@laurmaedje laurmaedje added scripting About Typst's coding capabilities proposal You think that something is a good idea or should work differently. and removed feature request New feature or request labels May 16, 2024
@laurmaedje
Copy link
Member

I'm not yet sure what concretely you are proposing user-facingly, i.e. as the concrete type syntax and behaviour. I think a more "guide-level" view of how you're envisioning the type system to work would be easier to discuss than the more theoretical angle.

@uben0
Copy link
Author

uben0 commented May 17, 2024

Oh I see. Yes, I remained very abstract on purpose, but I do admit, some concrete examples would help to understand.

Once new types can be defined, we could imagine Cetz to have a position type, that would have many constructors as there are way to specify a position in Cetz.

Having a type for a position could be very powerful, and allow things like having a method to interpolate positions, and much more.

But the shape functions would probably still accept structured data.

#cetz.canvas({
  import cetz.draw: *
  import cetz: position

  rect(position(0, 0), position(1, 1))

  rect((0, 0), (1, 1))
})

Now, let's imagine there is a type cast syntax, like foo as bar, where it would just be a syntax sugar for bar(.. foo), with a type check to shortcircuit if foo is already of type bar.

We could image inside the rect function from Cetz, the pos1 and pos2 args being casted to position.

#let rect(pos1, pos2, .. blahblah) = {
  let pos1 = pos1 as position
  let pos2 = pos2 as position
}

Now that this basic concept is accepted, the fun starts when we imagine types with typed constructor with possibly other user defined types, with possibly recursive definitions. For example, what if Cetz define a constructor for position as a linear interpolation of two positions (which is already a feature).

#let position(pos1, ratio, pos2) = { /* blahblah */ }

Currently, all the pattern recognition is done by hand. The "shape" of the arguments is "parsed" explicitly. One way to ease this checking process, would be to have a match statement. This would help a lot, but what if a position, for whatever reason, can be constructed from a datetime, and that datetime also have a lots of different constructors. We would have to first construct the datetime object, and the position constructor would see it has been given a value of type datetime.

My suggestion is to make those constructor calls implicit and refutable when matching a pattern.

#match args {
  (pos1 as position, ratio as float, pos2 as position) => /* blahblah */,
  ...
}

And a constructor definition would be a pattern by itself, allowing overloading like in Haskell. In fact, if a grammar representation is used internally, types should provide typed constructors, otherwise, they wont benefit from it.

#type position {
  constructor (x as float, y as float) { /* blahblah */ }
  constructor (pos1 as position, ratio as float, pos2 as position) { /* blahblah */ }
  ...
}

But I originally did not give this use case, because I don't want to give a precise incarnation of the idea. What I just explain could be an eventual consequence of considering types like symbols of a grammar, their constructors like the rules of the grammar, and pattern matching like parsing.

To really understand the benefit, we have to understand how to (not) automate some process by providing (bad) syntax sugars or (not) handy utility functions. As Typst should remain a minimalist language, it should not introduce a full fledge type system like Rust, it would be absurd. We don't want to provide a lot of utility functions either, that would clutter the code base and would increase the learning curve.

I believe, this grammar approach could dictate how type casting and pattern matching would behave, and would remove the need for explicit checking inside functions, and explicit type construction when custom types will be a thing. It would automate a lot while not introducing anything special (except a match statement).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal You think that something is a good idea or should work differently. scripting About Typst's coding capabilities
Projects
None yet
Development

No branches or pull requests

2 participants