Skip to content

Landscape Summary

Shayan Najd edited this page Jun 6, 2016 · 9 revisions

Three Trees

I have studied and compared the definition of expressions in the three ASTs, more closely. As expected, it seems like we cannot just pick one of the three ASTs. Good news is that the HSE AST (annotated one) and HsSyn AST are quite close.

TH AST seems to be less useful. It can be used as a measure on what metaprogramming facilities (e.g., reification machinery) should be provided for the unified AST.

HSE is the most popular and user friendly. It can be used as a guideline on how the unified AST should look like, and as a measure on what tools and features are actually useful in practice (important to avoid over-engineering).

The HsSyn is the most comprehensive, tested, and maintained. It can be used as measure on what information the unified AST is required to carry.

Jacques: the above is not an analysis, but a high level summary. What we need to see are the details of the differences, not your opinion of them. The devil is in the details here, and even a small detail might end up being a crucial difference that has a massive influence on the eventual design. And not only the ASTs themselves, but what the API (i.e. provided functions) for manipulating and traversing these ASTs matters greatly. data types are not defined solely by their data definition, but also, very strongly, by their folds. [Hey, that's why finally tagless works at all!!]

Shayan: Right. Above paragraph is supposed to be a high level summary, and the details of the differences (in terms of proposed changes to discuss) can be found in here.

Code for HsSyn AST and HSE AST

To study the differences between HSE AST and HsSyn, I have made a minimal, yet executable, definition of AST in HsSyn and HSE. Then, I have started to update HsSyn AST by a set of changes (that seemed reasonable to me) to make it closer to HSE (and hence, facilitate the comparison).

The (self-contained) code for HSE AST is avaialbe in here

The (semi self-contained) code for the updated HsSyn AST is available in here

The changes to HsSyn AST are reflected as an step-by-step series of commits. Besides, removing comments, functions, imports, record syntax, and derivings, and merging the files, the following are the key changes:

  1. removing "Hs" prefix
  2. renaming Expr to Exp
  3. two specific renames to avoid name clash caused by merging the files: (a) IPBinds --> IPBinds', and (b) KingSig --> KindSig'
  4. renaming Type from TyCoRep to TCRType to avoid clash with Type from HsSyn
  5. adding mock (empty) datatype declarations for the references to outside of HsSyn, to make the file executable.

Although may not be needed, due to above, the original HSE AST, referred to in the wiki can be found at https://github.com/haskell-suite/haskell-src-exts/blob/master/src/Language/Haskell/Exts/Annotated/Syntax.hs.

Jacques: renamings are not really key changes. If you had used qualified names, which ones would be needed?

Shayan: I have tried to stay as close as possible to the original code in both cases (HSE, and HsSyn). Above changes are there only to make the comparison easy (and the code readable, self-contained, and executable). About use of qualified types, we already have a heading here to discuss it.

Jacques: What is really missing are translation functions (in all directions) and explanations for what directions are (currently) not possible. Just defining a data-structure tells us very little.

Shayan: In theory, it would be great to have such translation functions. I am personally a fan of such comparison, but in practice, it seems to be infeasible provided the size of the datatypes and their implicit semantics (needed to see what is "not possible").

Course of Action

Generally speaking, we need to update HsSyn AST to be practically useful (rather than a presentation only suitable for GHC internal machinery) and replace TH with that. HSE provides such an AST, proven to be useful in practice. However, it may not be "large enough" to carry all the information required for the passes inside GHC's front-end. Good news is that HsSyn AST and HSE AST are already pretty similar. The key design step is to find an isomorphic (i.e., large enough) variant of HsSyn AST which is the closest to HSE AST. I have started to do so, by studying and comparing the two ASTs. The repository currently includes a minimal, yet executable, definition of expressions in HsSyn and HSE (the rest, like patterns, will follow soon), that I use for such study and comparison.

Clone this wiki locally