Like a normal language implementation, we have a reader that turns text into tokens, a parser that turns tokens into AST, a typechecker that validates an AST, a macro expander that transforms an AST with user-defined forms into an AST in a core language, and an evaluator that turns ASTs into behavior. Unlike a normal language implementation, the typechecker runs before the macro expander. This means that the typechecker has to operate on user-defined AST nodes, which is possible, because (I think) one can derive a typechecker for a form from its definition. The weirdness of this language implementation starts with the AST. Well, the parser and tokenizer are weird, but they don't have to be. TODO: a demo of the steps involved in typing and expanding an actual program would probably be more informative than the "Pipeline" diagram. Language definition: [Ast] ast.rs and ast_walk.rs The core language is small, but the AST definition is much smaller. It contains only: * variable introductions and references, * binding information, * (temporary parsing information), * quotation level indicators, and * `Node`, which represents complex AST nodes. `walk` traverses these nodes and keeps the environment (actually a stack of environments, for nested quotation) set up appropriately. A "positive" walk takes an environment of `Elt`, and results in a single `Elt`. A "negative" walk takes an `Elt`, and produces an environment of `Elt`. Evaluation or typechecking expressions is a positive walk; evaluating or typechecking patterns is negative. [Core] core_forms.rs, core_type_forms.rs, core_qq_forms.rs, core_macro_forms.rs The Turing-complete "core" language is Unseemly's user interface. There are separate files for defining the types themselves ("core_type_forms.rs") and for defining syntax (un)quotation ("core_qq_forms.rs"). Pipeline: ⋮ ⋮ ⋮ ↱ `Ast`→ [TypeSynth] →`Ok(_)` → [Expand] →`Ast`→ [Evaluate] →`Value`→ [Reflect] ↩ (phase 2) ↱ `Ast`→ [TypeSynth] →`Ok(_)` → [Expand] →`Ast`→ [Evaluate] →`Value`→ [Reflect] ↩ (phase 1) `&str`→ [Parse] →`Ast`→ [TypeSynth] →`Ok(_)` → [Expand] →`Ast`→ [Evaluate] →`Value` (runtime) [Parse] grammar.rs and earley.rs The parser has to parse arbitrary grammars (`FormPat`s). Furthermore, it must be able to extend the grammar it is currently parsing (it does this by extending `SynEnv`), and generate new `Form`s. Extending the grammar may entail invoking the rest of the pipeline (Typesynth and Evaluate), the timing of which should be controlled by phasing. The parser is scannerless, because we need to change tokenization when we change grammars sometimes. [TypeSynth] ty.rs Type synthesis. It should be possible to derive type synthesis for macros! Synthesize a type for the macro body, and it'll tell you what you get. I think. If not, there's not much point to making this language. [Expand] expand.rs Recursively expand macros. Because macros are identified by their syntax (by the parser), there's no environment. [Evaluate] eval.rs Evaluation is a tree traversal, surprisingly similar to type synthesis. Maybe at some point, this will be translation into some other language (like Ocaml? Rust? LLVM?), but it may be faster to interpret the many small programs that macros are. [↱ Reflect ↩] reify.rs Macros are programs that manipulate `Ast`s (they can also access environments). We need to reflect internal `Ast`s into Unseemly values and reify them back out. Fun fact: `Ast` transitively includes almost every type in the compiler. So we use a macro (see reification_macros.rs) to generate the reification/reflection, rather than do it by hand. Important data structures: `Name` name.rs Will contain hygiene metadata, eventually `TokenTree` read.rs The result of reading; a tree with nodes at every ()[]{} `FormPat` parse.rs Grammar for forms (core and user-defined) `Form` form.rs Everything one needs to know about a language form. Contains: * `grammar : FormPat` [Parse] -- grammar (how we know the user used the form) Also contains: * how things should bind * grammar "binding" information (syntax extension) * `synth_type : WalkRule` [TypeSynth] -- typechecking rule * `eval : WalkRule` [Evaluate] -- evaluation rule * `quasiquote : WalkRule` [Evaluate] -- trivial, except for unquotation * TODO [misc] -- pretty-printing guidance `Ast` ast.rs Syntax, in its logical structure. `Beta` beta.rs How terms bind. Surprisingly important! `SynEnv` parse.rs Indicates what the grammar is at a particular point. [Map from [named grammar nodes] to `FormPat`] `EnvMBE` util/mbe.rs Ergonomic (...in a sense) representation for the contents of `Ast` nodes which have parts that can repeat `n` times. Other imporant things: alpha.rs has α-equivalence utilities, used by `ast_walk.rs` Questions: What if a macro like Rust's `try!` has a type error in its return value? What happens? What should happen?