---
feature: overriding
start-date: 2021-03-18
author: Yann Hamdaoui
---

# Overriding

This document is a proposal for an overriding mechanism in Nickel. It is expected to evolve while we put these ideas in practice, but shall serve as a design and implementation baseline.

Related issues: [#103](https://github.com/tweag/nickel/issues/103), [#240](https://github.com/tweag/nickel/issues/240), [#255](https://github.com/tweag/nickel/issues/255), [#279](https://github.com/tweag/nickel/issues/279).

## Context

### Definition of the problem

*Overriding* is the ability to take an existing configuration - in our case, a record - and update a specific part of it, such as a field or the field of a nested record. A naive and unergonomic way of doing it is to repeat all the original values in a new record:

```nickel
let record = {
  a = 1,
  b = "str",
  c = false,
} in
{
  a = record.a,
  b = record.b,
  c = true,
}
```

This quickly becomes unmanageable for large records, which are common in Nix and Nixpkgs.

In functional languages, this is usually better done using what's called functional update: an operation that takes the original record, one or several fields to update with their corresponding new values, and returns a new, updated record. It has the same semantics as our first snippet, but doesn't require rewriting all the unchanged fields. It can have a built-in syntax, such as OCaml's `with` (`{record with field = new_value}`), Haskell's `record {field = newValue}`, Nix's `//` operator (`record // {field = newValue}`), or Rust's `RecordDataType {field: new_value, ..record}`. There are more advanced programming techniques that make updating deeply nested records ergonomic, such as [lenses](https://www.fpcomplete.com/haskell/tutorial/lens/) in Haskell, but these rely too heavily on advanced language and typing features to be practical in Nickel.

In Nickel, we can already do a functional update using merging, although the updated field must have been marked as `default`:

```nickel
let record = {
  a = 1,
  b = "str",
  c | default = false,
} in
record & {c = true}
```

As explained in the next section though, this is not satisfying.

### Overriding recursive records

Nickel's records are different from the ones of OCaml, Haskell or Rust. They are lazy and recursive by default. They are thus better understood as [codata](https://link.springer.com/chapter/10.1007/978-3-030-17184-1_5) rather than data. Take the following example:

```nickel
let record = {
  protocol | default = `Http,
  port | default = protocol |> match {
    `Http => 80,
    `Ftp => 21,
    _ => 8181,
  },
} in
record & { protocol = `Ftp }
```

Here, our final record will be ``{protocol = `Ftp, port = 80}``. `port` is an expression depending on `protocol`, but it is not updated in the new version, while one could intuitively expect a recursive record to produce ``{protocol = `Ftp, port = 21}``. Such recursive update is a pattern actively used in Nix, and is probably a common scenario for configurations in general: if I patch a base configuration by changing an option, I usually want other values depending on this option to be updated as well.

## Comparing override mechanisms

We are going to review various overriding mechanisms of Nix and other related languages. Let us sketch some general traits of overriding mechanisms as a framework for comparison:

- **(ERG)**: Ergonomics. A mechanism is ergonomic if it avoids complex encodings for overridable records, if it doesn't require extra library functions, and so on. It would ideally work out of the box with the native records of the language.
- **(NEST)**: Nested field overriding. The ease of overriding nested fields, such as the value of `bar` in `{foo = {bar = "baz"}}`. Ideally, overriding nested fields is no harder than overriding top-level fields.
- **(COMP)**: Composability. Overriding is composable when one can seamlessly apply several overrides in a row to an initial record.
- **(EXPR)**: Expressivity. An expressive mechanism is one that allows to express and do more things: for example, the ability to drop some fields, to access the previous version of the record (`super`), and so on.

## Overriding in Nix

Nix expressions don't have a built-in way of overriding recursive records while automatically updating the dependent recursive fields either. Since the feature is actively needed, it is provided by mechanisms implemented in library code. Keeping a language lean is usually a good design guideline: provide an expressive yet simple set of features upon which others can be built as libraries. However, in the case of Nix, overriding is a fundamental pattern, and having it implemented in user code leads to some general, well-known issues:

- ~~**(ERG)**~~ Definition, update or field access is not as ergonomic as with vanilla native records.
- Several competing mechanisms have been implemented in slightly different contexts. Overriding in general is already hard for beginners, and having to learn several mechanisms, all similar but still different, is even harder.
- User-land implementations don't have first-class access to information such as source locations and metavalues. In a dynamic language, this can make good error reporting much harder.
- It is potentially harder to make user-land implementations efficient. See [this gist](https://gist.github.com/edolstra/29ce9d8ea399b703a7023073b0dbc00d) for more details.

We continue with an overview of existing mechanisms in Nix and related languages.

### Nixpkgs overriding

The basic idea is to represent a recursive record explicitly as a function from a `self` record to the final record (here, in Nix syntax):

```nix
r = rec {
  a = 1;
  b = a + 1;
}

# r is represented as:
rRepr = self: {
  a = 1;
  b = self.a + 1;
}
```

`self` is a self-reference, akin to `this` in object-oriented languages. It is computed as a [fixpoint](https://en.wikipedia.org/wiki/Fixed-point_combinator), simply realized by auto-application in Nix, thanks to laziness:

```nix
r = rRepr r
```

The explicit dependency on `self` gives the ability to provide a different definition for `self.a` in the final value. Overriding is achieved exactly by functionally updating the `a` field of the `self` parameter before passing it to the original representation:

```nix
let extension = {a = 2;};
in
# The fixpoint of resultRepr is { a = 2; b = 3; }
resultRepr = self: (rRepr (self // extension)) // extension
```

The second, outer update ensures that the final result is also set to `a = 2`, and not only the `a` appearing in `b`. Some details are left out, but this is the gist of it. See also [the Nix pill on overriding](https://nixos.org/guides/nix-pills/override-design-pattern.html) or [this article on fixpoints in Nix](http://r6.ca/blog/20140422T142911Z.html).

#### Limits

- ~~**(ERG)**~~ Uses a specific representation, rather than handling good old plain records (although the actual representation in Nixpkgs is more ergonomic than a plain function).
- ~~**(NEST)**~~ Overriding nested attribute sets is painful.
  If one does the naive thing, the whole subrecord is erased:

  ```nix
  let rRepr = self: { a = {b = self.a.c;}; };
  in
  let extension = {a = {c = 2;};};
  in
  rExt = let fixpoint = rRepr (fixpoint // extension);
    in fixpoint // extension
  # Gives {a = {c = 2;};} instead of the expected {a = {b = 2; c = 2;};}
  ```

### Nixpkgs overlays

[Overlays](https://nixos.wiki/wiki/Overlays) can be seen as a sequence of transformations from a base record, each layer having access to a `super` reference to the previous layer and a `self` reference to the final value. Take two consecutive transformations:

1. Set the `a` field to `1`.
2. Set the `b` field to `1`, and add `1` to the `a` field.

```nix
let baseRepr = self: {c = self.a + self.b;};
in
let overlay1 = self: super: {a = 1;};
in
let overlay2 = self: super: {b = 1; a = super.a + 1;};
in
let applyOverlays = self:
  let base = baseRepr self;
  in
  let first = base // overlay1 self base;
  in
  let second = first // overlay2 self first;
  in second;
in
let fixpoint = applyOverlays fixpoint;
in fixpoint
```

In practice, the `super // ..` and fixpoint parts can be factorised into dedicated helper functions.

#### Advantages

- **(EXPR)**: The explicit representation of layers by `super` and `self` gives a large amount of control to the user.
- **(COMP)**: Composition is first-class.

#### Limits

- ~~**(ERG)**~~ Users still need to manipulate representations.
- **order-dependency**: The result is order-dependent. Applying both `self: super: {a = super.a + 1}` and `self: super: {a = super.a / 2}` can give two different results depending on which one is the first layer. This also means that overrides must be grouped by layers, which is not necessarily the most logical structure.
- ~~**(NEST)**~~ Overriding nested fields is still clumsy. For example, to override `lib.firefoxVersion`:

  ```nix
  self: super: {
    lib = (super.lib or {}) // {
      firefoxVersion = ...;
    };
  }
  ```

### NixOS module system

The NixOS module system takes a different approach. There, all the basic blocks — the modules — are merged together in an unspecified order. What decides the priority of one option over another are attributes explicitly stated in the modules themselves. The declaration of options (type, priority, etc.) and their assignment must be made separately.

Compared to overlays, the explicit reference to `super` has disappeared. This makes it look like modules can only override one value with another, instead of combining different pieces of data together. This apparent shortcoming is solved by custom merge functions, which can redefine how to combine the different values of a field of a configuration. By default, when merging two list values, the module system only knows how to replace one value with the other, because there's no canonical and commutative way of merging two lists. However, the user can specify that these lists should in fact be concatenated, resulting in the possibility of defining a list of paths piecewise:

```nix
# Some module
{
  paths = mkOption {
    type = types.listOf types.path;
  };
}

# Config [..]
someModule.paths = [foo/bar];

# Other config [..]
someModule.paths = [/bar/baz];

# Resulting config: [foo/bar /bar/baz]
```

#### Advantages

- **order-independence**: as opposed to overlays, modules can be merged in any order, and the final result will be the same.

#### Limits

- ~~**(EXPR)**~~: Despite custom merge functions, this model is less powerful than overlays, which have access to an explicit reference to `super`. Modules, on the other hand, are all merged at the same level in an unspecified order.
  For example, the following has no equivalent in a merging model:

  ```nix
  let overlay = self: super: {b = self.c + 1; a = super.c + 1;};
  in
  #...
  ```

However, modules are quite capable in practice. Removing fields or being order-dependent, as in the overlay example above, is not necessarily a good idea, and this limitation of expressivity may in fact be a good thing.

## Overriding elsewhere

### CUE

CUE allows a form of late binding for recursive attributes:

```console
$ cat test.cue
fields: {
    a: int
    b: a + 1
} & {
    a: 2
}
$ cue eval test.cue
fields: {
    a: 2
    b: 3
}
```

Combined with [default values](https://cuelang.org/docs/tutorials/tour/types/defaults/), this provides an overriding mechanism:

```console
$ cat test2.cue
fields: {
    a: int | *1
    b: a + 1
} & {
    a: 2
}
$ cue eval test2.cue
fields: {
    a: 2
    b: 3
}
```

The basics of this overriding mechanism are close in spirit to what we want to achieve in this proposal: it is similar to Nickel's current `default` behavior, but with the desired late-bound merging. The expressivity is however too limited to replace e.g. the NixOS module system.

#### Advantages

- **(ERG)**: Overriding works seamlessly with native records. Note however that it requires fields to be explicitly marked as overridable (default).

#### Limits

- ~~**(EXPR)**~~: One can only replace a value with another one. It is not possible to, say, add one to the previous value.
- ~~**(COMP)**~~: Overriding a record a second time is not possible.

### Jsonnet

Jsonnet refers to its overriding mechanism as *inheritance*, implemented by the `+` operator on objects:

```jsonnet
local obj = {
  name: "Alice",
  greeting: "Hello, " + self.name,
};

[
  obj,
  obj + { name: "Bob" },
  obj + { greeting: super.greeting + "!"},
  obj + { name: "Bob", greeting: super.greeting + "!"},
]
```

This is similar to the Nix operator `//`, but it performs recursive overriding in the expected way out of the box. The extension can access the previous version in the same way as Nixpkgs overlays, using the `super` keyword.

#### Advantages

- **(ERG)**: Integrated with native records.
- **(COMP)**: Overriding is easily iterated.
- **(EXPR)**: Achieves the same level of expressivity as overlays with the special keywords `self` and `super`.

#### Limits

- **order-dependency**: As with overlays, the order of application of overrides matters.

## Taking a step back

While not exactly the same, all these overriding mechanisms are based on the same underlying principles:

1. Represent recursive records (explicitly or implicitly) as functions of `self` (and `super` in some cases).
2. Compute the combination as a **fixpoint** of several recursive records, giving back a standard record.

As hinted by Jsonnet's terminology, the semantics of recursive records strikingly resembles the semantics of objects and classes in OOP: replace records with objects, fields with methods and overriding with inheritance. This is not so surprising: there's actually a history of encoding objects in functional languages as recursive records (see the introduction of [The Recursive Record Semantics of Objects Revisited](https://hal.inria.fr/inria-00072423) for a good overview) going back to 1988 \[[Cardelli](http://lucacardelli.name/papers/inheritance.pdf)\].
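To make the analogy concrete, here is a minimal sketch in Nickel (the `greeter` record and its fields are hypothetical names, and the final result assumes the recursive overriding proposed later in this document): a recursive record behaves like a simple object, with sibling references playing the role of `this` and merging playing the role of inheritance.

```nickel
# A recursive record as a minimal "object": `greeting` refers to its sibling
# field `name`, much like a method referring to `this.name`.
let greeter = {
  name | default = "Alice",
  greeting = "Hello, " ++ name,
} in
# Merging as "inheritance": with the recursive overriding proposed in this
# document, overriding `name` also updates the dependent `greeting`.
greeter & { name = "Bob" }
# => { name = "Bob", greeting = "Hello, Bob" }
```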
The analogy is also noted in the [README](https://github.com/MuKnIO/nixpkgs/blob/devel/lib/pop.md#some-historical-context) of POP (an object system in Nix), where the author observes that the overriding mechanisms of Nix (and of Jsonnet, for that matter) form a simplified lazy object system (simplified because objects lack proper state and there is no distinction between classes and instances). Their logical conclusion is to embrace this fact and design a proper object system informed by the existing literature, rather than reinventing the wheel. Similarly, Nix [overlays](https://nixos.wiki/wiki/Overlays) can be seen as a single inheritance mechanism.

Inheritance-based overriding imposes an order on the overrides. A single level of inheritance is usually fine, but a complex hierarchy can become hard to maintain and to reason about.

The NixOS module system is designed differently. It is based on merging: the configuration is created by combining a set of unordered records following specific rules. Of course, there's still a need for ordering information somewhere, but it is rather expressed as priorities. This system has the advantage of making merge commutative (in contrast with inheritance or the `//` operator), as in CUE, and of untying data definition from precedence specification: one can define a module where each field has a different priority, if it makes sense to group them logically. With inheritance, values are required to be grouped in layers of the same priority, instead of logically.

Summing up the differences between inheritance-based mechanisms and merge-based mechanisms:

- **order-dependency**: Inheritance is order-dependent: the chain of extensions must be defined with the precise order of application in mind, and definitions must be grouped accordingly. Merging, on the other hand, is commutative, and the precedence information is encoded as priorities. Thus, overriding by merging can be defined using stand-alone pieces of data, although the behavior of priorities is not local.
- **(EXPR)**: Merged records only have access to the final computed fixpoint `self`, while objects also have access to the previous stage of extension via `super`. However, as in the NixOS module system, it is possible to address this issue using custom merge functions. This is a bit less expressive, but in a good way: it forces the merge strategy to be uniform along each field, while mechanisms like overlays or inheritance can do pretty much anything.

## Proposal overview

We propose to adopt the same kind of merge, priority, and custom merge function-based scheme as the NixOS module system. One big difference with the NixOS module system is that it would be built into the language, usable for any native record whatsoever, and well integrated with the rest of the language features. The implementation would have first-class access to locations, the AST, the memory layout, the metadata, and so on.

We first describe the ideas at a high level. Then, we review the issues and challenges of this approach, and propose concrete solutions to overcome or mitigate them. Finally, the precise operational semantics is laid out in a dedicated section.
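Before diving into the details, recall the `protocol`/`port` example from the introduction. The following is a sketch of the behavior this proposal aims for, not of current Nickel:

```nickel
let record = {
  protocol | default = `Http,
  port | default = protocol |> match {
    `Http => 80,
    `Ftp => 21,
    _ => 8181,
  },
} in
record & { protocol = `Ftp }
# Under this proposal: { protocol = `Ftp, port = 21 }
# (with the current semantics, `port` stays at 80)
```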
### Recursive records & merging

**This section defines the semantics, rather than an actual efficient implementation.**

As before, it is useful to see recursive records represented as functions - or constructors - that take a `self` parameter and return a final, non-recursive record, in the same way as the original overriding mechanism of Nixpkgs (the non-existing syntax `def := value` is used to insist on the fact that we are defining new objects):

```text
r = {
  a = 1;
  b = a + 1;
}

# Definition of the representation of r
repr(r) := fun self => {
  a = 1;
  b = self.a + 1;
}
```

Field access amounts to computing a fixpoint:

```text
repr(r).foo := let fix = repr(r) fix in fix.foo
```

Merging simply merges the underlying representations:

```text
repr(r1) & repr(r2) := fun self => r1 self & r2 self
```

### Priorities

Merge is fundamentally commutative. The problem is, not all fields should be treated the same: some are default values that ought to be overridden, some are high-priority values that ought to override. The solution is to use a priority system, encoding the precedence of each value while retaining commutativity (only the priority annotation counts, not the order in which values are merged).

Priorities have drawbacks. One is non-locality: the final result depends on the priorities of the other values your field is being merged with, potentially defined elsewhere. However, this can be mitigated by good defaults and an adapted set of priorities, such as a bottom `default`, a top `force`, and an infinite range of integer priorities in between, the normal priority (when no priority is provided) being `0`. Doing so, it's easy to just use `default` or `force` to be overridden by or to override anything, without knowing any precise integer priority. Integer priorities still give freedom with an infinite supply of levels if required.

### Custom merge functions

Custom merge functions would be specified by a `merge` metavalue attribute. In order to enforce commutativity, they would receive their arguments in an order determined by priority rather than by the order of merging, such as having the type `{lower: Dyn, higher: Dyn, priority: } -> Dyn`. If both values have the same priority, their order is not specified, and may even be randomized by the interpreter; the `priority` field indicates when this is the case, should it need special handling.

```nickel
let mergeLists : forall a. {lower: List a, higher: List a, priority: } -> List a
  = fun args => args.lower @ args.higher in
let Contract = {
  path | List Str
       | merge mergeLists
       | doc "A list of paths to search in."
} in
let block1 | #Contract = {
  path = ["/usr/local/bin"]
} in
let block2 = {
  path = ["/bin"]
} in
# { path = ["/usr/local/bin", "/bin"] }
block1 & block2
```

## Issues & challenges

The proposal raises some questions. While none seems insuperable, users will need a good mental picture of the system, and we should be careful to avoid ending up with a complicated system full of ad-hoc fixes for edge cases.

### Implementation

Recomputing the fixpoint at each field access is wasteful, because recursive records as functions satisfy the following property: as long as the record is not merged, the `self` argument is constant. Thus, we can memoize the fixpoint and only invalidate `self` on a merge. Also, fields that do not depend on a recursive variable can be hoisted out of the function, using a representation like:

```text
repr(r) := {
  a = 1;
  b = fun self => self.a + 1;
}
```

Concretely, each field of a record may either be a thunk as usual, in the case of non-recursive expressions like `a`, or a thunk together with the original expression in the recursive case. All record operations except merge operate on the thunk. A merge operation, on the other hand, restores the original expression. There are more potential optimizations, but this first step should be a reasonable trade-off between implementation complexity and performance. See [#103](https://github.com/tweag/nickel/issues/103) for more details.
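In user-facing terms, this memoization should be unobservable. Here is a minimal sketch of the intended behavior (assuming the recursive overriding semantics proposed above):

```nickel
let r = { a | default = 1, b = a + 1 } in
# As long as `r` is not merged, `self` is constant: `b` is computed once and
# its thunk can be cached.
r.b                  # => 2
# Merging invalidates the cached fixpoint: `b` is restored to its original
# expression and recomputed against the new `self`.
(r & { a = 2 }).b    # => 3
```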
### Scoping

Should a record be able to access a yet undefined field, because it is expected to be provided by a subsequent merge? There are two possible approaches:

- **dynamic scoping**: records can reference fields that are not explicitly defined locally, such as:

  ```nickel
  {a = b} & {b = 1}
  ```

  Dynamic scoping has a number of issues, and is usually considered bad practice.

- **lexical scoping**: as currently, require self-referenced fields to be defined locally. Note that thanks to contracts, one can require the existence of a field without defining it. For example, we could write the previous example as:

  ```nickel
  {a = b, b | Num} & {b = 1}
  ```

  It is also better practice in general to explicitly state the fields whose presence is assumed.

This RFC proposes to adopt **lexical scoping**. We could have an even lighter syntax, such as `{a = b, b} & {b = 1}`, for requiring the presence of a field `b`.

### Priorities

#### Levels

This RFC proposes to adopt the following ordered set of priorities:

- `default` is the bottom element
- integer priorities are in the middle
- `force` is the top element

That is, `Priorities := {default} \/ {n | n integer} \/ {force}`, with `default <= ... <= -1 <= 0 <= 1 <= ... <= force`. If not specified, the normal priority (the default priority, not to be confused with the `default` priority) is `0`. This provides an infinite supply of priorities both below (`{default} \/ {n | n < 0}`) and above (`{force} \/ {n | n > 0}`) the normal one.

Integer priorities are specified using the `priority` keyword. Defining more than one priority in the same metavalue is an error. Example:

```nickel
{
  foo | Num
      | default = 1,
  bar | Str, # equivalent to `bar | Str | priority 0`
  baz.boo.bor | priority -4 = "value",
  final | force = `CantOverrideMe,
}
```

#### Recursive priorities

As noted in [#240](https://github.com/tweag/nickel/issues/240), configurations should be easily overridable, and the approach outlined until now can end up annoyingly requiring configurations to be written with either `default` or `force` everywhere. This RFC proposes to add *recursive* (or "leafy", or "push-down") priorities, as described in [#279](https://github.com/tweag/nickel/issues/279). We define the new metavalues `default rec` and `force rec`, whose semantics are defined as:

- `eval(expr | default rec)`: case on `eval(expr)`:
  - `{field1 = value1, .., fieldn = valuen} | annots`: evaluates to `{field1 = (value1 | default rec), .., fieldn = (valuen | default rec)} | annots`
  - `v | annots` where `v` is not a record: evaluates to `v | defaulted(annots)`, with `defaulted(annots)` defined below.
- `defaulted(annots)`:
  - if `annots` contains the priority metavalue `force`, then `defaulted(annots) := annots`
  - otherwise, let `annots' | prio` be the decomposition of `annots` into a priority `prio` (possibly empty) and the other metavalues; then `defaulted(annots) := annots' | default`

That is, `default rec` recursively overwrites the priorities of all the leaves of a record with `default`, except for `force`, which is left untouched (metavalues are written liberally here, in that they can be empty). `force rec` is defined similarly, except that it overrides all priorities with `force`, even `default` ones.
The names `default rec`/`force rec` are just suggestions. Example:

```nickel
let neutralConf = {
  foo = 1,
  bar.baz = "stuff",
  bar.blorg = false,
} in

let defaulted | default rec = neutralConf in
# ^ Will evaluate to:
# {
#   foo | default = 1,
#   bar = {
#     baz | default = "stuff",
#     blorg | default = false,
#   },
# }
# This is different from `neutralConf | default`! The latter version
# would be overridden all at once, as illustrated below.

defaulted & {bar.baz = "shapoinkl"}
# ^ Gives the expected:
# {
#   foo | default = 1,
#   bar = {
#     baz = "shapoinkl",
#     blorg | default = false,
#   },
# }

# While
(neutralConf | default) & {bar.baz = "shapoinkl"}
# ^ gives only:
# {bar.baz = "shapoinkl"}
```

This way, an existing definition (arbitrarily complex: it could be the root of all of Nixpkgs) can easily (and lazily) be turned into a fully overriding or overridable configuration. A possible extension is to have a user-provided function that is mapped over priorities: `default rec`/`force rec` are just special cases of this.

### Custom merge functions 2

In the case of the NixOS module system, the whole configuration is merged in one final phase that conceptually evaluates the merge AST. In Nickel, merging can be done partially and observed by expressions. The natural semantics of custom merge functions should be to affect all subsequent merging, as for contract application. Indeed, we want to specify a contract or a custom merge function once and for all, not to repeat it at every field definition:

```nickel
let add = fun args => args.lower + args.higher in
# {a = 3}
{a | merge add = 1} & {a = 1} & {a = 1}
```

But what about `{a = 1} & {a | merge add = 1} & {a = 2}`? If a merge annotation only affects the terms merged on its right-hand side, commutativity breaks:

```nickel
let r1 = {a = 1} in
let r2 = {a = 1} in
let r3 = {a | merge add = 1} in

# {a = 2}
r1 & r2 & r3
# {a = 3}
r3 & r1 & r2
```

Possible solutions:

1. Make an n-ary merging behave differently from repeated binary merging, applying the custom merge function to all arguments. This doesn't solve the commutativity problem per se: we would also have to decide that in ordinary merging, custom functions don't affect subsequent merges anymore. This raises yet other problems, one being the need for a good distinctive syntax. More generally, it sounds confusing and ad hoc.
2. What seems the right, but not trivial, solution: symmetrize custom merging and make it "distribute backward" as well. That is (simplifying by forgetting about priorities and the `lower`/`higher` arguments):

   ```nickel
   val1 & val2 & (val3 | merge func)
   <=> (val1 | merge func) & val2 & val3
   <=> func (func val1 val2) val3
   # instead of the naive
   # func (val1 & val2) val3
   ```

This RFC proposes solution 2.
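With the symmetrized semantics, the counter-example above becomes commutative again. A sketch of the intended results:

```nickel
let add = fun args => args.lower + args.higher in
let r1 = {a = 1} in
let r2 = {a = 1} in
let r3 = {a | merge add = 1} in

# The single `merge add` annotation now applies to the whole merge chain,
# whatever the order of the operands:
r1 & r2 & r3   # => {a = 3}, i.e. add (add 1 1) 1
r3 & r1 & r2   # => {a = 3} as well
```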
Solution 2 raises some difficulties:

- What if `val1 & val2` is in a thunk that happens to be forced in between? What should `let x = val1 & val2 in x & (val3 | merge func)` evaluate to, or `let x = val1 & val2 in builtins.seq x (x & (val3 | merge func))`? In some sense, we would like a call-by-name semantics rather than a call-by-need one.
- (a) In the case of record fields, overriding has to solve the same problem of remembering an original expression alongside its evaluated version. If we allow custom merge annotations only on record fields, it is possible to recover the original merge AST (abstract syntax tree) and interpret it with the custom merge function as wanted. This may prove a bit tricky to implement in practice, for the merge expression `val1 & val2` may be obfuscated by some program transformations or evaluation artifacts, but it is doable.
- (b) A more extreme take is to generalize this to any term: a merge expression would behave like a lazy datatype `Merge(t1,t2)` with respect to evaluation. Evaluating it would amount to automatically applying `eval : Merge(t1,t2) -> t1 & t2` with a potential merge function, which wouldn't erase the original top-level thunk `Merge(t1,t2)`. This is in fact how default values are currently handled. Doing so, we don't need to restrict custom merge functions to record fields. This incurs an additional cost for merging, though (remembering all the original ASTs of merge expressions), even when one doesn't actually use custom merge functions.

We propose to directly implement the general solution (b).

## Operational semantics

This section defines the operational semantics of merging with the choices previously described. The choices made in the [custom merge functions section](#custom-merge-functions-2), in particular symmetrizing merge with respect to custom merge function annotations, require the evaluation process to:

1. Determine the abstract syntax tree (AST) of a merge expression
2. Decide the merge function to use
3. Interpret the tree with the given function

Let us define the notion of a *merge tree*. The definition determines the area of influence of a merge function annotation, answering the following questions:

```nickel
let add = fun x y => x + y in
let var = 1 & 1 in

var & (1 | merge add)    # result?
(var | merge add) & 1    # result?

((1 & 1) + (1 & 1)) & (1 | merge add)    # result?
((1 & 1) + (1 & 1) | merge add) & 1      # result?

# file: somefile.ncl
1 & 1

# file: other.ncl
(import "somefile") & (1 | merge add)    # result?
(import "somefile" | merge add) & 1      # result?
```

In the following, we will write "meta"-code (think of the code of the Nickel interpreter) in ML pseudo-code. To distinguish this meta-code from Nickel expressions, we use the quote syntax `[| exp |]` to denote a Nickel expression. For example, `hasAnnot [| f (1 + 1) |] = false` means that the value of the `hasAnnot` (meta-)function on the Nickel expression `f (1 + 1)` is `false`.

### Merge tree

A merge tree is a binary tree whose nodes are labelled with a `metadata`. Leaves are labelled with both a `metadata` and a Nickel expression `e` in WHNF (weak head normal form, that is, evaluated). In this context, a `metadata` is a (meta-)record whose field names correspond to the meta attributes `merge`, `priority`, `default`, `contract`, and so on. Fields are optional, except `priority`, which must always be defined.

```haskell
data Metadata = Metadata
  { priority  :: Priority
  , merge     :: Option Expression
  , contracts :: Option (List Contract)
  -- ...
  }

data AbsMergeTree a = Merge a a | Exp Expression

-- A merge tree pairs the tree structure with the metadata of each node
newtype MergeTree = MergeTree (AbsMergeTree MergeTree, Metadata)
```

The function `mergeTree : Expression -> MergeTree` computes the merge tree of an expression. It needs to evaluate leaves to see if they themselves contain merge expressions, such that `let x = a & b in x & c` and `a & b & c` have the same merge tree: that is, merge trees commute with evaluation:
```text
# we maintain an environment of bindings in `env`
metaData [| e | attr = val, ... |] ::=
    { attr = val, ... }                     if a priority is set
    { priority = normal, attr = val, ... }  otherwise

mergeTree e ::= (absMergeTree e, metaData e)

absMergeTree [| e1 & e2 |] = Merge(mergeTree(e1), mergeTree(e2))

# whnf = weak head normal form, result of evaluation
absMergeTree [| whnf |] @ e = Exp(e)

# Should mergeTree cross import boundaries? Probably not
absMergeTree [| import path |] @ e = Exp(e)

# All other cases: evaluate until reaching one of the cases above
absMergeTree e = absMergeTree (weakEval e)

# weakEval is defined exactly as standard evaluation, except that it stops at
# merge expressions, as if they were a lazy datatype in weak head normal form
weakEval [| e1 & e2 |] @ e = e
# all other cases are defined exactly as for eval
weakEval e = ...
```

### Semantics

Now that we have defined merge trees, we need to extract a potential merge function from them:

```text
type MergeFunction =
    {value: Dyn, priority: Priority} -> {value: Dyn, priority: Priority} -> Dyn

extractMergeFuns : MergeTree -> List MergeFunction
extractMergeFuns (Merge(t1, t2), meta) =
    annots(meta) @ extractMergeFuns t1 @ extractMergeFuns t2
extractMergeFuns (Exp(e), meta) = annots(meta)

annots(meta) ::=
    [f]  if meta.merge == Some([| f |])
    []   otherwise

mergeFunction : MergeTree -> Result MergeFunction ()
mergeFunction t =
    let funs = extractMergeFuns t in
    if lists.length funs == 0 then
        Ok(__builtinMerge)  # the standard `&` merge function
    else if lists.length funs == 1 then
        Ok(lists.head funs)
    else
        Err(())
```

And the interpretation of a merge tree by a merge function:

```text
# can be extended, as long as it forms a partially ordered set
Priority = Number | -inf | +inf | ...

interpret (Exp(e), _) f = e
interpret (Merge((ast_1, meta_1), (ast_2, meta_2)), _) f =
    f {value = interpret (ast_i, meta_i) f, priority = meta_i.priority}
      {value = interpret (ast_j, meta_j) f, priority = meta_j.priority}
    where i, j are such that meta_i.priority <= meta_j.priority
```

We can finally define the evaluation of merge:

```text
eval [| e1 & e2 |] =
    let t = mergeTree [| e1 & e2 |] in
    # having more than one custom merge function in the tree is an error
    let Ok(f) = mergeFunction t in
    interpret t f
```

Please keep in mind that `mergeTree (e1 & e2)` refers to the **original** merge tree of this expression: the implementation needs to keep it accessible even after evaluation. In practice, this means we can't update a thunk containing `e1 & e2` with the result of its evaluation. This is already the case with `default` values currently (`default (1 + 1)` isn't updated to `2`, but to `default 2`, otherwise the semantics would change). A good way to view this is that the semantics is inherently call-by-name, and that any caching mechanism (including call-by-need) is a practical, semantics-preserving optimization.
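As a sanity check, here is a sketch of how these definitions answer the first question posed earlier (simplifying merge functions to plain binary functions, and ignoring the `{value, priority}` wrapping):

```nickel
let add = fun x y => x + y in
let var = 1 & 1 in

# `mergeTree` weakly evaluates the leaf `var` and stops at the merge `1 & 1`,
# so the whole expression has the merge tree Merge(Merge(1, 1), 1), carrying
# a single `add` annotation. Interpreting that tree with `add` yields
# add (add 1 1) 1:
var & (1 | merge add)    # => 3
```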