Note: This file also exists in Markdown format in this repo, generated at an unspecified point in time with `pandoc -f org -t gfm -o nml.md nml.org`. # Introduction This document describes my current knowledge of NeuroML2 and in a way aims to replace its missing developer documentation. It is likely flawed and constantly evolving. However, this text is needed as the actual projects' documentation is sparse and sometimes confusing and, most importantly, thin on the high concepts. NeuroML2 is an effort to provide portable descriptions of neuro-scientific simulations, including networks, and morphologically detailled neurons. NeuroML2 supersedes NeuroML1 and is not a simple iteration. Therefore, NeuroML will be used synonymously with NeuroML2 and abbreviated as NML or NML2. Despite its flaws, NeuroML2 is a valiant effort and an important feature to support. # NeuroML2 and LEMS **TL;DR**: NeuroML2 is a library of definitions written in LEMS, which is written in XML. All NeuroML2 is LEMS. Here I lay out the overall look and feel of NeuroML2, links to in-depth sections follow. NeuroML2 is – in contrast to v1 – implemented in LEMS[^1], modelling language for chemical, physical, and biological processes. Furthermore, NML2 has a kind of symbiotic relationship with LEMS in the large, which will be explored in depth later. Both NML2 and LEMS are written in XML, making the human interaction slightly awkward. Schemata (XSD) exist in the relevant git repositories. These form the basis of our exploration of NML2. The fundamental relationship between LEMS and NML2 is this: every functional item in NML2 has a matching definition in LEMS. So, to interpret a NML2 file, we need to look up these definitions in the corresponding LEMS files. An example might look like this - LEMS definition ``` xml ``` - NML2 file ``` xml ``` The tag `` introduces a new type, which can be instanced by using `` or the shorthand ``, the latter being the idiomatic use, especially in NML2. For brevity, I skipped the definition of units and dimensions. Because of this model, no discussion (or implementation) can proceed without constantly switching between LEMS and NML2. Both LEMS and NML2 follow an inheritance model akin to object-oriented programming. This means - A `ComponentType` may specify its 'base class' using `extends="base"`. - Whenever a value of type `base` is required, a derived type may be substituted. - *Thankfully* we only have to deal with single inheritance Again an example - LEMS definition ``` xml ``` - NML2 ``` xml ``` There is also the concept of public and private data members. In particular access is mediated by 'exposures', which declare externally readable members. LEMS: ``` xml ``` Here we also introduced the tags `Parameter` (externally settable value) and `DerivedVariable` (dependent quantity). `Parameter` values are set using XML attributes. The internal variable `q10_fix` is only readable through the exposure `q10`, even in enclosing scopes. NML2: ``` xml ``` Again we encounter two new bits of related functionality: paths and select. The rules are as follows: If a `ComponentType` declares a `Child` of type `type` with an exposed value `name`, that value can be used via `select` by addressing the *path* `type/name`. `Child` implies a single instance of type `type` to be used. However, multiple instances can be part of a `ComponentType`, which is declared using `Children`. Addressing values from children is done slightly differently. A *single* child's value is addressed using `select="child_id/name"`. It is possible to accumulate over children using `select="children[*]/name reduce="multiply"` or `select="children[*]/name reduce="add"`. More complicated selection is possible, see later. Example, assume the definitions from above being available LEMS: ``` xml ``` NML2: ``` xml ``` Here, `prod_q10=64` and `first_q10=1`, although is general there is no way of knowing `fixed_1` being present, unless enforced otherwise. # Inheritance As alluded to above, the LEMS/NML2 model for describing relations between components is similar to the 'is-a' type inheritance model used in Python or C++. To illustrate, we use the HH model from the NML2 'stdlib found in `Channels.xml` ``` xml ``` Note the following consequences of the 'is-a' model - derived items have access to their bases' `Parameter` values. - types can have a `Requirement` on the presence of certain variable in the surrounding scope. - `HHExpRate` can be inserted instead of a `baseHHRate` **or** `baseVoltageDepRate`. **QUESTION** from the examples and discussions I have the impression that having a `Dynamics` item in both base and derived items will cause the item from base to overwritten. Is this correct and where is it documented? **ANSWER** Correct, but nowhere written. # Recap So Far As we have learned in the sections above, NML2 is *written in* LEMS, where the salient definitions can be found in the LEMS files provided within the NeuroML2 repositories. This means each XML tag in an NML document, eg ``, can be looked up in the corresponding LEMS file, in this case `Channels.xml`. In addition, both LEMS and NML2 have – minimal, ie a working NML2/LEMS document will likely validate, but not every validating document is working NML2/LEMS – XSD schemata attached.[^2] To compose a concrete component from an NML2 document we will be required to - parse the related LEMS files - build an inheritance tree (since requirements are in terms of 'base classes') - extract dynamics and other items from the LEMS descriptions - obtain parameters and similar from the NML2 XML node - recursively instantiate children of the object From there, we can interpret the structure and integrate the equations of motion or produce an optimised represe first and interpret the result or export it to another format like native code or NMODL. # A Larger Example We are almost in a position to decode our first practical example ``` xml ``` To do so, we have to find the LEMS definitions, of which will we use simplified versions. First, we take a look at the ion channel. ``` xml ``` Recall the definitions for the `rate` hierarchy above. Finally, we need to inspect the `gates` ``` xml ``` Finally, we can put this together to extract the meaning of `gateHHrates` where *a* and *b* are defined by the `forwardRate` and `reverseRate` items. Now, we can compose this into `ionChannelHH` while simplifying the equations and renaming quantities NB. while we *are* using the same symbols here, *r*_*e**x**p* signifies two independent quantities in the equations for *m* and *h*. As notational convenience, we used greek letters for parameters. Unsurprisingly, we find the HH neuron model here. # How to Interpret NML2 In the last part we saw that *intuitively* NML2/LEMS is quite straightforward to interpret. However, we will need to formalise the process to be able to deal with NML2 in general. Starting with this section, the document will delve into the technical details. Our current goal will be to write a tool able to translate at least the `ionChannelHH` model into a symbolic description. Although NML2 can be extended using LEMS, we will use the static schema provided in the NML2 repository as is. Here is a list of invariants we need to uphold - Each type must be aware of its base type, eg `gateHHrates <- gate`, required to - check and sort `Children`, eg given `` we need to collect everything deriving from `gate` into an array `gates`. - compose the derived items parameters and variables from the inherited items - we cannot shorten `A <- B <- C` to `A <- C` since `C` might be used as either - Each item must be aware of its enclosing items ``` example NaConductance - gates +- m +- forwardRate | +- reverseRate | +- h +- forwardRate +- reverseRate ``` This is needed to facilitate path-based selection. We need to situationally account for these cases - `A` declares a `` ``` xml ``` or ``` xml ``` **QUESTION** is this acceptable as well? ``` xml ``` - `A` declares a `` ``` xml ``` The name `children` now relates to a collection of things deriving from `B`, ie `\forall T \sup B: String -> T`, akin to dynamic polymorphism/existentials. **NOTE** the dichotomy between `` and `` for `Child` and `Children` is a bit annoying. The possible instantiations of `ionChannel` are `baseIonChannel` the base class `ionChannel` / `ionChannelHH` the are identical, but both present for convenience `ionChannelKS` ion channel with a `gate` based on a kinetic scheme `ionChannelVShift` ion channel with a voltage offset. Practically, we only expect to support `ionChannel` for now and voltage shifted channels being a trivial extension later. Kinetic schemata are out of scope and `baseIonChannel` is of no practical use. However, it infeasible to generate a fixed set of prebuilt implementations as `ionChannel` instances can have an arbitrary number of `gate` instances which implement most of the functionality. This invalidates previous plans for implementation following that strategy. **QUESTION** Is the `id` field always allowed? **ANSWER** Yes, `id` is implied. ## Scoping - derived variables: - local: visible - enclosing: exposure - enclosed: exposure - state variables: - local: visible - enclosing: exposure - enclosed: exposure - parameters - local: visible - enclosing: private - enclosed: private - constants: - local: visible - enclosing: private - enclosed: private ## Flattening Instances In order to produce efficient representations, we will need to collapse instances and types into a flat format, eg ``` xml ``` should be flattened into ``` xml ``` # Synapses Next, we consider synapses, which have a different hierarchy, which is reproduced in the following, again, simplified for ease of reading [^3] ``` xml ``` There's some additions to our knowledge of NML2 here: `Property` Paraphrasing the LEMS documentation: 'like `Parameter`, but may be different for each instance'. So far, we did not encounter the `Instance` concept in LEMS and as it is used here for the synapse weight, we are going to largely ignore it. [^4] `OnEvent` Contains assignments to state variables, similar to `OnStart`, but now triggered upon incoming events, ie spikes. May choose a port. Finally, we need to discuss `EventPorts`. In contrast to simulators like Arbor and Neuron NML2 can differentiate between sources of incoming events and selectively emit events to channels. In pratice, however, only three such ports seem to be used in NML2 [^5]+"' \| sort -u\`\] in Receive spikes relay Forward spikes to sub-components spike Emit spikes ### Implementation Details ports Of these we only need to support `in` and `relay`, the former is covered, the latter currently not. weight As Arbor uses this construction for NMODL input ``` example net_receive(weight) { ... } ``` we need to use our prior knowledge about the semantics of the parameter and hard-code it into our NMODL translator. vpeer Similar to `weight`, sometimes needed for `gradedSynapse`, can be modelled in Arbor's NMODL dialect. # Gap Junctions Gap junctions are similar to synapses and have the following hierarchy \ \ \ \ \ \ \ \ \ \ \ The issue here with our scheme is that neither `peer` nor `vpeer` exist in our implementation of NML2. However, Arbor's NMODL dialect exposes `v_peer` as a global property. # NMODL Export While we have hinted at NMODL and Arbor a few times so far, our implementation of NML2 has been almost completely generic. To export to NMODL though, we need to bridge a significant gap. After we have processed the instantiations into either an instance or a collapsed instance, we edit the resulting model based on special cases. At the moment, let us focus on current based models only, ie all our instances need to produce an ionic transmembrane current, dubbed `iX` in NMODL. Later, we might extend this to concentration models. For synapses and gap junctions this is defined directly. Ion channels produce only a conductance value, `g`, so we add a derived variable `iX =g (v - eX)` where `X` is replaced by the ionic species. Based on the respective sections on implementation details, we edit the syntax trees to eliminate explicit definitions of `vpeer` and similar values, replacing them with Arbor's built-in variables. We then build individual blocks of the NMODL language from the AST by inspecting dependencies of the state variables and building the appropriate chains of expressions. All intermediates are stored as `LOCAL` to minimise memory accesses. The style defined by NML2, especially the # Kinetic Schemes Kinetic schemes describe ion channels by a set of discrete states and transition probabilities between those states. Again we reproduce a simplified version of the `ionChannelKS` hierarchy. ``` xml ``` Apart from the type of `gates` this is functionally identical to `ionChannelHH`. Now we delve into the definitions of `gateKS` ``` xml ``` Again, not much is new here, apart from `KineticScheme`, a list of states is given in `states` and their transition rates in `transitions`. Next, we look into the description of these transitons ``` xml ``` Again not much of a surprise here, but we do need to figure out what `Link` means. The final component is the linked state ``` xml ``` Now, let us consider a simple example from the NML2 sources. ``` xml ``` we can interpret this as - define two populations transitioning between states open `o1` and closed `c1` - call these fractions `o1_occupancy` and `c1_occupancy` - conserve `o1_occupancy + c1_occupancy = 1` for all times - transition `c1 -> o1` with rate `ft/rate/r` - transition `o1 -> c1` with rate `rt/rate/r` - calculate `ft/rate/r` as prescribed by `HHExpLinearRate` `rt/rate/r` as prescribed by `HHExpRate` In NMODL we would like to generate a `KINETIC` block like this ``` example KINETIC scheme { : ... snip ... gates_n_states_o1_to_c1 = gates_n_transitions_ft_rr + gates_n_transitions_rt_rr gates_n_states_c1_to_o1 = gates_n_transitions_ft_rf + gates_n_transitions_rt_rf gates_m_states_o1_to_c1 = gates_m_transitions_ft_rr + gates_m_transitions_rt_rr gates_m_states_c1_to_o1 = gates_m_transitions_ft_rf + gates_m_transitions_rt_rf ~ gates_n_states_o1_occupancy <-> gates_n_states_c1_occupancy (gates_n_states_o1_to_c1, gates_n_states_c1_to_o1) ~ gates_m_states_o1_occupancy <-> gates_m_states_c1_occupancy (gates_m_states_o1_to_c1, gates_m_states_c1_to_o1) } ``` where multiple rates between the same two states were merged. Note that - `gates_n_transitions_ft_rr = 0` - `gates_n_transitions_ft_rf = 0` - `gates_m_transitions_ft_rr = 0` - `gates_m_transitions_ft_rf = 0` ## On `KineticScheme` `KineticScheme` seems to break the abstractions put in place. Looking at `nodes` and `edges` which must be of kind `children`, otherwise the semantics would not work out. Similarly, both must be located directly under the scheme in the hierarchy. However, a `select` statement would have expressed the same thing more idiomatically. # The Final Building Block So far, we have dealt exclusively with channels, but NML2 requires some more levels above this for completely specilying the dynamics Cell defines dynamics of membrane potential in terms of currents (mediated by channels) BioPhys Properties specifies capacitance, resistivities. Membrane Properties initial potential, channels. Channel Density gives the area density of a channel. In order to interact with cable models like Arbor or Neuron, we need to pull these layers into the channel descriptions – partially at least. In the end we would like to automatically compose Arbor simulations from NML2 files, but for now, we have to extract those values manually. Working our way up ``` xml ``` As noted under *NMODL Export* we compute `iX = g(E - U)` where `g` is the `ionChannel`'s conductance, see above `g = fopen * conductance`. The parameter `condDensity` added in `channelDensityCond` is identical to `conductance`. Thus, we need to retain `conductance` and set it to the value of `condDensity`. # Much Time Has Passed # Networks Networks are important for us for both actual network support and simulation inputs. We note the following about the network layout in NML2. Instances for input and synapse appear at top-level, next to cells and networks. Thus any `.nml` file is eligible to define them. Networks are defined in terms of populations and projections. ## Populations Populations can be written in two ways inside a `network` `` `n` duplicates of `X`; using `MultiInstantiate` which seems to have no particular definition. `` give children as list of `` with a 3D position, all have type `X`. For us, there seems to be no practical difference between them. In both cases `X` names a `ComponentType` deriving `baseCell`. ## Projections Projections specify pre- and post-synaptic populations and a connection type, eg a synapse. Then, a list of connections is given as tuples `(pre: location, post: location)` where `location` takes the form `(cell: id, segment: id, fraction: [0, 1])` Target cells are selected by the relative paths and an index, eg `target=../pop0[0]` addresses the population `pop0`, one tag up from here, and picks cell zero from there. If `pop0` has a single member, the index is optional. Alternatively it seems (?) we can address using `pop//Cell` ### Questions - Does `[ix]` query by id or offset? - What's the difference between `pop/id/Cell` and `/pop[0]`? - Why do connections have to traverse the tree and do not work relative to their pre/post populations? Seems counterintuitive. - How does Arbor translate segments into its internal rep? Can we directly use it like `branch`? ### Example ``` xml ``` ## Inputs - inputs are similar to (one-sided) connections - targets are written as - id, default = 0 - fraction, default = 0.5 - stimuli derive `basePointCurrent` - pulse, sine, cosine - relevant tags: `explicitInput` `inputList` ### Example ``` xml ``` [^1]: Low-entropy modelling language. Using the moniker 'low-entropy' for an XML based format must be an attempt at humor. [^2]: For example the use of `sequence` in XSD implies order, but I am pretty sure this is not upheld everywhere. Also, most often `sequence` is used together with `count=unbounded`, which is semantically incorrect in many cases [^3]: Note the comment, which we ignore is if it was fixed and fix in our own code. [^4]: Arbor adds this on the library side to connections. [^5]: Confirmed via \`rg -Io 'port="\[^"