#+TITLE: NeuroML2 From Whole Cloth
Note: This file also exists in Markdown format in this repo, generated at an
unspecified point in time with =pandoc -f org -t gfm -o nml.md nml.org=.
* Introduction
This document describes my current knowledge of NeuroML2 and in a way aims to
replace its missing developer documentation. It is likely flawed and constantly
evolving. However, this text is needed as the actual projects' documentation is
sparse and sometimes confusing and, most importantly, thin on the high concepts.
NeuroML2 is an effort to provide portable descriptions of neuro-scientific
simulations, including networks, and morphologically detailled neurons. NeuroML2
supersedes NeuroML1 and is not a simple iteration. Therefore, NeuroML will be
used synonymously with NeuroML2 and abbreviated as NML or NML2.
Despite its flaws, NeuroML2 is a valiant effort and an important feature to
support.
* NeuroML2 and LEMS
*TL;DR*: NeuroML2 is a library of definitions written in LEMS, which is written
in XML. All NeuroML2 is LEMS.
Here I lay out the overall look and feel of NeuroML2, links to in-depth sections
follow.
NeuroML2 is -- in contrast to v1 -- implemented in LEMS[fn:: Low-entropy
modelling language. Using the moniker 'low-entropy' for an XML based format must
be an attempt at humor.], modelling language for chemical, physical, and
biological processes. Furthermore, NML2 has a kind of symbiotic relationship
with LEMS in the large, which will be explored in depth later.
Both NML2 and LEMS are written in XML, making the human interaction slightly
awkward. Schemata (XSD) exist in the relevant git repositories. These form the
basis of our exploration of NML2.
The fundamental relationship between LEMS and NML2 is this: every functional
item in NML2 has a matching definition in LEMS. So, to interpret a NML2 file, we
need to look up these definitions in the corresponding LEMS files. An example
might look like this
- LEMS definition
#+begin_src xml
#+end_src
- NML2 file
#+begin_src xml
#+end_src
The tag ~~ introduces a new type, which can be
instanced by using ~~ or the shorthand ~~, the latter being the idiomatic use, especially in NML2. For brevity,
I skipped the definition of units and dimensions. Because of this model, no
discussion (or implementation) can proceed without constantly switching between
LEMS and NML2.
Both LEMS and NML2 follow an inheritance model akin to object-oriented programming.
This means
- A ~ComponentType~ may specify its 'base class' using ~extends="base"~.
- Whenever a value of type ~base~ is required, a derived type may be substituted.
- /Thankfully/ we only have to deal with single inheritance
Again an example
- LEMS definition
#+begin_src xml
#+end_src
- NML2
#+begin_src xml
#+end_src
There is also the concept of public and private data members. In particular
access is mediated by 'exposures', which declare externally readable members.
LEMS:
#+begin_src xml
#+end_src
Here we also introduced the tags ~Parameter~ (externally settable value) and
~DerivedVariable~ (dependent quantity). ~Parameter~ values are set using XML
attributes. The internal variable ~q10_fix~ is only readable through the
exposure ~q10~, even in enclosing scopes.
NML2:
#+begin_src xml
#+end_src
Again we encounter two new bits of related functionality: paths and select. The
rules are as follows:
If a ~ComponentType~ declares a ~Child~ of type ~type~ with an exposed value
~name~, that value can be used via ~select~ by addressing the /path/
~type/name~. ~Child~ implies a single instance of type ~type~ to be used.
However, multiple instances can be part of a ~ComponentType~, which is declared
using ~Children~. Addressing values from children is done slightly differently.
A /single/ child's value is addressed using ~select="child_id/name"~. It is
possible to accumulate over children using ~select="children[*]/name
reduce="multiply"~ or ~select="children[*]/name reduce="add"~. More
complicated selection is possible, see later.
Example, assume the definitions from above being available
LEMS:
#+begin_src xml
#+end_src
NML2:
#+begin_src xml
#+end_src
Here, ~prod_q10=64~ and ~first_q10=1~, although is general there is no way of
knowing ~fixed_1~ being present, unless enforced otherwise.
* Inheritance
As alluded to above, the LEMS/NML2 model for describing relations between
components is similar to the 'is-a' type inheritance model used in Python or C++.
To illustrate, we use the HH model from the NML2 'stdlib found in ~Channels.xml~
#+begin_src xml
#+end_src
Note the following consequences of the 'is-a' model
- derived items have access to their bases' ~Parameter~ values.
- types can have a ~Requirement~ on the presence of certain variable in the surrounding scope.
- ~HHExpRate~ can be inserted instead of a ~baseHHRate~ *or* ~baseVoltageDepRate~.
*QUESTION* from the examples and discussions I have the impression that having a
~Dynamics~ item in both base and derived items will cause the item from base to
overwritten. Is this correct and where is it documented?
*ANSWER* Correct, but nowhere written.
* Recap So Far
As we have learned in the sections above, NML2 is /written in/ LEMS, where the
salient definitions can be found in the LEMS files provided within the NeuroML2
repositories. This means each XML tag in an NML document, eg ~~, can
be looked up in the corresponding LEMS file, in this case ~Channels.xml~.
In addition, both LEMS and NML2 have -- minimal, ie a working NML2/LEMS document
will likely validate, but not every validating document is working NML2/LEMS --
XSD schemata attached.[fn:: For example the use of ~sequence~ in XSD implies
order, but I am pretty sure this is not upheld everywhere. Also, most often
~sequence~ is used together with ~count=unbounded~, which is semantically
incorrect in many cases]
To compose a concrete component from an NML2 document we will be required to
- parse the related LEMS files
- build an inheritance tree (since requirements are in terms of 'base classes')
- extract dynamics and other items from the LEMS descriptions
- obtain parameters and similar from the NML2 XML node
- recursively instantiate children of the object
From there, we can interpret the structure and integrate the equations of motion
or produce an optimised represe first and interpret the result or export it to
another format like native code or NMODL.
* A Larger Example
We are almost in a position to decode our first practical example
#+begin_src xml
#+end_src
To do so, we have to find the LEMS definitions, of which will we use simplified versions.
First, we take a look at the ion channel.
#+begin_src xml
#+end_src
Recall the definitions for the ~rate~ hierarchy above. Finally, we need to inspect the ~gates~
#+begin_src xml
#+end_src
Finally, we can put this together to extract the meaning of ~gateHHrates~
#+begin_latex
q(t=0) &= \frac{a}{a + b}\\
q'(t) &= a - \frac{q}{a + b}
#+end_latex
where $a$ and $b$ are defined by the ~forwardRate~ and ~reverseRate~ items. Now,
we can compose this into ~ionChannelHH~ while simplifying the equations and
renaming quantities
#+begin_latex
m(t=0) &= \frac{r_{lin}}{r_{lin} + r_{exp}}\\
m'(t) &= r_{lin} - \frac{m}{r_{lin} + r_{exp}}\\
h(t=0) &= \frac{r_{exp}}{r_{exp} + r_{sig}}\\
h(t) &= r_{exp} - \frac{h}{r_{exp} + r_{sig}}\\
g &= h m^3 \gamma
#+end_latex
NB. while we /are/ using the same symbols here, $r_{exp}$ signifies two
independent quantities in the equations for $m$ and $h$.
#+begin_latex
r_{exp} &= \rho_{exp}\exp((U - \mu_{exp})/\sigma_{exp})\\\\
r_{sig} &= \frac{\rho_{sig}}{1 + \exp((\mu_{sig} - U)/\sigma_{sig})}\\
x_{lin} &= \frac{\mu_{lin} - U}{\sigma_{lin}} \\
r_{lin} &= \frac{\rho_{sig} x}{1 - \exp((- x)}}
#+end_latex
As notational convenience, we used greek letters for parameters. Unsurprisingly,
we find the HH neuron model here.
* How to Interpret NML2
In the last part we saw that /intuitively/ NML2/LEMS is quite straightforward to
interpret. However, we will need to formalise the process to be able to deal with
NML2 in general. Starting with this section, the document will delve into the
technical details. Our current goal will be to write a tool able to translate at
least the ~ionChannelHH~ model into a symbolic description. Although NML2 can be
extended using LEMS, we will use the static schema provided in the NML2 repository
as is.
Here is a list of invariants we need to uphold
- Each type must be aware of its base type, eg ~gateHHrates <- gate~, required
to
- check and sort ~Children~, eg given ~~
we need to collect everything deriving from ~gate~ into an array ~gates~.
- compose the derived items parameters and variables from the inherited items
- we cannot shorten ~A <- B <- C~ to ~A <- C~ since ~C~ might be used as either
- Each item must be aware of its enclosing items
#+begin_example
NaConductance - gates +- m +- forwardRate
| +- reverseRate
|
+- h +- forwardRate
+- reverseRate
#+end_example
This is needed to facilitate path-based selection.
We need to situationally account for these cases
- ~A~ declares a ~~
#+begin_src xml
#+end_src
or
#+begin_src xml
#+end_src
*QUESTION* is this acceptable as well?
#+begin_src xml
#+end_src
- ~A~ declares a ~~
#+begin_src xml
#+end_src
The name ~children~ now relates to a collection of things deriving from ~B~, ie
~\forall T \sup B: String -> T~, akin to dynamic polymorphism/existentials.
*NOTE* the dichotomy between ~~ and ~~ for ~Child~ and ~Children~
is a bit annoying.
The possible instantiations of ~ionChannel~ are
- ~baseIonChannel~ :: the base class
- ~ionChannel~ / ~ionChannelHH~ :: the are identical, but both present for convenience
- ~ionChannelKS~ :: ion channel with a ~gate~ based on a kinetic scheme
- ~ionChannelVShift~ :: ion channel with a voltage offset.
Practically, we only expect to support ~ionChannel~ for now and voltage
shifted channels being a trivial extension later. Kinetic schemata are out of
scope and ~baseIonChannel~ is of no practical use.
However, it infeasible to generate a fixed set of prebuilt implementations as
~ionChannel~ instances can have an arbitrary number of ~gate~ instances which
implement most of the functionality. This invalidates previous plans for
implementation following that strategy.
*QUESTION* Is the ~id~ field always allowed? *ANSWER* Yes, ~id~ is implied.
** Scoping
- derived variables:
- local: visible
- enclosing: exposure
- enclosed: exposure
- state variables:
- local: visible
- enclosing: exposure
- enclosed: exposure
- parameters
- local: visible
- enclosing: private
- enclosed: private
- constants:
- local: visible
- enclosing: private
- enclosed: private
** Flattening Instances
In order to produce efficient representations, we will need to collapse
instances and types into a flat format, eg
#+begin_src xml
#+end_src
should be flattened into
#+begin_src xml
#+end_src
* Synapses
Next, we consider synapses, which have a different hierarchy, which is
reproduced in the following, again, simplified for ease of reading [fn:: Note
the comment, which we ignore is if it was fixed and fix in our own code.]
#+begin_src xml
#+end_src
There's some additions to our knowledge of NML2 here:
- ~Property~ :: Paraphrasing the LEMS documentation: 'like ~Parameter~, but may
be different for each instance'. So far, we did not encounter the ~Instance~
concept in LEMS and as it is used here for the synapse weight, we are going to
largely ignore it. [fn:: Arbor adds this on the library side to connections.]
- ~OnEvent~ :: Contains assignments to state variables, similar to ~OnStart~,
but now triggered upon incoming events, ie spikes. May choose a port.
Finally, we need to discuss ~EventPorts~. In contrast to simulators like Arbor
and Neuron NML2 can differentiate between sources of incoming events and
selectively emit events to channels. In pratice, however, only three such ports
seem to be used in NML2 [fn:: Confirmed via `rg -Io 'port="[^"]+"' | sort -u`]
- in :: Receive spikes
- relay :: Forward spikes to sub-components
- spike :: Emit spikes
*** Implementation Details
- ports :: Of these we only need to support ~in~ and ~relay~, the former is
covered, the latter currently not.
- weight :: As Arbor uses this construction for NMODL input
#+begin_example
net_receive(weight) { ... }
#+end_example
we need to use our prior knowledge about the semantics of the parameter and
hard-code it into our NMODL translator.
- vpeer :: Similar to ~weight~, sometimes needed for ~gradedSynapse~, can be
modelled in Arbor's NMODL dialect.
* Gap Junctions
Gap junctions are similar to synapses and have the following hierarchy
#+begin_example xml
#+end_example
The issue here with our scheme is that neither ~peer~ nor ~vpeer~ exist in our
implementation of NML2. However, Arbor's NMODL dialect exposes ~v_peer~ as a
global property.
* NMODL Export
While we have hinted at NMODL and Arbor a few times so far, our implementation
of NML2 has been almost completely generic. To export to NMODL though, we need
to bridge a significant gap. After we have processed the instantiations into
either an instance or a collapsed instance, we edit the resulting model based on
special cases.
At the moment, let us focus on current based models only, ie all our instances
need to produce an ionic transmembrane current, dubbed ~iX~ in NMODL. Later, we
might extend this to concentration models. For synapses and gap junctions this
is defined directly. Ion channels produce only a conductance value, ~g~, so we
add a derived variable ~iX =g (v - eX)~ where ~X~ is replaced by the ionic
species.
Based on the respective sections on implementation details, we edit the syntax
trees to eliminate explicit definitions of ~vpeer~ and similar values, replacing
them with Arbor's built-in variables.
We then build individual blocks of the NMODL language from the AST by inspecting
dependencies of the state variables and building the appropriate chains of
expressions. All intermediates are stored as ~LOCAL~ to minimise memory
accesses.
The style defined by NML2, especially the
* Kinetic Schemes
Kinetic schemes describe ion channels by a set of discrete states and transition
probabilities between those states. Again we reproduce a simplified version of the
~ionChannelKS~ hierarchy.
#+begin_src xml
#+end_src
Apart from the type of ~gates~ this is functionally identical to ~ionChannelHH~.
Now we delve into the definitions of ~gateKS~
#+begin_src xml
#+end_src
Again, not much is new here, apart from ~KineticScheme~, a list of states is
given in ~states~ and their transition rates in ~transitions~.
Next, we look into the description of these transitons
#+begin_src xml
#+end_src
Again not much of a surprise here, but we do need to figure out what ~Link~
means. The final component is the linked state
#+begin_src xml
#+end_src
Now, let us consider a simple example from the NML2 sources.
#+begin_src xml
#+end_src
we can interpret this as
- define two populations transitioning between states open ~o1~ and closed ~c1~
- call these fractions ~o1_occupancy~ and ~c1_occupancy~
- conserve ~o1_occupancy + c1_occupancy = 1~ for all times
- transition ~c1 -> o1~ with rate ~ft/rate/r~
- transition ~o1 -> c1~ with rate ~rt/rate/r~
- calculate
- ~ft/rate/r~ :: as prescribed by ~HHExpLinearRate~
- ~rt/rate/r~ :: as prescribed by ~HHExpRate~
In NMODL we would like to generate a ~KINETIC~ block like this
#+begin_example
KINETIC scheme {
: ... snip ...
gates_n_states_o1_to_c1 = gates_n_transitions_ft_rr + gates_n_transitions_rt_rr
gates_n_states_c1_to_o1 = gates_n_transitions_ft_rf + gates_n_transitions_rt_rf
gates_m_states_o1_to_c1 = gates_m_transitions_ft_rr + gates_m_transitions_rt_rr
gates_m_states_c1_to_o1 = gates_m_transitions_ft_rf + gates_m_transitions_rt_rf
~ gates_n_states_o1_occupancy <-> gates_n_states_c1_occupancy (gates_n_states_o1_to_c1, gates_n_states_c1_to_o1)
~ gates_m_states_o1_occupancy <-> gates_m_states_c1_occupancy (gates_m_states_o1_to_c1, gates_m_states_c1_to_o1)
}
#+end_example
where multiple rates between the same two states were merged. Note that
- ~gates_n_transitions_ft_rr = 0~
- ~gates_n_transitions_ft_rf = 0~
- ~gates_m_transitions_ft_rr = 0~
- ~gates_m_transitions_ft_rf = 0~
** Question ~KineticScheme~
~KineticScheme~ seems to break the abstractions put in place. Looking at ~nodes~
and ~edges~ which _must_ be of kind ~children~, otherwise the semantics would
not work out. Similarly, both must be located directly under the scheme in the
hierarchy. However, a ~select~ statement would have expressed the same thing
more idiomatically?!
#+begin_src xml
#+end_src
* The Final Building Block
So far, we have dealt exclusively with channels, but NML2 requires some more levels
above this for completely specilying the dynamics
- Cell :: defines dynamics of membrane potential in terms of currents (mediated by channels)
- BioPhys Properties :: specifies capacitance, resistivities.
- Membrane Properties :: initial potential, channels.
- Channel Density :: gives the area density of a channel.
In order to interact with cable models like Arbor or Neuron, we need to pull
these layers into the channel descriptions -- partially at least. In the end we
would like to automatically compose Arbor simulations from NML2 files, but for
now, we have to extract those values manually.
Working our way up
#+begin_src xml
#+end_src
As noted under [[NMODL Export]] we compute ~iX = g(E - U)~ where ~g~ is the
~ionChannel~'s conductance, see above ~g = fopen * conductance~. The parameter
~condDensity~ added in ~channelDensityCond~ is identical to ~conductance~.
Thus, we need to retain ~conductance~ and set it to the value of ~condDensity~.
* Networks
Networks are important for us for both actual network support and simulation
inputs. We note the following about the network layout in NML2.
Instances for input and synapse appear at top-level, next to cells and networks.
Thus any =.nml= file is eligible to define them.
Networks are defined in terms of _populations_ and _projections_, see below.
*NOTE* Connections in NML2 use zero delay by default. This will break Arbor's
performance.
** Populations
Populations can be written in two ways inside a ~network~
- ~~ :: ~n~ duplicates of ~X~; using
~MultiInstantiate~ which seems to have no particular definition.
- ~~ :: give children as list of ~~ with
a 3D position, all have type ~X~.
For us, there seems to be no practical difference between them. In both cases
~X~ names a ~ComponentType~ deriving ~baseCell~.
** Projections
Projections specify pre- and post-synaptic populations and a connection type, eg
a synapse. Then, a list of connections is given as tuples =(pre: location, post:
location)= where =location= takes the form =(cell: id, segment: id, fraction:
[0, 1])=
Target cells are _selected_ by the relative paths and an index, eg
~target=../pop0[0]~ addresses the population ~pop0~, one tag up from here, and
picks cell zero from there. If ~pop0~ has a single member, the index is
optional. Alternatively it seems (?) we can address using ~pop//Cell~.
*** Questions
- Does ~[ix]~ query by id or offset?
It queries by offset into a flat array, like in ~population~.
- What's the difference between ~pop//Cell~ and ~/pop[]~?
- ~pop//Cell~ :: for ~populationList~ only. NB. in ~projection/connection~
entries the path is relative to ~projection~ *not* ~connection~.
- ~/pop[]~ :: see above
- Why do connections have to traverse the tree and do not work relative to
their pre/post populations? Seems counterintuitive.
It is, but ~jLEMS~ just is that way.
- How does Arbor translate segments into its internal rep? Can we directly use
it like ~branch~?
We have ~(segment )~ and can use ~(on-components (segment )
)~ to target locations.
*** Example
#+begin_src xml
#+end_src
** Inputs
- inputs are similar to (one-sided) connections
- targets are written as
- id, default = 0
- fraction, default = 0.5
- current stimuli derive =basePointCurrent=
- pulse, sine, cosine
- these we need to rewrite into ~i_clamp~
- relevant tags:
- =explicitInput= ::
- =inputList= ::
- /spike/ inputs on the other hand, become ~event_generator~
- these derive from ~baseSpikeSource~
*** Example
#+begin_src xml
#+end_src
** Connectivity: From NML2 to Arbor
In Arbor, we need these items to produce a spike-based connection
- Source :: a spike detector bound to a locset, eg
~d.place('(root)', A.spike_detector(-10), 'detector'))~
- Target :: a synapse bound to a locset, eg
~d.place('(terminal)', A.synapse('expsyn'), 'synapse')~
- Connection :: Returned by the ~connections_on(here)~ callback, eg
~[A.connection((there, 'detector'), 'synapse', weight, delay)]~
* Concentration Models
First, duplicate the ~ComponentType~ for reference with the usual simplifications
#+begin_src xml
#+end_src
We remove the hard-coded ~iCa~ requirement and replace it ad-hoc with ~iX~ where
~X~ is given in ~species~. Currently, concentration models *do not work in
Arbor*, since they would require access to ~area~, which is not exposed. We work
around this by approximating ~A = 4 pi r^2~ which assumes a given CV is roughly
spherical or a cylinder (we count only ever the mantle, not the contact faces to
adjacent CVs) that has the same diameter and height. The latter works out to be
the same formula as a sphere of the same radius.