= Markup example

In this example, we want to take Litua input and write it as (somewhat valid) HTML5 output.

* We assume the Litua input document uses calls which map to valid HTML5 tag names
* We assume the Litua input document structure represents a valid HTML5 DOM

== Equivalence of Litua and HTML5

So, consider the following document as output:

[source,html]
----
Beauty & the beast
----

The function ``escape_text_for_xml`` returns its argument with escape sequences inserted.

[source,lua]
----
local function is_valid_xml_element_name_or_attribute(name)
  -- stub: should return true if name is a valid XML name, false otherwise
  return true or false
end

local function escape_text_for_xml(str)
  -- "&" must be replaced first; the outer parentheses drop the
  -- replacement count which gsub returns as its second value
  return (str:gsub("&", "&amp;"):gsub("<", "&lt;"):gsub(">", "&gt;"):gsub("'", "&apos;"):gsub('"', "&quot;"))
end
----

It is interesting to discover that the ``gsub`` order in ``escape_text_for_xml`` is significant: ``&`` must be replaced first, because every later replacement introduces a new ``&`` which would otherwise be escaped a second time.

== An implementation idea

One implementation idea is to use the ``convert-node-to-string`` hook. For which calls shall this hook be applied? All calls. There is the special filter ``""`` (empty string) which results in calling the hook for all calls. Thus for this filter, we provide a function which represents a node ``p`` with attribute ``style`` as ``text-align:center`` and text node ``paragraph`` as ``<p style="text-align:center">paragraph</p>
``. An implementation could then look as follows:

[source,lua]
----
Litua.convert_node_to_string("", function (node)
  -- attach element name
  local out = "<" .. node.call

  -- attach attributes
  local attributes = ""
  for attr, values in pairs(node.args) do
    local value = ""
    for i=1,#values do
      value = value .. tostring(values[i])
    end
    -- NOTE: skip attributes like "=whitespace" which are provided
    --       as special attributes by the lexer
    if attr:find("^=") == nil then
      attributes = attributes .. " " .. attr .. '="' .. escape_text_for_xml(value) .. '"'
    end
  end

  if #node.content == 0 then
    -- empty element
    return out .. attributes .. " /" .. ">", nil
  else
    out = out .. attributes .. ">"
  end

  -- attach content
  for _, content in ipairs(node.content) do
    out = out .. escape_text_for_xml(tostring(content))
  end

  -- attach closing xml element
  return out .. "</" .. node.call .. ">", nil
end)
----

== A problem

But you will soon recognize a problem once you run the example with a nested structure. For example …

----
{main {p Hello World}}
----

… will be represented by …

----
<main>&lt;p&gt;Hello World&lt;/p&gt;</main>
----

The hook for call ``p`` runs first and returns ``<p>Hello World</p>
``. But now this result will be supplied as text to the hook for call ``main``. In this second call, it will be escaped.

== Solving the problem by substitution

We need to prevent this second escaping to avoid these errors. For my implementation, I chose a dirty but simple mechanism: we replace the symbols ``<``, ``>``, and ``"`` occurring in XML notation with characters which do not usually occur in text. Specifically, we define:

[source,lua]
----
local SUB_ELEMENT_START = "\x02" -- substitutes <
local SUB_ELEMENT_END = "\x03"   -- substitutes >
local SUB_ATTR_START = "\x0F"    -- substitutes "
local SUB_ATTR_END = "\x0E"      -- substitutes "
----

Now we write the XML notation with those substitution characters. In subsequent hook calls, they will not be escaped because the XML characters to escape (``<>&"'``) do not include them. When the hook is invoked for the top-level element ``document``, we can replace any of those characters with their XML counterpart. The result is the implementation given.

A more beautiful approach would be to introduce a custom type ``XMLElement`` with a metatable overriding ``tostring``. We provide a hook for ``modify-node`` which replaces the node with an ``XMLElement`` value which escapes string children and turns node children into ``XMLElement`` values as well. This should work, but I have not yet spent the time to do it the right way.

== Final output file

[source,html]
----