# XML Structure and Parsing

## Specifications

The DOM:

* DOM Level 1 provided a complete model for an entire HTML or XML document, including the means to change any portion of the document.
* DOM Level 2 was published in late 2000. It introduced the `getElementById` function as well as an event model and support for XML namespaces and CSS.
* DOM Level 3, published in April 2004, added support for XPath and keyboard event handling, as well as an interface for serializing documents as XML.
* DOM Level 4 was published in 2015. It is a snapshot of the WHATWG living standard.

* [_Document Object Model (DOM) Level 1 Specification_](https://www.w3.org/TR/REC-DOM-Level-1/), Version 1.0, W3C Recommendation, 1 October 1998.
* [_Document Object Model (DOM) Level 2 Core Specification_](https://www.w3.org/TR/DOM-Level-2-Core/), Version 1.0, W3C Recommendation, 13 November 2000.
* [_Document Object Model (DOM) Level 3 Core Specification_](https://www.w3.org/TR/DOM-Level-3-Core/), Version 1.0, W3C Recommendation, 7 April 2004.
* [_XML Information Set (Second Edition)_](https://www.w3.org/TR/xml-infoset/), W3C Recommendation, 4 February 2004.
* [_Extensible Markup Language (XML) 1.0 (Fifth Edition)_](https://www.w3.org/TR/REC-xml/), W3C Recommendation, 26 November 2008.
* [_Extensible Markup Language (XML) 1.1 (Second Edition)_](https://www.w3.org/TR/xml11/), W3C Recommendation, 16 August 2006, edited in place 29 September 2006.
* [_Namespaces in XML 1.1 (Second Edition)_](https://www.w3.org/TR/xml-names11/), W3C Recommendation, 16 August 2006.
* [_xml:id Version 1.0_](https://www.w3.org/TR/xml-id), W3C Recommendation, 9 September 2005. Especially §7.1 _Conformance to xml:id_.
* [_XML Base (Second Edition)_](https://www.w3.org/TR/xmlbase/), W3C Recommendation, 28 January 2009.
* [_The "xml" Namespace_](https://www.w3.org/XML/1998/namespace), W3C, 26 October 2009.

## XML 1.1 EBNF

```ebnf
[1]   document           ::= ( prolog element Misc* ) - ( Char* RestrictedChar Char* )
[2]   Char               ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
                             /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
[2a]  RestrictedChar     ::= [#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]
[3]   S                  ::= (#x20 | #x9 | #xD | #xA)+
[4]   NameStartChar      ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF]
                           | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F]
                           | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD]
                           | [#x10000-#xEFFFF]
[4a]  NameChar           ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5]   Name               ::= NameStartChar (NameChar)*
[6]   Names              ::= Name (#x20 Name)*
[7]   Nmtoken            ::= (NameChar)+
[8]   Nmtokens           ::= Nmtoken (#x20 Nmtoken)*
[9]   EntityValue        ::= '"' ([^%&"] | PEReference | Reference)* '"'
                           | "'" ([^%&'] | PEReference | Reference)* "'"
[10]  AttValue           ::= '"' ([^<&"] | Reference)* '"'
                           | "'" ([^<&'] | Reference)* "'"
[11]  SystemLiteral      ::= ('"' [^"]* '"') | ("'" [^']* "'")
[12]  PubidLiteral       ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
[13]  PubidChar          ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
[14]  CharData           ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
[15]  Comment            ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
[16]  PI                 ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17]  PITarget           ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
[18]  CDSect             ::= CDStart CData CDEnd
[19]  CDStart            ::= '<![CDATA['
[20]  CData              ::= (Char* - (Char* ']]>' Char*))
[21]  CDEnd              ::= ']]>'
[22]  prolog             ::= XMLDecl Misc* (doctypedecl Misc*)?
[23]  XMLDecl            ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24]  VersionInfo        ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[25]  Eq                 ::= S? '=' S?
[26]  VersionNum         ::= '1.1'
[27]  Misc               ::= Comment | PI | S
[28]  doctypedecl        ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'
[28a] DeclSep            ::= PEReference | S
[28b] intSubset          ::= (markupdecl | DeclSep)*
[29]  markupdecl         ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PI | Comment
[30]  extSubset          ::= TextDecl? extSubsetDecl
[31]  extSubsetDecl      ::= ( markupdecl | conditionalSect | DeclSep)*
[32]  SDDecl             ::= #x20+ 'standalone' Eq (("'" ('yes' | 'no') "'") | ('"' ('yes' | 'no') '"'))
/* Productions 33 through 38 have been removed in XML 1.1. */
[39]  element            ::= EmptyElemTag | STag content ETag
[40]  STag               ::= '<' Name (S Attribute)* S? '>'
[41]  Attribute          ::= Name Eq AttValue
[42]  ETag               ::= '</' Name S? '>'
[43]  content            ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*
[44]  EmptyElemTag       ::= '<' Name (S Attribute)* S? '/>'
[45]  elementdecl        ::= '<!ELEMENT' S Name S contentspec S? '>'
[46]  contentspec        ::= 'EMPTY' | 'ANY' | Mixed | children
[47]  children           ::= (choice | seq) ('?' | '*' | '+')?
[48]  cp                 ::= (Name | choice | seq) ('?' | '*' | '+')?
[49]  choice             ::= '(' S? cp ( S? '|' S? cp )+ S? ')'
[50]  seq                ::= '(' S? cp ( S? ',' S? cp )* S? ')'
[51]  Mixed              ::= '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*'
                           | '(' S? '#PCDATA' S? ')'
[52]  AttlistDecl        ::= '<!ATTLIST' S Name AttDef* S? '>'
[53]  AttDef             ::= S Name S AttType S DefaultDecl
[54]  AttType            ::= StringType | TokenizedType | EnumeratedType
[55]  StringType         ::= 'CDATA'
[56]  TokenizedType      ::= 'ID' | 'IDREF' | 'IDREFS' | 'ENTITY' | 'ENTITIES' | 'NMTOKEN' | 'NMTOKENS'
[57]  EnumeratedType     ::= NotationType | Enumeration
[58]  NotationType       ::= 'NOTATION' S '(' S? Name (S? '|' S? Name)* S? ')'
[59]  Enumeration        ::= '(' S? Nmtoken (S? '|' S? Nmtoken)* S? ')'
[60]  DefaultDecl        ::= '#REQUIRED' | '#IMPLIED' | (('#FIXED' S)? AttValue)
[61]  conditionalSect    ::= includeSect | ignoreSect
[62]  includeSect        ::= '<![' S? 'INCLUDE' S? '[' extSubsetDecl ']]>'
[63]  ignoreSect         ::= '<![' S? 'IGNORE' S? '[' ignoreSectContents* ']]>'
[64]  ignoreSectContents ::= Ignore ('<![' ignoreSectContents ']]>' Ignore)*
[65]  Ignore             ::= Char* - (Char* ('<![' | ']]>') Char*)
[66]  CharRef            ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
[67]  Reference          ::= EntityRef | CharRef
[68]  EntityRef          ::= '&' Name ';'
[69]  PEReference        ::= '%' Name ';'
[70]  EntityDecl         ::= GEDecl | PEDecl
[71]  GEDecl             ::= '<!ENTITY' S Name S EntityDef S? '>'
[72]  PEDecl             ::= '<!ENTITY' S '%' S Name S PEDef S? '>'
[73]  EntityDef          ::= EntityValue | (ExternalID NDataDecl?)
[74]  PEDef              ::= EntityValue | ExternalID
[75]  ExternalID         ::= 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral
[76]  NDataDecl          ::= S 'NDATA' S Name
[77]  TextDecl           ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
[78]  extParsedEnt       ::= ( TextDecl? content ) - ( Char* RestrictedChar Char* )
[80]  EncodingDecl       ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'" )
[81]  EncName            ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
[82]  NotationDecl       ::= '<!NOTATION' S Name S (ExternalID | PublicID) S? '>'
[83]  PublicID           ::= 'PUBLIC' S PubidLiteral
```
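
The lexical productions in the grammar above translate fairly directly into code. The following is a minimal Python sketch, not a conforming parser: it covers only productions [2] `Char`, [2a] `RestrictedChar`, [4] `NameStartChar`, [4a] `NameChar`, and [5] `Name`. The function names (`is_char`, `is_restricted_char`, `is_name`) are illustrative and do not come from any specification or library.

```python
import re

# Production [4] NameStartChar, written as the body of a regex character class.
# Python's re module accepts \uXXXX and \UXXXXXXXX escapes in str patterns.
_NAME_START = (
    r"A-Za-z_:"
    r"\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    r"\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    r"\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    r"\uF900-\uFDCF\uFDF0-\uFFFD"
    r"\U00010000-\U000EFFFF"
)

# Production [4a] NameChar: NameStartChar plus "-", ".", digits, and a few ranges.
_NAME_CHAR = _NAME_START + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

# Production [5] Name ::= NameStartChar (NameChar)*
_NAME_RE = re.compile(f"[{_NAME_START}][{_NAME_CHAR}]*")


def is_char(cp: int) -> bool:
    """Production [2]: code points allowed anywhere in an XML 1.1 document."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_restricted_char(cp: int) -> bool:
    """Production [2a]: control characters excluded by productions [1] and [78]."""
    return (
        0x1 <= cp <= 0x8
        or 0xB <= cp <= 0xC
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )


def is_name(text: str) -> bool:
    """Production [5]: True if the whole string is a single Name."""
    return _NAME_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["xml:id", "_a-1.b", "1abc", "-x", "a b"]:
        print(f"{candidate!r}: {is_name(candidate)}")
```

Note that tab, line feed, carriage return, and NEL (`#x85`) are `Char` but not `RestrictedChar`, which is why [2a] skips `#x9`, `#xA`, `#xD`, and `#x85`. A namespace-aware processor would additionally reject names containing more than one colon; that rule belongs to _Namespaces in XML 1.1_ and is not checked here.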