The Document Object Model (DOM) represents elements, attributes and text within elements as nodes in a tree structure.
When an XmlDocument object parses an XML document in string form (for example with XmlDocument.FromString()), it creates a tree of XmlNode objects. These objects are typically not of the exact type XmlNode but of a class that inherits from XmlNode, such as XmlTextNode for text nodes or XmlElement for tags. If a node k of this tree belongs to a tag (“<tag>…</tag>”, or “<tag />” for an empty tag), k is called an element and is of type XmlElement. In XML, only elements can have child nodes, so any node in the tree that has child nodes is necessarily of type XmlElement. The child nodes of the element k are exactly the XML nodes nested within “<tag>…</tag>”. This is why XmlElement has a special role and why element-specific variants of certain methods exist (such as FirstChildElement alongside FirstChild). Elements are the nodes at which the tree grows in depth; all non-element nodes are leaves of the tree.
This XML structure
<tag> <tag2>abc</tag2> <tag3 /> DEF </tag>
would formally produce the following tree, whereby the indentation represents the parent-child relationship:
    XmlElement: tag
     +-- XmlElement: tag2
          +-- XmlTextNode: abc
     +-- XmlElement: tag3
     +-- XmlTextNode: DEF
At XmlElement nodes the tree may be nested deeper, as with 'tag' and 'tag2', or not, as with the empty 'tag3'. An XmlTextNode, however, can never have child nodes.
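The tree above can be made visible with a small recursive procedure. The following is only a minimal sketch: it assumes a command-line project with the component gb.xml enabled, uses only the properties listed in this chapter (Root, Children, IsElement, IsText, Name, TextContent), and the names PrintNode, hNode and iLevel are chosen here for illustration. Whether whitespace between tags shows up as additional text nodes depends on the parser, so no output is given.

```gambas
' Gambas module file (assumes the component gb.xml is enabled)

Public Sub Main()

  Dim hDocument As New XmlDocument

  ' The example structure from the text, preceded by an XML declaration
  hDocument.FromString("<?xml version=\"1.0\" encoding=\"UTF-8\"?><tag><tag2>abc</tag2><tag3 />DEF</tag>")
  PrintNode(hDocument.Root, 0)

End

' Prints one node and then, for elements only, all of its child nodes.
Private Sub PrintNode(hNode As XmlNode, iLevel As Integer)

  Dim hChild As XmlNode
  Dim sIndent As String = String$(iLevel * 2, " ")

  If hNode.IsElement Then
    Print sIndent & "Element: " & hNode.Name
    ' Only elements can have child nodes, so only here the recursion descends.
    For Each hChild In hNode.Children
      PrintNode(hChild, iLevel + 1)
    Next
  Else If hNode.IsText Then
    Print sIndent & "Text: " & hNode.TextContent
  Else
    Print sIndent & "Other node: " & hNode.Name
  Endif

End
```

The indentation of the printed lines plays the same role as the indentation in the tree diagram: each level of recursion corresponds to one parent-child step.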
The key to working with the class XmlDocument is to picture the document as a tree of XML nodes. If you want to add an element, the root of the tree (that is, the XmlDocument object) does not help you. You must go to the node (XmlNode or XmlElement) under which the new element belongs; new nodes can only be added to the tree there. This is reflected in the fact that XmlDocument has almost no methods for modifying the tree: nearly all of them belong to XmlElement and XmlNode.
Incidentally, the relationship between XmlNode and XmlElement is the following: XmlNode is the parent class of XmlElement. An XmlNode can be, for example, a comment, a text, a CDATA section or an element in the narrower sense. An XmlElement is a tag “<name>contents</name>” and therefore, in particular, something that can have child nodes. This explains why descriptions of tree modifications usually speak of XmlElement objects.
Each XmlElement is an XmlNode, but not every node (XmlNode) is an XmlElement. XmlElement is only a type of XmlNode. Others are for example XmlAttribute or XmlText. An element is therefore part of the formal definition of a well-formed XML document, while a node is defined as part of the document object model for processing XML documents. A node is a part of the DOM tree, an element is a particular type of node.
When analyzing XML files it helps to know their structure as a tree of XML nodes, or to have information about the individual nodes. This chapter presents two projects that do just that:
Figure 27.1.1.1: Project 1: View Node Structure - displayed in a TreeView
You can create objects of the class XmlDocument:
    Dim hXmlDocument As XmlDocument
    hXmlDocument = New XmlDocument([ FileName As String ])
The following source code
    Public Sub btnXMLWriteDOM_Click()

      Dim hXMLContent As String

      hXMLDocument = New XmlDocument
      hXMLContent = hXMLDocument.ToString(True)
      txaXML.Text = hXMLContent
      File.Save(xmlFilePath, hXMLContent)

    End
creates the content of an XML document with a predefined root element <xml>..</xml>:
    <?xml version="1.0" encoding="UTF-8"?>
    <xml>
    </xml>
How to create a user-defined root element is described in example 27.1.4.
The class XmlDocument has these properties:
Property | Data type | Description |
---|---|---|
All | XmlNode[] | Returns a node array with all nodes of the DOM tree. |
Content | String | Returns the complete content of the document or sets the content. |
Root | XmlElement | Returns the root element or resets the root element. |
Table 27.1.2.1.1.1: Properties of the class XmlDocument
The class XmlDocument has these methods:
Method | Return type | Description |
---|---|---|
Open (FileName As String) | - | Opens the XML document at the path 'FileName'. |
Save (FileName As String[, Indent As Boolean]) | - | Saves the content of the XML document under the file path 'FileName'. If the optional parameter 'Indent' is set to True, the elements are indented appropriately. |
CreateElement (TagName As String) | XmlElement | Creates a new element named 'TagName' and returns it as an XmlElement. |
ToString ([ Indent As Boolean]) | String | Returns the content of the XML document as a string. If the optional parameter 'Indent' is set to True, the elements are indented. |
FromString (Data As String) | - | Fills the XML document with the content 'Data'. Existing content is overwritten. |
HtmlFromString (Data As String) | - | Fills the XML document with the content 'Data'. Existing content is overwritten. |
GetElementsByNamespace (Namespace As String[, Mode As Integer, Depth As Integer]) | XmlElement[] | Returns an array of all elements of the document whose element name matches the parameter value 'Namespace'. The 'Mode' parameter defines the comparison method; gb.Binary, gb.IgnoreCase and gb.Like are supported. The 'Depth' argument defines where the search stops: with a negative value (the default) it only stops at the end of the tree, 1 checks only the root element, 2 checks only the direct children of the root element, and so on. |
GetElementsByTagName (TagName As String[, Mode As Integer, Depth As Integer]) | XmlElement[] | Returns an array of all elements of the document whose element name matches the parameter value 'TagName'. The 'Mode' parameter defines the comparison method; gb.Binary, gb.IgnoreCase and gb.Like are supported. The 'Depth' argument defines where the search stops: with a negative value (the default) it only stops at the end of the tree, 1 checks only the root element, 2 checks only the direct children of the root element, and so on. |
Table 27.1.2.2.2.1: Methods of the class XmlDocument
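The search methods in the table can be tried out on a small in-memory document. The sketch below is illustrative only: it assumes the component gb.xml is enabled, the sample XML and variable names are made up for this example, and the results are not listed here because they depend on how the 'Mode' and 'Depth' arguments are interpreted by the installed version.

```gambas
' Gambas module file (assumes the component gb.xml is enabled)

Public Sub Main()

  Dim hDocument As New XmlDocument
  Dim aElements As XmlElement[]
  Dim hElement As XmlElement

  ' Hypothetical sample data with differently capitalised tag names
  hDocument.FromString("<?xml version=\"1.0\" encoding=\"UTF-8\"?><list><item>a</item><group><Item>b</Item></group></list>")

  ' Case-insensitive search through the whole tree (default depth)
  aElements = hDocument.GetElementsByTagName("item", gb.IgnoreCase)
  For Each hElement In aElements
    Print hElement.Name; ": "; hElement.TextContent
  Next

  ' Exact comparison, search limited by the Depth argument
  aElements = hDocument.GetElementsByTagName("item", gb.Binary, 2)
  Print "Matches with Depth = 2: "; aElements.Count

End
```

Comparing the two result sets is a quick way to check how far a given 'Depth' value actually reaches in a document.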
The class XmlNode is the base type for the entire Document Object Model (DOM). It represents a single node in the document tree. The class XmlNode has the following properties:
Property | Data type | Description |
---|---|---|
Attributes | .XmlElementAttributes | Returns a virtual collection of all attribute nodes of the current node. The data type is described by the class XmlElementAttributes, which has the properties Count, Name and Value. |
Children | XmlNode[] | Returns an array containing all child nodes. If there are no child nodes, an empty array is returned. |
ChildNodes | XmlNode[] | Synonym for the Children property. |
Element | XmlElement | If the current node is an element, the element is returned. Otherwise, NULL is returned. If you only want to know whether the current node is an element, you can check this with the property XmlNode.IsElement. |
IsCDATA | Boolean | This property returns True if the current node is a CDATA node, otherwise False. |
IsComment | Boolean | This property returns True if the current node is a comment node, otherwise False. |
IsElement | Boolean | This property returns True if the current node is an element node, otherwise False. |
IsText | Boolean | This property returns True if the current node is a text node, otherwise False. |
Name | String | Returns the name of the node. The value depends on the node type: element node: the element name; text node: #text; CDATA node: #cdata; comment node: #comment; attribute node: the name of the attribute. |
TextContent | String | Returns or sets the text content of a node. For text nodes, comments, CDATA sections and attributes, it is the value of the node. For nodes that may have children, it is the concatenation of the TextContent values of all child nodes, excluding comments, or the empty string if the node has no children. For nodes that may have children, setting this property removes all children from the node! Setting this property to a null string has no effect. |
Value | String | Synonym for the TextContent property. |
Parent | XmlElement | Returns the parent element of the current node. All nodes can have one parent. However, if a node has just been created and has not yet been added to the tree or if it has been removed from the tree, this property returns NULL. |
Previous | XmlNode | Returns the node immediately before the current node. If there is no such node, this property returns NULL. |
PreviousSibling | XmlNode | Synonym for the Previous property. |
Next | XmlNode | Returns the node immediately after the current node. If there is no such node, this property returns NULL. |
NextSibling | XmlNode | Synonym for the Next property. |
OwnerDocument | XmlDocument | Returns the document object associated with the current node. If no document object is associated with the node, this property returns NULL. In contrast to the W3C specification, nodes can exist without a document being assigned, since they can easily be created with the New instruction or by an XmlReader. However, a document is automatically assigned to the node when it is added with the AppendChild method (or one of its sibling methods) of an element to which a document object is already assigned. |
Table 27.1.3.1: Properties of the class XmlNode
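The navigation properties Parent and Next from the table are enough to walk along one level of the tree without an index. The following is a minimal sketch under the assumption that gb.xml is enabled; the sample document and the variable names are invented for the example.

```gambas
' Gambas module file (assumes the component gb.xml is enabled)

Public Sub Main()

  Dim hDocument As New XmlDocument
  Dim hNode As XmlNode

  ' Hypothetical sample data: a root element with three child elements
  hDocument.FromString("<?xml version=\"1.0\" encoding=\"UTF-8\"?><list><a>1</a><b>2</b><c>3</c></list>")

  ' Start at the first child of the root element ...
  hNode = hDocument.Root.Children[0]

  ' ... and follow the Next property until it returns NULL.
  While Not IsNull(hNode)
    Print hNode.Name; " = "; hNode.TextContent; " (parent: "; hNode.Parent.Name; ")"
    hNode = hNode.Next
  Wend

End
```

The same loop works in the opposite direction by starting at the last child and following Previous instead of Next.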
The class XmlNode has these constants:
The class XmlNode has these methods:
Method | Return type | Description |
---|---|---|
NewElement (EName As String[, EValue As String]) | - | Creates a new element node, sets the element name and (optionally) the value from the arguments 'EName' and 'EValue' and adds it to the current node. If the node is not an element, this method has no effect. |
NewAttribute (AName As String, AValue As String) | - | Creates a new attribute node, sets the attribute name and value from the arguments 'AName' and 'AValue' and adds it to the current node. If the node is not an element, this method has no effect. |
ToString ([ Indent As Boolean]) | String | Returns a string representation of the current node, as it would appear in a regular XML file. If the optional argument 'Indent' is set to True, the output is indented. By default, 'Indent' is False. |
SetUserData (Key As String, Value As Variant) | - | Associates a value with a key on this node. |
GetUserData (Key As String) | Variant | Returns the value that is associated with the key on this node. |
Table 27.1.3.1.1: Methods of the class XmlNode
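NewElement and NewAttribute do not return the node they create, so the usual pattern is to create a child and then fetch it via LastChildElement before working on it. The sketch below shows that pattern on a tiny document. It assumes gb.xml is enabled; the element and attribute names (library, book, id, title) are purely illustrative.

```gambas
' Gambas module file (assumes the component gb.xml is enabled)

Public Sub Main()

  Dim hDocument As New XmlDocument
  Dim hElement As XmlElement

  ' Start with an empty, freely chosen root element
  hDocument.FromString("<?xml version=\"1.0\" encoding=\"UTF-8\"?><library />")

  ' Add <book> below the root, then descend to the new element
  hDocument.Root.NewElement("book")
  hElement = hDocument.Root.LastChildElement

  ' Attribute and child element of <book>
  hElement.NewAttribute("id", "1")
  hElement.NewElement("title", "Example title")

  ' Serialise the tree with indentation to inspect the result
  Print hDocument.ToString(True)

End
```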
The following section contains notes on the methods SetUserData (Key As String, Value As Variant) and GetUserData (Key As String):
Comments:
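As a minimal sketch of how the two methods might be used, assuming they behave as listed in the method table: a value is stored on a node under a key and read back later. The keys "source" and "row" and the sample document are hypothetical. The serialized document is printed at the end so that you can check for yourself whether the user data affects the XML output.

```gambas
' Gambas module file (assumes the component gb.xml is enabled)

Public Sub Main()

  Dim hDocument As New XmlDocument
  Dim hRoot As XmlElement

  hDocument.FromString("<?xml version=\"1.0\" encoding=\"UTF-8\"?><contact />")
  hRoot = hDocument.Root

  ' Attach application-specific values to the node (keys are hypothetical)
  hRoot.SetUserData("source", "address book import")
  hRoot.SetUserData("row", 42)

  ' Read the values back via their keys
  Print hRoot.GetUserData("source")
  Print hRoot.GetUserData("row")

  ' Print the XML to see whether the user data appears in it
  Print hDocument.ToString()

End
```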
In the following sections you will be introduced to different projects. This involves writing, reading (parsing) and modifying the content of XML documents, mainly using the class XmlDocument. The content of the XML documents is either static or generated at runtime.
Project 1 (xmldom_write) shows you how to write a new XML document using the class XmlDocument. The data is made available in a structure (data type Struct) within the program. Among other things, this data type offers the advantage that the individual data items are addressed by name instead of by an anonymous index. Only the most important procedure is presented here:
    ' hXMLDocument, aDataSet and xmlFilePath are declared at module level.
    Private Sub WriteXMLDOM()

      Dim hXMLComment As XmlCommentNode
      Dim hXMLElement As XmlElement
      Dim hXMLContent As String
      Dim i As Integer

      hXMLDocument = New XmlDocument
      ' Instead of <xml>...</xml>, a valid but freely chosen root element is created.
      hXMLDocument.FromString("<?xml version=\"1.0\" encoding=\"utf-8\" ?><contacts />")

      hXMLComment = New XmlCommentNode("Data-Base: Data-Array aDataSet (DataSet[])")
      hXMLDocument.Root.AppendChild(hXMLComment)

      For i = 0 To aDataSet.Max
        ' Add the element <contact> to the tree and set hXMLElement to this element.
        hXMLElement = New XmlElement("contact")
        hXMLDocument.Root.AppendChild(hXMLElement)
        ' <firstname> has only one value and does not nest further.
        hXMLElement.NewElement("firstname", aDataSet[i].Firstname)
        ' <surname> has only one value and does not nest further.
        hXMLElement.NewElement("surname", aDataSet[i].Surname)
        ' The <address> element has child elements --> descend
        hXMLElement.NewElement("address")
        hXMLElement = hXMLElement.LastChildElement
        hXMLElement.NewElement("street", aDataSet[i].Street)
        hXMLElement.NewElement("residence")
        ' Two attributes are to be set --> descend.
        hXMLElement = hXMLElement.LastChildElement
        hXMLElement.NewAttribute("postcode", aDataSet[i].Postcode)
        hXMLElement.NewAttribute("location", aDataSet[i].Location)
        ' The element <residence> is written --> ascend
        hXMLElement = hXMLElement.Parent ' </residence>
        hXMLElement = hXMLElement.Parent ' </address>
        hXMLElement.NewElement("day", Day(aDataSet[i].BirthDate))
        hXMLElement.NewElement("month", Month(aDataSet[i].BirthDate))
        hXMLElement.NewElement("year", Year(aDataSet[i].BirthDate))
        hXMLElement.NewElement("kommunication")
        hXMLElement = hXMLElement.LastChildElement
        ' Element and field names must not contain spaces; the field name
        ' TFixedNetwork has to match the definition of the Struct.
        hXMLElement.NewElement("fixednetwork", aDataSet[i].TFixedNetwork)
        hXMLElement.NewElement("mobil", aDataSet[i].TMobil)
        hXMLElement.NewElement("internet")
        hXMLElement = hXMLElement.LastChildElement
        hXMLElement.NewElement("email", aDataSet[i].EMail)
        hXMLElement.NewElement("web", aDataSet[i].Web)
        ' The following 3 assignments are not necessary because hXMLElement is no
        ' longer processed. The DOM tree is also correct without them. They only
        ' show along which tags you get back to the root of the DOM tree.
        hXMLElement = hXMLElement.Parent ' </internet>
        hXMLElement = hXMLElement.Parent ' </kommunication>
        hXMLElement = hXMLElement.Parent ' </contact>
      Next
      hXMLElement = hXMLElement.Parent ' </contacts>

      hXMLContent = hXMLDocument.ToString(True) ' The complete XML document is created.
      txaXML.Text = hXMLContent                 ' The XML document content is displayed.
      File.Save(xmlFilePath, hXMLContent)       ' The XML document content is saved.

    End
While the XmlDocument is in memory, all properties and contents of the nodes are stored in the properties of the Gambas XML objects used. Only a method such as XmlDocument.ToString() leaves this tree and recursively produces a well-formed XML document of the data type String from all attributes, properties and child nodes.
Figure 27.1.4.1.1: Project 1: Note on the data basis
Figure 27.1.4.1.1.2: Project 1: Section Contents of XmlDocument
In Project 2 (xmldom_read_addresses), selected items of contact data stored in an XML file are read out, prepared and then displayed in a TextArea. The prepared data is also saved in a text file, which could, for example, serve as the basis for printing address labels:
Figure 27.1.4.2.1: Project 2: XML DOM parser
The source code is given in full and is commented:
    ' Gambas class file

    Public hXMLDocument As XmlDocument
    Public sXMLPath As String = "files/list.xml"

    Public Sub Form_Open()
      ShowXMLContent()
      txaXML.Pos = 0
      HSplit1.Layout = [5, 2]
    End

    Public Sub btnShowRecords_Click()

      Dim i As Integer
      Dim asMatrix As String[]
      Dim avRecords As Variant[]
      Dim sSpace As String = String$(4, " ")

      avRecords = GetRecords()

      ' One address block per record: salutation, name, street, postcode and town
      For i = 0 To avRecords.Max
        asMatrix = avRecords[i]
        txaList.Insert(sSpace & asMatrix[0] & gb.NewLine)
        txaList.Insert(sSpace & asMatrix[1] & " " & asMatrix[2] & gb.NewLine)
        txaList.Insert(sSpace & asMatrix[3] & gb.NewLine)
        txaList.Insert(sSpace & asMatrix[4] & " " & asMatrix[5] & gb.NewLine)
        txaList.Insert(gb.NewLine)
      Next

      File.Save(Application.Path &/ "files/addresslist.txt", txaList.Text)

    End

    Private Sub ShowXMLContent()
      hXMLDocument = New XmlDocument
      hXMLDocument.Open(sXMLPath)
      txaXML.Insert(hXMLDocument.Content)
    End

    Private Function GetRecords() As Variant[]

      Dim i As Integer
      Dim xeElements As XmlElement[]
      Dim asRecord As String[]
      Dim avRecords As New Variant[]

      txaList.Clear()

      hXMLDocument = New XmlDocument
      hXMLDocument.Open(sXMLPath)

      ' The tag and attribute names used here must match those in files/list.xml.
      ' An element name cannot contain a space, hence "firstname".
      xeElements = hXMLDocument.GetElementsByTagName("kontakt")
      For i = 0 To xeElements.Max
        asRecord = New String[]
        asRecord.Add(xeElements[i].GetChildrenByTagName("firstname")[0].Value)
        asRecord.Add(xeElements[i].GetChildrenByTagName("surname")[0].TextContent)
        asRecord.Add(xeElements[i].GetChildrenByTagName("street")[0].TextContent)
        asRecord.Add(xeElements[i].GetChildrenByTagName("residence")[0].Attributes["plz"])
        asRecord.Add(xeElements[i].GetChildrenByTagName("residence")[0].Attributes["ort"])
        ' The salutation is inserted at index 0, in front of the name.
        If xeElements[i].GetChildrenByTagName("mw")[0].Value = "w" Then
          asRecord.Add("Ms.", 0)
        Else
          asRecord.Add("Mr.", 0)
        Endif
        avRecords.Add(asRecord)
      Next

      Return avRecords

    End
Comment:
In contrast to the first two projects, Project 3 (xmldom_read_call) uses an HTTP client to dynamically import data from a special database for amateur radio call signs into an XML document of type XmlDocument and displays it in a TextArea. A parser extracts selected data from the XML document in 10 different variants and inserts it into the TextArea. The third project was developed, coded and tested by Claus Dietrich.
Project 4 (xmldom_read_write_modification) starts from a given XML file whose content is to be modified. The program then checks against an XML schema, stored in the XSD file magazine_m.xsd, whether the changed XML file is a valid document. The program xmllint is called in a Shell instruction for this check:
Figure 27.1.4.4.4.1: Project 4: XML DOM update
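The schema check described for Project 4 can be sketched as follows. This is not the project's actual source code but an illustration under stated assumptions: xmllint is installed, the file name magazine.xml is hypothetical, and both files lie in the current directory and have names without spaces. The xmllint options --noout and --schema are used; since xmllint reports the validation result on the error channel, it is redirected to the standard output.

```gambas
' Gambas module file

Public Sub Main()

  Dim sXmlFile As String = "magazine.xml"     ' hypothetical name of the modified XML file
  Dim sXsdFile As String = "magazine_m.xsd"   ' schema file named in the text
  Dim sCommand As String
  Dim sOutput As String

  ' --noout suppresses the echo of the document, --schema selects the XSD file.
  sCommand = "xmllint --noout --schema " & sXsdFile & " " & sXmlFile & " 2>&1"

  ' Run the command and collect everything it prints.
  Shell sCommand To sOutput

  Print sOutput

End
```

The collected text can then be shown in the user interface or searched for the word "validates" to decide whether the modified document is accepted.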
All of the projects mentioned above, as well as further projects, can be found as archives in the download area.