Latest revision as of 01:26, 29 January 2020

External

Document Object Model (DOM) Level 3 Core Specification https://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/

Internal

JAXP DOM

Overview

All JAXP DOM interfaces are part of the `org.w3c.dom` package.

For an example of how to walk a DOM tree, see:

https://github.com/NovaOrdis/playground/tree/master/java/xml/dom-reading

Document

Node

Navigating to a node means processing its sub-elements recursively, skipping the uninteresting ones and inspecting the interesting ones. A robust DOM application must do the following:

When searching for an element:
- ignore comments, attributes and processing instructions
- allow for the possibility that sub-elements do not occur in the expected order
- skip over TEXT nodes that contain ignorable white space. Warning: new lines in the file are returned as text nodes, so they have to be handled.
When extracting text for a node:
- extract text from CDATA sections as well as text nodes
- ignore comments, attributes and processing instructions when gathering text
- if an entity reference node or another element node is encountered, recurse into it.
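
The element-search rules above can be sketched in Java as follows. This is a minimal illustration, not a definitive implementation: the class name `ChildElementFinder` and the helpers `parse` and `findChild` are hypothetical names introduced here. The loop scans every child, so it does not depend on element order, and the `ELEMENT_NODE` check skips comments, processing instructions and white-space text nodes (attributes never appear in the child list).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ChildElementFinder {

    /** Hypothetical helper: parses an XML string into a DOM Document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Returns the first direct child element of parent with the given
     * tag name, or null if there is none. Every non-element child
     * (comment, processing instruction, text, CDATA) is skipped.
     */
    static Element findChild(Node parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // New lines and indentation show up as TEXT nodes between the elements.
        Document doc = parse(
                "<config>\n  <!-- note -->\n  <port>8080</port>\n  <host>example</host>\n</config>");
        Element host = findChild(doc.getDocumentElement(), "host");
        System.out.println(host.getTextContent()); // prints "example"
    }
}
```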

Node Types

A node's type is obtained by calling getNodeType(). The returned value is one of the following constants, defined on the Node interface:

ELEMENT_NODE

Node type: 1 (Node.ELEMENT_NODE)

ATTRIBUTE_NODE

Node type: 2

TEXT_NODE

Node type: 3

CDATA_SECTION_NODE

Node type: 4

ENTITY_REFERENCE_NODE

Node type: 5

ENTITY_NODE

Node type: 6

PROCESSING_INSTRUCTION_NODE

Node type: 7

COMMENT_NODE

Node type: 8

DOCUMENT_NODE

Node type: 9

DOCUMENT_TYPE_NODE

Node type: 10

DOCUMENT_FRAGMENT_NODE

Node type: 11

NOTATION_NODE

Node type: 12
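
As a rough sketch of how these constants are typically used, the following Java snippet maps the value returned by `getNodeType()` back to the constant's name. The class `NodeTypeNames` and the method `typeName` are hypothetical names chosen for this example; only the `Node` constants and the `Document` factory methods come from the standard API.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class NodeTypeNames {

    /** Returns the name of the Node constant matching node.getNodeType(). */
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:                   return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
            case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
            case Node.ENTITY_NODE:                 return "ENTITY_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case Node.COMMENT_NODE:                return "COMMENT_NODE";
            case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
            case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
            case Node.NOTATION_NODE:               return "NOTATION_NODE";
            default:                               return "UNKNOWN";
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a few nodes in memory instead of parsing a file.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Node[] samples = {
            doc,
            doc.createElement("item"),
            doc.createTextNode("text"),
            doc.createCDATASection("raw"),
            doc.createComment("note"),
        };
        for (Node n : samples) {
            System.out.println(n.getNodeType() + " = " + typeName(n));
        }
    }
}
```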

Node Name, Value and Attributes

Interface	nodeName	nodeValue	attributes
Element	`Element.tagName`	null	NamedNodeMap
Text	"#text"	same as `CharacterData.data`, the content of the text node	null
Attr	same as `Attr.name`	same as `Attr.value`	null
CDATASection	"#cdata-section"	same as `CharacterData.data`, the content of the CDATA Section	null
Comment	"#comment"	same as `CharacterData.data`, the content of the comment	null
Document	"#document"	null	null
DocumentFragment	"#document-fragment"	null	null
DocumentType	same as `DocumentType.name`	null	null
Entity	entity name	null	null
EntityReference	name of entity referenced	null	null
Notation	notation name	null	null
ProcessingInstruction	same as `ProcessingInstruction.target`	same as `ProcessingInstruction.data`	null
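
To make a few rows of the table concrete, here is a small Java sketch that prints the three properties for nodes created in memory. The class `NodeNameValueDemo` and the method `describe` are hypothetical names used only for this illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class NodeNameValueDemo {

    /** Formats nodeName, nodeValue and attributes of a node on one line. */
    static String describe(Node n) {
        String attributes = n.getAttributes() == null
                ? "null"
                : "NamedNodeMap(" + n.getAttributes().getLength() + ")";
        return n.getNodeName() + " | " + n.getNodeValue() + " | " + attributes;
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element item = doc.createElement("item");
        item.setAttribute("id", "7");

        System.out.println(describe(item));                       // Element row
        System.out.println(describe(item.getAttributeNode("id"))); // Attr row
        System.out.println(describe(doc.createTextNode("hello"))); // Text row
        System.out.println(describe(doc));                         // Document row
    }
}
```

Only the Element reports a non-null attributes map; for every other node type `getAttributes()` returns null, as the last column of the table states.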

Node's Text Content

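To get the text a node contains, iterate over its child nodes, ignoring entries that are of no concern (comments, processing instructions) and accumulating the text found in text nodes and CDATA section nodes. Entity reference nodes and nested elements carry their text in their own children, so recurse into them.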
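
A minimal Java sketch of this accumulation is shown below. The class `NodeText` and the methods `textOf` and `collect` are hypothetical names for this example. The standard `Node.getTextContent()` method (DOM Level 3) covers many of the same cases; the manual version is shown because it makes each decision explicit and easy to adapt.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class NodeText {

    /** Returns the concatenated text found below the given node. */
    static String textOf(Node node) {
        StringBuilder sb = new StringBuilder();
        collect(node, sb);
        return sb.toString();
    }

    private static void collect(Node node, StringBuilder sb) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            switch (c.getNodeType()) {
                case Node.TEXT_NODE:
                case Node.CDATA_SECTION_NODE:
                    // Both expose their content through nodeValue.
                    sb.append(c.getNodeValue());
                    break;
                case Node.ELEMENT_NODE:
                case Node.ENTITY_REFERENCE_NODE:
                    // Their text lives in their own children: recurse.
                    collect(c, sb);
                    break;
                default:
                    // Comments, processing instructions, etc. are ignored.
                    break;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element p = doc.createElement("p");
        p.appendChild(doc.createTextNode("Hello "));
        p.appendChild(doc.createComment("skipped"));
        Element b = doc.createElement("b");
        b.appendChild(doc.createTextNode("big"));
        p.appendChild(b);
        p.appendChild(doc.createCDATASection(" <world>"));
        System.out.println(textOf(p)); // prints "Hello big <world>"
    }
}
```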

Element

Element extends Node and has a node type of 1 (Node.ELEMENT_NODE).

