JAXP DOM Reference
Revision as of 01:04, 29 January 2020
External
- Document Object Model (DOM) Level 3 Core Specification https://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/
Internal
Overview
All JAXP DOM interfaces are part of the org.w3c.dom package.
For an example of how to walk a DOM tree, see:
Document
Node
Navigating to a node means walking its sub-elements recursively, skipping the ones that are of no interest and inspecting the ones that are. A robust DOM application must do these things:
- When searching for an element:
  - ignore comments, attributes and processing instructions
  - allow for the possibility that sub-elements do not occur in the expected order
  - skip over TEXT nodes that contain ignorable white space. Warning: new lines in the file are returned as text nodes, so they have to be handled.
- When extracting text for a node:
  - extract text from CDATA as well as text nodes
  - ignore comments, attributes and processing instructions when gathering text
  - if an entity reference node or another element node is encountered, recurse.
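The search rules above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name DomChildFinder, the helpers parse() and findChild(), and the sample XML are all made up for this example. Only the org.w3c.dom and javax.xml.parsers calls are standard API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class DomChildFinder {

    // Parses an XML string into a DOM Document.
    // Checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the first direct child element with the given tag name,
    // or null if there is none. The order of the children does not matter.
    static Element findChild(Node parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                // Skips white space text nodes, comments and processing instructions.
                continue;
            }
            if (name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The new lines and the comment become child nodes of <order>.
        String xml = "<order>\n  <!-- note -->\n  <qty>2</qty>\n  <item>book</item>\n</order>";
        Element root = parse(xml).getDocumentElement();
        Element item = findChild(root, "item");
        System.out.println(item.getNodeName()); // prints "item"
    }
}
```

Attributes never show up in this loop because they are not child nodes of an element. They are reached through getAttributes() instead.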
Node Types
A node's type is returned by getNodeType(). It is one of the following constants, defined on the Node interface:
ELEMENT_NODE
Node type: 1
ATTRIBUTE_NODE
Node type: 2
TEXT_NODE
Node type: 3
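CDATA_SECTION_NODE
Node type: 4
COMMENT_NODE
Node type: 8
DOCUMENT_FRAGMENT_NODE
Node type: 11
DOCUMENT_NODE
Node type: 9
DOCUMENT_TYPE_NODE
Node type: 10
ENTITY_NODE
Node type: 6
ENTITY_REFERENCE_NODE
Node type: 5
NOTATION_NODE
Node type: 12
PROCESSING_INSTRUCTION_NODE
Node type: 7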
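As a rough sketch of how getNodeType() might be used, the code below maps a node to the name of its type constant. The class name NodeTypeNames, the helpers parse() and describe(), and the sample XML are assumptions made for this example. The constants themselves come from org.w3c.dom.Node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class NodeTypeNames {

    // Parses an XML string; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Maps the short returned by getNodeType() to the constant's name.
    static String describe(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:                   return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
            case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
            case Node.ENTITY_NODE:                 return "ENTITY_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case Node.COMMENT_NODE:                return "COMMENT_NODE";
            case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
            case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
            case Node.NOTATION_NODE:               return "NOTATION_NODE";
            default:                               return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a x=\"1\">hi<!--c--><b/></a>");
        Element root = doc.getDocumentElement();
        System.out.println(describe(doc));
        System.out.println(describe(root));
        // Walk the direct children of the root element.
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            System.out.println(describe(n));
        }
    }
}
```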
Node Name, Value and Attributes
Interface | nodeName | nodeValue | attributes |
Element | Element.tagName | null | NamedNodeMap |
Text | "#text" | same as CharacterData.data, the content of the text node | null |
Attr | same as Attr.name | same as Attr.value | null |
CDATASection | "#cdata-section" | same as CharacterData.data, the content of the CDATA Section | null |
Comment | "#comment" | same as CharacterData.data, the content of the comment | null |
Document | "#document" | null | null |
DocumentFragment | "#document-fragment" | null | null |
DocumentType | same as DocumentType.name | null | null |
Entity | entity name | null | null |
Notation | notation name | null | null |
ProcessingInstruction | same as ProcessingInstruction.target | same as ProcessingInstruction.data | null |
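A few rows of the table can be checked with a short sketch like the one below. The class name NodeNameValueDemo, the helpers parse() and summarize(), and the sample XML are assumptions made for this example. getNodeName(), getNodeValue() and getAttributes() are the standard Node accessors.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class NodeNameValueDemo {

    // Parses an XML string; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Formats a node as "nodeName | nodeValue | attributes",
    // mirroring the columns of the table above.
    static String summarize(Node n) {
        NamedNodeMap attrs = n.getAttributes(); // non-null only for elements
        String attrText = (attrs == null) ? "null" : "NamedNodeMap(" + attrs.getLength() + ")";
        return n.getNodeName() + " | " + n.getNodeValue() + " | " + attrText;
    }

    public static void main(String[] args) {
        Document doc = parse("<a x=\"1\">hi<!--c--></a>");
        Element root = doc.getDocumentElement();
        System.out.println(summarize(doc));                       // Document row
        System.out.println(summarize(root));                      // Element row
        System.out.println(summarize(root.getAttributeNode("x"))); // Attr row
        System.out.println(summarize(root.getFirstChild()));      // Text row
        System.out.println(summarize(root.getLastChild()));       // Comment row
    }
}
```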
Node's Text Content
To get the text a node contains, walk its list of child nodes, skip the entries that carry no text (such as comments and processing instructions), and accumulate the text found in Text nodes, CDATASection nodes, and EntityReference nodes.
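One possible way to write that accumulation is sketched below. The class name NodeTextCollector, the helpers parse(), textOf() and collect(), and the sample XML are assumptions made for this example. Recursing into element children follows the rules listed in the Node section.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class NodeTextCollector {

    // Parses an XML string; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the concatenated text found below the given node.
    static String textOf(Node node) {
        StringBuilder sb = new StringBuilder();
        collect(node, sb);
        return sb.toString();
    }

    private static void collect(Node node, StringBuilder sb) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            switch (c.getNodeType()) {
                case Node.TEXT_NODE:
                case Node.CDATA_SECTION_NODE:
                    sb.append(c.getNodeValue()); // the character data itself
                    break;
                case Node.ELEMENT_NODE:
                case Node.ENTITY_REFERENCE_NODE:
                    // Entity reference nodes only appear when the parser is
                    // configured not to expand them; recurse in either case.
                    collect(c, sb);
                    break;
                default:
                    break; // comments and processing instructions are ignored
            }
        }
    }

    public static void main(String[] args) {
        Element p = parse("<p>Hello <!-- skip --><b>big</b> <![CDATA[<world>]]></p>")
                .getDocumentElement();
        System.out.println(textOf(p)); // prints "Hello big <world>"
    }
}
```

DOM Level 3 also defines Node.getTextContent(), which for an element returns a similar concatenation. The manual walk is useful when finer control over what is included is needed.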
Element
The Element interface extends Node.