JAXP SAX
Latest revision as of 02:27, 11 November 2016
External
Internal
Overview
A SAX parser implements event-driven, serial-access push parsing. It uses a streaming model.
SAX parsers are best suited to state-independent processing, where the handling of an element does not depend on elements that came before. The parser itself keeps no record of earlier events, so any state must be tracked by the application's handler. StAX, by contrast, lends itself to state-dependent processing.
SAX is a read-only API: XML documents can only be read with SAX, not written.
Start by obtaining a factory with SAXParserFactory.newInstance() and use it to create a parser. The parser extends the SAXParser abstract class. The concrete factory implementation that produces it is selected, among other lookup mechanisms, by the value of the javax.xml.parsers.SAXParserFactory system property. Then call the parser's parse() method. The parser wraps an XMLReader instance, which invokes callback methods the application must implement. The methods are defined by the ContentHandler, ErrorHandler, DTDHandler and EntityResolver interfaces.
As the parser recognizes document events, such as the start of the document or the start of an element, it invokes the corresponding methods (startDocument(), startElement(), ...) on the ContentHandler implementation.
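The sequence described above can be sketched as follows. This is a minimal illustration, not the article's own example: the class name SaxEventLog and the parse(String) helper are hypothetical, and the handler extends org.xml.sax.helpers.DefaultHandler, which supplies no-op implementations of the four callback interfaces so that only the interesting methods need to be overridden.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for illustration.
class SaxEventLog {

    /** Parses the XML text and returns the ContentHandler callbacks received, in order. */
    static List<String> parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        List<String> events = new ArrayList<>();

        // DefaultHandler implements ContentHandler, ErrorHandler, DTDHandler
        // and EntityResolver with do-nothing defaults.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                events.add("startElement:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };

        // The factory picks the concrete parser implementation.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();

        // parse() pushes events to the handler while it reads the stream.
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }
}
```

Calling SaxEventLog.parse("<a><b/></a>") returns the callback names in document order. Text content would arrive through characters(), which a parser is allowed to deliver in several chunks, so a handler that needs the full text of an element should accumulate it until endElement().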
SAX parsers have low memory requirements, as they don't construct an internal representation of the XML document.
The SAX parser provides access to the original document location information (line and column) via the Locator it passes to the ContentHandler through setDocumentLocator(). For an example, see Location in an XML document.
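A minimal sketch of capturing location information is shown below. The class name SaxLocations and the elementLines(String) helper are hypothetical. The sketch assumes the parser supplies a Locator, which the SAX specification encourages but does not strictly require, so the code guards against a missing one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for illustration.
class SaxLocations {

    /** Returns entries of the form "name@line" for each element start tag. */
    static List<String> elementLines(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        List<String> result = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private Locator locator; // stays null if the parser supplies none

            @Override
            public void setDocumentLocator(Locator locator) {
                // Called by the parser before startDocument().
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // The locator is only valid during a callback and reports
                // the position where the text that triggered the event ends.
                int line = (locator != null) ? locator.getLineNumber() : -1;
                result.add(qName + "@" + line);
            }
        };

        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }
}
```

The Locator should be read only inside a callback, because its values describe the event currently being reported and are not meaningful afterwards.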
For a working example of SAX parsing, see SAX Examples below.
Error Handling
The ErrorHandler implementation is notified of warnings, recoverable errors and fatal errors encountered during parsing. The default implementation is rudimentary, so if more nuanced behavior is necessary, the application must supply its own handler. Note that JAXP DOM and SAX parsers handle errors in a similar manner: the same exceptions are generated, so the error handling code is virtually identical.
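As a hedged sketch of a custom handler, the example below collects problems instead of relying on the default behavior. The class name SaxProblems and the collect(String) helper are hypothetical. The handler is registered on the XMLReader obtained from the parser, and a fatal error is recorded and then rethrown, since a parser is not expected to continue normal processing after one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, used only for illustration.
class SaxProblems {

    /** Parses the XML text and returns one entry per reported problem. */
    static List<String> collect(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        List<String> problems = new ArrayList<>();

        ErrorHandler collector = new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("warning at line " + e.getLineNumber() + ": " + e.getMessage());
            }

            @Override
            public void error(SAXParseException e) {
                // Recoverable errors, for example validity violations.
                problems.add("error at line " + e.getLineNumber() + ": " + e.getMessage());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXParseException {
                // Well-formedness violations: record, then stop the parse.
                problems.add("fatal at line " + e.getLineNumber() + ": " + e.getMessage());
                throw e;
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setErrorHandler(collector);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError(); nothing more to do here.
        }
        return problems;
    }
}
```

With well-formed input the returned list is empty. With input such as "<a><b></a>", where the b element is never closed, the list contains a single entry starting with "fatal". The same ErrorHandler object could be passed to a DocumentBuilder via setErrorHandler(), which is what makes the DOM and SAX error handling code nearly interchangeable.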
Difference between Pull Parsing and Push Parsing
SAX Examples
Component Packages
- javax.xml.parsers defines SAXParserFactory and exception classes.
- org.xml.sax defines the basic SAX interfaces.