JAXP SAX: Difference between revisions

From NovaOrdis Knowledge Base
(13 intermediate revisions by the same user not shown)

Latest revision as of 02:27, 11 November 2016

External

Internal

Overview

A SAX parser implements event-driven, serial-access push parsing. It uses a streaming model.

SAX parsers are best suited to state-independent processing, where the handling of an element does not depend on elements that came before, unlike StAX, which is better suited to state-dependent processing.

SAX is a read-only API: XML documents can be read with SAX, but not written.

Start by obtaining a parser instance from SAXParserFactory. The factory implementation, and therefore the concrete SAXParser it produces, is selected by the javax.xml.parsers.SAXParserFactory system property when that property is set. Then call the parser's parse() method. The parser wraps an XMLReader instance (org.xml.sax.XMLReader), which invokes callback methods the application must implement. The methods are defined by the ContentHandler, ErrorHandler, DTDHandler and EntityResolver interfaces.
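The sequence above (factory, parser, parse() with a handler) can be sketched as follows. This is a minimal illustration, not the knowledge base's own example: the class name SaxParseSketch and the method elementNames() are hypothetical, and DefaultHandler is used as a convenience base class that supplies empty implementations of the four callback interfaces.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used for illustration only.
class SaxParseSketch {

    // Parses the XML text and returns the element names in document order.
    static List<String> elementNames(String xml) throws Exception {
        // The factory picks the parser implementation (system property or platform default).
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();

        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // The factory is not namespace-aware by default, so qName is the name to use.
                names.add(qName);
            }
        };

        // parse() drives the callbacks; it returns once the whole document has been read.
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c/></a>"));
    }
}
```

Extending DefaultHandler keeps the sketch short; an application that needs only some of the interfaces can implement them directly instead.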

As the parser recognizes each part of the document (the document start, an opening tag, character data and so on), it invokes the corresponding method (startDocument(), startElement(), ...) on the ContentHandler implementation.
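To make the callback order concrete, the sketch below records the ContentHandler events for a small document. The class name SaxEventTrace and the event labels are assumptions made for illustration. Character data is buffered because a parser is allowed to deliver one run of text through several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used for illustration only.
class SaxEventTrace {

    // Returns the sequence of ContentHandler callbacks triggered by the document.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            // Emits the buffered text as one event, skipping whitespace-only runs.
            private void flushText() {
                String s = text.toString().trim();
                if (!s.isEmpty()) {
                    events.add("characters:" + s);
                }
                text.setLength(0);
            }

            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                flushText();
                events.add("startElement:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once for a single text run, so only buffer here.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("endElement:" + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<greeting><to>World</to></greeting>").forEach(System.out::println);
    }
}
```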

SAX parsers have low memory requirements, as they don't construct an internal representation of the XML document.

The SAX parser provides access to the original document location information (line and column), via the Locator injected into the ContentHandler. For an example, see Location in an XML document.
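A minimal sketch of reading location information, assuming the parser supplies a Locator (parsers are encouraged but not required to do so, hence the null check). The class name SaxLocatorSketch and the method startTagLines() are hypothetical. The reported position is where the event ends, so for a start tag it is the line on which the tag closes.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used for illustration only.
class SaxLocatorSketch {

    // Maps each element name to the line reported for its first start tag.
    static Map<String, Integer> startTagLines(String xml) throws Exception {
        Map<String, Integer> lines = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // Called by the parser before startDocument(), if it supports locators.
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // -1 marks "no location available" in this sketch.
                int line = (locator == null) ? -1 : locator.getLineNumber();
                lines.putIfAbsent(qName, line);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(startTagLines("<a>\n  <b/>\n</a>"));
    }
}
```

The Locator is only valid during the callbacks, so the sketch copies the line number out instead of keeping the Locator for later use.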

For a working example of SAX parsing, see SAX Examples below.

Error Handling

The ErrorHandler implementation is notified of parsing errors. The default implementation is rudimentary; if more nuanced behavior is necessary, the application must supply its own handler. Note that JAXP DOM and SAX parsers handle errors in a similar manner: the same exceptions are generated, so the error handling code is virtually identical.
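One possible shape for a custom error handler is sketched below: it records every reported problem with its line number and stops on fatal errors. The class name SaxErrorHandlerSketch, the method collectProblems() and the message format are assumptions made for illustration. DefaultHandler is used because it implements ErrorHandler and parse() registers it for error callbacks as well.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used for illustration only.
class SaxErrorHandlerSketch {

    // Parses the XML text and returns a description of every problem reported.
    static List<String> collectProblems(String xml) throws Exception {
        List<String> problems = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add(describe("WARNING", e));
            }

            @Override
            public void error(SAXParseException e) {
                // Recoverable errors (for example validation errors): record and continue.
                problems.add(describe("ERROR", e));
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                // Not well-formed: record the problem, then abort the parse.
                problems.add(describe("FATAL", e));
                throw e;
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // The parse was aborted; fatalError() has normally recorded the cause already.
        }
        return problems;
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // The closing tag for <b> is missing, so a fatal error is reported.
        collectProblems("<a><b></a>").forEach(System.out::println);
    }
}
```

Whether error() should also abort the parse is a design choice; this sketch continues so that several recoverable problems can be collected in one pass.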

Difference between Pull Parsing and Push Parsing

Difference between Pull Parsing and Push Parsing

SAX Examples

SAX Examples

Component Packages