|
 |  |  |  | How do I create a DOM parser? |  |  |  |  |
| |
 |  |  |  | import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import java.io.IOException;
...
String xmlFile = "file:///xerces-1_0_3/data/personal.xml";
DOMParser parser = new DOMParser();
try {
    parser.parse(xmlFile);
} catch (SAXException se) {
    se.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
Document document = parser.getDocument(); |  |  |  |  |
|
 |  |  |  | How do I create a SAX parser? |  |  |  |  |
| |
 |  |  |  | import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.Parser;
import org.xml.sax.ParserFactory;
import org.xml.sax.SAXException;
import java.io.IOException;
...
String xmlFile = "file:///xerces-1_0_3/data/personal.xml";
String parserClass = "org.apache.xerces.parsers.SAXParser";
Parser parser = ParserFactory.makeParser(parserClass);
try {
    parser.parse(xmlFile);
} catch (SAXException se) {
    se.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
} |  |  |  |  |
|
| | When you create a parser instance, the default error handler does nothing,
so your program fails silently when the parser encounters an error.
You should register an error handler with the parser by supplying an instance of a class
that implements the org.xml.sax.ErrorHandler
interface. This applies whether your parser is a
DOM-based or a SAX-based parser.
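As a minimal sketch of such a handler, the class below implements org.xml.sax.ErrorHandler so that warnings and recoverable errors are printed and counted, while fatal errors are printed and rethrown to stop the parse. The class name ReportingErrorHandler, its counters, and the message format are assumptions made for illustration; they are not part of Xerces.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical handler: the name, counters, and message format are assumptions.
class ReportingErrorHandler implements ErrorHandler {
    private int warningCount;
    private int errorCount;

    // Warnings are reported but never interrupt parsing.
    public void warning(SAXParseException e) {
        warningCount++;
        System.err.println(describe("Warning", e));
    }

    // Recoverable errors are reported and counted; parsing continues.
    public void error(SAXParseException e) {
        errorCount++;
        System.err.println(describe("Error", e));
    }

    // Fatal errors are reported and rethrown so the caller's catch block sees them.
    public void fatalError(SAXParseException e) throws SAXException {
        System.err.println(describe("Fatal error", e));
        throw e;
    }

    int getWarningCount() { return warningCount; }

    int getErrorCount() { return errorCount; }

    // Builds "<level> at <systemId>:<line>:<column>: <message>".
    static String describe(String level, SAXParseException e) {
        return level + " at " + e.getSystemId() + ":" + e.getLineNumber()
                + ":" + e.getColumnNumber() + ": " + e.getMessage();
    }
}
```

The SAX Parser interface declares setErrorHandler, so registration would look like parser.setErrorHandler(new ReportingErrorHandler()); before the call to parse. The sketch assumes the DOM parser is configured the same way.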
|
 |  |  |  | How do I access the DOM Level 2 functionality? |  |  |  |  |
| | The DOM Level 2
specification is at the
"Candidate Recommendation" (CR) stage, which allows feedback from implementors
before it becomes a "Recommendation". It comprises "core"
functionality, which is mainly the DOM
Namespaces implementation,
and a number of optional modules (called Chapters in the spec).
Please refer to
http://www.w3.org/TR/DOM-Level-2/ for the
latest DOM Level 2 specification.
The following DOM Level 2 modules are fully implemented in Xerces-J:
-
Chapter 1: Core - most of these enhancements are for
Namespaces, and can be accessed through additional methods which
have been added directly to the org.w3c.dom.* interfaces.
-
Chapter 6: Events - The org.w3c.dom.events.EventTarget
interface is implemented by all
Nodes of the DOM.
The Xerces-J DOM implementation handles all of the event
triggering, capture and flow.
-
Chapter 7: Traversal - The Traversal module interfaces
are located in org.w3c.dom.traversal.
The NodeIterator, TreeWalker, and
NodeFilter interfaces have been supplied to allow
traversal of the DOM at a higher level. Our DOM Document
implementation class, DocumentImpl, now
implements DocumentTraversal, which supplies the
factory methods to create the iterators and tree walkers.
-
Chapter 8: Range - The Range module interfaces are
located in org.w3c.dom.range. The Range interface
allows you to specify ranges or selections using boundary
points in the DOM, along with operations (such as delete,
clone, and extract) that can be performed on these ranges.
Our DOM Document implementation class,
DocumentImpl,
now implements DocumentRange, which supplies
the factory method to create a Range.
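The Traversal factory methods described in the list above can be sketched as follows. The example casts a Document to DocumentTraversal and uses a NodeIterator to collect element names in document order. It assumes the Document implementation also implements DocumentTraversal, as the text says of DocumentImpl. The class name TraversalSketch, the helper methods, and the use of JAXP to obtain a Document are illustrative assumptions. With the Xerces parser shown earlier, the Document would come from parser.getDocument().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are assumptions for this sketch.
class TraversalSketch {

    // Returns the names of all elements under (and including) the root,
    // in document order.
    static List<String> elementNames(Document document) {
        // Assumption: the Document implementation implements DocumentTraversal.
        DocumentTraversal traversal = (DocumentTraversal) document;
        NodeIterator iterator = traversal.createNodeIterator(
                document.getDocumentElement(), // root of the iteration
                NodeFilter.SHOW_ELEMENT,       // visit element nodes only
                null,                          // no additional NodeFilter
                false);                        // do not expand entity references
        List<String> names = new ArrayList<>();
        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            names.add(n.getNodeName());
        }
        iterator.detach(); // release the iterator when finished
        return names;
    }

    // Assumption: JAXP is used here only to obtain a Document for the demo.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<personnel><person><name/></person><person/></personnel>");
        System.out.println(elementNames(doc));
    }
}
```

A TreeWalker is created the same way through createTreeWalker and adds parent, child, and sibling navigation on top of the forward and backward stepping that NodeIterator offers.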
 | Since DOM Level 2 is still in the CR phase, some changes
to these specs are still possible. The purpose of this phase is to
provide feedback to the W3C, so that the specs can be clarified and
implementation concerns can be addressed. |
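As a sketch of the Chapter 1 namespace additions listed above, the example below looks up elements by namespace URI and local name instead of by qualified name. The namespace URI, the class name NamespaceSketch, and the use of JAXP to obtain a namespace-aware Document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; names and the namespace URI are assumptions.
class NamespaceSketch {
    static final String NS = "http://example.org/personal"; // hypothetical URI

    // Assumption: JAXP is used here only to obtain a namespace-aware Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Counts "person" elements that belong to the NS namespace,
    // whatever prefix the document happens to use.
    static int countPersons(Document document) {
        return document.getElementsByTagNameNS(NS, "person").getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<p:personnel xmlns:p='" + NS + "'>"
                + "<p:person/><person/><p:person/>"
                + "</p:personnel>");
        Element root = doc.getDocumentElement();
        // DOM Level 2 accessors added to org.w3c.dom.Node.
        System.out.println(root.getPrefix() + " | " + root.getLocalName()
                + " | " + root.getNamespaceURI());
        System.out.println(countPersons(doc));
    }
}
```

The unprefixed person element in the sample is in no namespace, so the namespace-aware lookup skips it, whereas getElementsByTagName("person") would match it by name alone.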
|
 |  |  |  | How do I read data from a stream as it arrives? |  |  |  |  |
| | For performance reasons, all the standard Xerces processing
uses readers which buffer the input. In order to read data
from a stream as it arrives, you need to instruct Xerces to
use the StreamingCharReader class as its reader.
To do this, create a subclass of
org.apache.xerces.readers.DefaultReaderFactory
and override createCharReader and
createUTF8Reader as shown below.
 |  |  |  | import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import org.apache.xerces.framework.XMLErrorReporter;
import org.apache.xerces.readers.XMLEntityHandler;
import org.apache.xerces.utils.StringPool;

public class StreamingCharFactory extends org.apache.xerces.readers.DefaultReaderFactory {
    // Return an unbuffered reader for character streams.
    public XMLEntityHandler.EntityReader createCharReader(XMLEntityHandler entityHandler,
                                                          XMLErrorReporter errorReporter,
                                                          boolean sendCharDataAsCharArray,
                                                          Reader reader,
                                                          StringPool stringPool) throws Exception
    {
        return new org.apache.xerces.readers.StreamingCharReader(entityHandler,
                errorReporter, sendCharDataAsCharArray, reader, stringPool);
    }

    // Decode UTF-8 byte streams through the same unbuffered reader.
    public XMLEntityHandler.EntityReader createUTF8Reader(XMLEntityHandler entityHandler,
                                                          XMLErrorReporter errorReporter,
                                                          boolean sendCharDataAsCharArray,
                                                          InputStream data,
                                                          StringPool stringPool) throws Exception
    {
        XMLEntityHandler.EntityReader reader;
        reader = new org.apache.xerces.readers.StreamingCharReader(entityHandler,
                errorReporter, sendCharDataAsCharArray,
                new InputStreamReader(data, "UTF8"), stringPool);
        return reader;
    }
} |  |  |  |  |
In your program, after you instantiate a parser class, replace
the DefaultReaderFactory with StreamingCharFactory, and be
sure to wrap the InputStream that you are reading
from with an InputStreamReader.
 |  |  |  | InputStream in = ... ;
SAXParser p = new SAXParser();
DocumentHandler h = ... ;
// set the correct reader factory
// (here StreamingCharFactory is an inner class of the handler class
// StreamingSAXClient; for a top-level class use "new StreamingCharFactory()")
p.setReaderFactory(((StreamingSAXClient)h).new StreamingCharFactory());
p.setDocumentHandler(h);
// be sure to wrap the input stream in an InputStreamReader.
p.parse(new InputSource(new InputStreamReader(in))); |  |  |  |  |
|
|