Copyright (c) Hyperion Entertainment and contributors.

Expat Library

From AmigaOS Documentation Wiki

Introduction

Expat is a fast and resource-efficient XML parser written by James Clark. On AmigaOS it is available in several flavours: as a static link library, as a shared object, and as a shared Amiga library. This documentation focuses on the shared Amiga library version and provides code examples for it.

XML Parsing Basics

There are basically two kinds of XML parsers:

  • Tree-based parsers, which process the entire XML file and build a tree structure representing the elements and other constructs in the document. An example of a tree-based parser is libxml2, which is also available for AmigaOS.
  • Stream-oriented (event-driven) parsers, which process the XML file as a continuous stream and produce an event each time the parser encounters an XML element or character data. Expat is an example of an event-driven parser.

Tree-based parsers are really comfortable to work with: the parser reconstructs the entire document structure and contents for you. You are also provided with functions to search in the document, find data, add or modify the contents etc. Event-driven parsers are, on the other hand, much more basic. They require setup and generally more work on the part of the programmer.

However, tree-based parsers are rather taxing resource-wise. Parsing the document takes longer and uses up a considerable amount of memory. Implementations also tend to be bulky: for example, the current AmigaOS static library implementation of libxml2 is bigger than 5 MB, which is a preposterous file size overhead to add to your program only to provide it with a parser! Event-driven parsers may offer fewer bells and whistles but they are much smaller (about 300 KB in all AmigaOS Expat implementations, static or shared) and faster. In keeping with the spirit of Amiga software, you'll quite naturally want to use Expat as a well-proven and efficient parser, preferably in its shared Amiga library incarnation.

Another point in favour of event-driven parsers is that reconstructing the complete document tree is not always necessary when working with XML files. Quite often you're only interested in particular data stored in particular elements. A parser like Expat can then be used to react only to the events that concern those parts of the document. But even if you do need the entire tree structure, for whatever purpose or merely for comfort, there is no reason to give up on Expat. As tree-based parsers are typically built on top of event-driven parsers, you can use Expat to build your own XML data representation. This means more work, but you can tailor the procedure to your own needs and end up with a perhaps less sophisticated but still adequate representation, without the extra overhead libxml2 would incur.
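To illustrate the last point, here is a minimal sketch of how start and end events could be turned into a simple element tree. The node layout, the tree_builder structure and all function names are hypothetical and not part of the Expat API; only the handler parameter lists follow Expat's element handler conventions, assuming the default build in which XML_Char is plain char. Attributes and character data are ignored to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal node: element name plus parent/child/sibling links. */
struct node {
    char        *name;
    struct node *parent;
    struct node *first_child;
    struct node *last_child;
    struct node *next;         /* next sibling */
};

/* Hypothetical builder state, handed to the handlers as user data. */
struct tree_builder {
    struct node *root;     /* document element, NULL until the first start tag */
    struct node *current;  /* element whose content is being parsed */
    int          failed;   /* set when an allocation fails */
};

/* Start tag: create a node and append it to the current element. */
void tree_start(void *user_data, const char *name, const char **atts)
{
    struct tree_builder *tb = user_data;
    size_t len = strlen(name) + 1;
    struct node *n;

    (void)atts;  /* attributes are ignored in this sketch */
    if (tb->failed)
        return;

    n = calloc(1, sizeof *n);
    if (n != NULL)
        n->name = malloc(len);
    if (n == NULL || n->name == NULL) {
        free(n);
        tb->failed = 1;
        return;
    }
    memcpy(n->name, name, len);  /* copy: the caller's string is not ours to keep */

    n->parent = tb->current;
    if (tb->current == NULL) {
        tb->root = n;
    } else if (tb->current->last_child == NULL) {
        tb->current->first_child = tb->current->last_child = n;
    } else {
        tb->current->last_child->next = n;
        tb->current->last_child = n;
    }
    tb->current = n;
}

/* End tag: the current element is complete, so step back to its parent. */
void tree_end(void *user_data, const char *name)
{
    struct tree_builder *tb = user_data;

    (void)name;
    if (!tb->failed && tb->current != NULL)
        tb->current = tb->current->parent;
}

/* Free a node together with its following siblings and all descendants. */
void tree_free(struct node *n)
{
    while (n != NULL) {
        struct node *next = n->next;
        tree_free(n->first_child);
        free(n->name);
        free(n);
        n = next;
    }
}
```

After a successful parse, current is NULL again and root points to the document element. A real program would probably also store attributes and text in the nodes.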

How To Use The Library

Using the AmigaOS Expat Library involves at least the following basic steps, each of which is discussed in more detail further on:

  1. Open the library and obtain its interface (see Library Opening Chores below).
  2. Create two element handler functions to process XML element events.
  3. Create a character-data handler function to process XML character data events.
  4. Create a parser instance.
  5. Configure the parser so that it knows of your element and character-data handlers.
  6. Open the XML file for reading, read data from the XML file into a buffer and call the parsing function (see Parsing XML Files below).

Reading from the file is usually done in a loop. The file data is continuously fed into a fixed-size memory buffer and parsed, until the end of the document is reached. Whenever the parser encounters an element's start tag, it will call your start element handler function. Whenever it encounters an element's end tag, it invokes your end element handler function. Whenever text is found, the parser will call your character-data handler function.
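The read-and-parse loop described above can be sketched as follows. This is an illustration under stated assumptions, not the library's own code: it reads through standard C stdio rather than AmigaOS DOS calls, and parse_chunk_fn is a hypothetical stand-in for the actual parsing call (in Expat that role is played by XML_Parse, which likewise takes a buffer, a length and an "is final" flag). The log_chunk function is a demonstration stub that only records what a parser would have received.

```c
#include <stdio.h>

/* Hypothetical stand-in for the parsing call: receives one chunk plus a flag
   saying whether it is the last one, and returns 0 on a parse error. */
typedef int (*parse_chunk_fn)(void *parser, const char *buf, int len, int is_final);

/* Read fp through a caller-supplied fixed-size buffer and hand every chunk
   to parse(). Returns 1 on success, 0 on a read or parse error. */
int feed_file(FILE *fp, char *buf, size_t buf_size,
              parse_chunk_fn parse, void *parser)
{
    int is_final;

    if (buf_size == 0)
        return 0;  /* nothing could ever be read */

    do {
        size_t len = fread(buf, 1, buf_size, fp);

        if (ferror(fp))
            return 0;
        /* A short read means the end of the file has been reached. */
        is_final = (len < buf_size);
        if (!parse(parser, buf, (int)len, is_final))
            return 0;
    } while (!is_final);

    return 1;
}

/* Demonstration stub: counts calls, bytes and final flags. */
struct feed_log {
    int  calls;
    long bytes;
    int  finals;
};

int log_chunk(void *parser, const char *buf, int len, int is_final)
{
    struct feed_log *log = parser;

    (void)buf;
    log->calls++;
    log->bytes += len;
    log->finals += is_final;
    return 1;
}
```

Note that the last call may carry zero bytes when the file size is an exact multiple of the buffer size; the final flag is what tells the parser that the document has ended.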

As you can see, the parser itself doesn't do much: it just processes the XML file and calls the respective handler functions to react upon the individual events. All the grunt work is done in the handlers, outside of the library. Programmers design the handler functions to suit their particular needs, and process (or ignore) the received data as they find appropriate.

It should also be noted that the Expat Library provides just the parser, so it cannot be used for writing XML files.

Library Opening Chores

Just like other AmigaOS libraries, the Expat Library must be opened and its interface obtained before you can use it:

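/* Library base and interface pointers; both stay NULL until opened. */
struct Library    *ExpatBase = NULL;
struct ExpatIFace *IExpat = NULL;

/* Open the library, then obtain its "main" interface. */
if ( (ExpatBase = IExec->OpenLibrary("expat.library", 53)) )
{
   IExpat = (struct ExpatIFace *) IExec->GetInterface(ExpatBase, "main", 1, NULL);
}

/* Either step may fail, so check both pointers before using IExpat. */
if ( !ExpatBase || !IExpat )
{
   /* handle library opening error */
}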

Handler Functions

Handler functions are callbacks that the parser invokes automatically whenever it encounters an XML event.

Element Handlers

In XML, an element is enclosed between a start tag and an end tag. They are reported as separate events, so you need two separate functions to handle them:

The Start Element Handler
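A start element handler receives the user data pointer, the element name and the element's attributes. Expat passes the attributes as an array of name/value string pairs terminated by a NULL pointer. The sketch below, which assumes the default build where XML_Char is plain char, picks out the "id" attribute of every item element. The item_ctx structure, the helper find_attr and the element and attribute names are hypothetical and serve for illustration only.

```c
#include <string.h>

/* Hypothetical user data: remembers the "id" attribute of the last <item>. */
struct item_ctx {
    int  items_seen;
    char last_id[32];
};

/* Return the value of the attribute called key, or NULL if it is absent.
   atts holds name/value pairs and ends with a NULL pointer. */
const char *find_attr(const char **atts, const char *key)
{
    int i;

    if (atts == NULL)
        return NULL;
    for (i = 0; atts[i] != NULL; i += 2) {
        if (strcmp(atts[i], key) == 0)
            return atts[i + 1];
    }
    return NULL;
}

/* Start tag handler: reacts to <item ...> and ignores everything else. */
void item_start(void *user_data, const char *name, const char **atts)
{
    struct item_ctx *ctx = user_data;
    const char *id;

    if (strcmp(name, "item") != 0)
        return;  /* not an element we care about */

    ctx->items_seen++;
    id = find_attr(atts, "id");
    /* Copy the value (truncating if needed) instead of keeping the pointer;
       the strings should be treated as valid only during the call. */
    strncpy(ctx->last_id, id != NULL ? id : "", sizeof ctx->last_id - 1);
    ctx->last_id[sizeof ctx->last_id - 1] = '\0';
}
```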

The End Element Handler
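An end element handler receives the user data pointer and the element name. Since the element is complete at this point, it is a natural place to finish whatever the start handler began. The following minimal sketch assumes the default build where XML_Char is plain char, and a hypothetical outline_ctx structure whose depth field is incremented by a matching start handler (not shown here).

```c
#include <string.h>

/* Hypothetical user data shared by the element handlers. */
struct outline_ctx {
    int depth;       /* nesting level, incremented by the start handler */
    int items_done;  /* number of completed <item> elements */
};

/* End tag handler: count finished <item> elements and leave one level. */
void outline_end(void *user_data, const char *name)
{
    struct outline_ctx *ctx = user_data;

    if (strcmp(name, "item") == 0)
        ctx->items_done++;
    if (ctx->depth > 0)  /* guard against an unmatched call */
        ctx->depth--;
}
```

An empty element such as <item/> is reported as a start event immediately followed by an end event, so a handler like this one sees it as well.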

The Character-Data Handler
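A character-data handler receives the user data pointer, a pointer to the text and the text length. Two details are easy to miss: the text is not NUL-terminated, and a single run of text may be delivered in several calls, so the handler should append rather than overwrite. The sketch below assumes the default build where XML_Char is plain char; the text_ctx structure and its fixed 64-byte buffer are hypothetical choices made for brevity.

```c
#include <string.h>

/* Hypothetical user data: a small fixed-size accumulation buffer. */
struct text_ctx {
    char   text[64];  /* accumulated character data, always NUL-terminated */
    size_t used;      /* bytes stored, excluding the terminator */
};

/* Character data handler: append the chunk, truncating if the buffer is full. */
void text_chardata(void *user_data, const char *s, int len)
{
    struct text_ctx *ctx = user_data;
    size_t room, n;

    if (len <= 0)
        return;

    room = sizeof ctx->text - 1 - ctx->used;  /* keep space for the terminator */
    n = (size_t)len < room ? (size_t)len : room;

    memcpy(ctx->text + ctx->used, s, n);
    ctx->used += n;
    ctx->text[ctx->used] = '\0';
}
```

In practice the buffer would typically be reset in the start element handler and evaluated in the end element handler of the element you are interested in.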

Parsing

Creating The Parser

Parser Configuration

Parsing XML Files

Function Reference