Copyright (c) Hyperion Entertainment and contributors.

Parsing IFF

From AmigaOS Documentation Wiki

Parsing

Parsing an IFF file is both simple and complicated. It’s simple in that it’s just one call. It’s complicated in that you have to seize control of the parser to get your data.

The parser operates automatically, scanning the file, verifying syntax and layout rules. If left to its default behavior, it will scan through the entire file until it reaches the end, whereupon it will tell you that it got to the end.
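To give a feel for what "scanning the file, verifying syntax and layout rules" involves, here is a minimal sketch in portable C that walks the chunks of a single FORM held in memory. It does not call iffparse.library; `read_be32()` and `count_form_chunks()` are hypothetical helpers written for this illustration, and the sketch assumes only the basic IFF layout (a four-character ID, a 32-bit big-endian size, and a pad byte after odd-length chunks).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value; IFF stores all integers this way. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Count the chunks directly inside one FORM held in memory.
 * Returns -1 if the layout is malformed.
 * Hypothetical helper for illustration, not an iffparse.library call.
 */
static int count_form_chunks(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 12 || read_be32(buf) != 0x464F524DUL)    /* 'FORM' */
        return -1;

    uint32_t form_size = read_be32(buf + 4);   /* covers type + contents */
    if (form_size < 4 || form_size > len - 8)
        return -1;

    size_t pos = 12;                           /* skip 'FORM', size, type */
    size_t end = 8 + (size_t)form_size;
    int count = 0;

    while (pos < end) {
        if (end - pos < 8)
            return -1;                         /* truncated chunk header */

        uint32_t size = read_be32(buf + pos + 4);
        size_t room = end - pos - 8;
        if (size > room)
            return -1;                         /* chunk overruns its FORM */

        size_t padded = (size_t)size + (size & 1u);  /* odd sizes are padded */
        if (padded > room)
            return -1;                         /* pad byte is missing */

        pos += 8 + padded;
        count++;
    }
    return count;
}
```

ParseIFF() performs this kind of bookkeeping for you, including nested FORMs, LISTs and CATs, which the sketch deliberately ignores.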

The whole scanning procedure is controlled through one call:

error = IIFFParse->ParseIFF(iff, controlmode);

The control modes are IFFPARSE_SCAN, IFFPARSE_STEP and IFFPARSE_RAWSTEP. For now, only the IFFPARSE_SCAN control mode is considered.

Controlling Parsing

ParseIFF(), if left to itself, wouldn’t do anything useful. Ideally, it should stop at strategic places so we can look at the chunks. Here’s where it can get complicated.

There are many functions provided to help control the parsing process; only the common ones are covered here.

StopChunk()

You can instruct the parser to stop when it encounters a specific IFF chunk by using the function StopChunk():

error = IIFFParse->StopChunk(iff, ID_ILBM, ID_BODY);

When the parser encounters the requested chunk, parsing will stop, and ParseIFF() will return the value zero. The stream will be positioned ready to read the first data byte in the chunk. You may then call ReadChunkBytes() or ReadChunkRecords() to pull the data out of the chunk.

You may call StopChunk() any number of times for any number of different chunk types. If you wish to identify the chunk on which you’ve stopped, you may call CurrentChunk() to get a pointer to the current ContextNode, and examine the cn_Type and cn_ID fields.
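When several stop requests are active, the type and ID pair is what tells you which one fired. The following portable C sketch models that comparison only; it does not call iffparse.library. `SKETCH_ID`, `struct StopRequest`, `find_stop()` and `id_to_text()` are hypothetical names for this illustration. The macro packs four characters the same way as the MAKE_ID() macro in iffparse.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four characters packed big-endian into 32 bits, as MAKE_ID() does. */
#define SKETCH_ID(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8)  |  (uint32_t)(d))

/* One requested stop: a FORM type plus a chunk ID (hypothetical struct). */
struct StopRequest {
    uint32_t type;
    uint32_t id;
};

/*
 * Return the index of the request matching the current chunk's
 * type and ID, or -1 if none matches.
 */
static int find_stop(const struct StopRequest *stops, size_t count,
                     uint32_t cur_type, uint32_t cur_id)
{
    assert(stops != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
        if (stops[i].type == cur_type && stops[i].id == cur_id)
            return (int)i;
    return -1;
}

/* Turn a packed ID back into printable text; out must hold 5 chars. */
static void id_to_text(uint32_t id, char out[5])
{
    out[0] = (char)((id >> 24) & 0xFF);
    out[1] = (char)((id >> 16) & 0xFF);
    out[2] = (char)((id >> 8) & 0xFF);
    out[3] = (char)(id & 0xFF);
    out[4] = '\0';
}
```

In a real program the two values being compared would come from the cn_Type and cn_ID fields of the ContextNode returned by CurrentChunk().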

Using StopChunk() for every chunk, you can parse IFF files in a manner very similar to the way you’re probably doing it now, using a state machine. However, this would be a terrible underuse of IFFParse.

PropChunk()/FindProp()

In the case of a FORM ILBM, certain chunks are defined as being able to appear in any order. Among these are BMHD, CMAP, and CAMG. Typically, BMHD appears first, followed by CMAP and CAMG, but you can’t rely on this. The IFF and ILBM standards require you to assume these chunks may appear in any order. So ideally, what you’d like to do is collect them as they arrive, but not do anything with them until you actually need them.

This is where PropChunk() comes in. The syntax for PropChunk() is identical to StopChunk():

error = IIFFParse->PropChunk(iff, ID_ILBM, ID_BMHD);

When you call ParseIFF(), the parser will look for chunks declared with PropChunk(). When it sees them, the parser will internally copy the contents of the chunk into memory for you before continuing its parsing.
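The declare, collect, look up later life cycle can be modelled in a few lines of portable C. This is only a sketch of the idea, not how iffparse.library is implemented: `struct PropTable`, `declare_prop()`, `offer_chunk()` and `find_prop()` are hypothetical names, the fixed limits are arbitrary, and scoping across nested FORMs and PROPs is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROPS      8     /* arbitrary limits chosen for the sketch */
#define MAX_PROP_BYTES 64

struct PropSlot {
    uint32_t type, id;       /* e.g. 'ILBM' / 'BMHD' as packed IDs */
    int      found;          /* set once a matching chunk was seen */
    size_t   size;
    uint8_t  data[MAX_PROP_BYTES];
};

struct PropTable {
    struct PropSlot slots[MAX_PROPS];
    size_t count;            /* caller sets this to 0 before use */
};

static struct PropSlot *lookup(struct PropTable *t, uint32_t type, uint32_t id)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (t->slots[i].type == type && t->slots[i].id == id)
            return &t->slots[i];
    return NULL;
}

/* Declare interest before parsing (the role PropChunk() plays). */
static int declare_prop(struct PropTable *t, uint32_t type, uint32_t id)
{
    if (lookup(t, type, id) != NULL)
        return 0;                            /* already declared */
    if (t->count == MAX_PROPS)
        return -1;                           /* table is full */
    struct PropSlot *s = &t->slots[t->count++];
    s->type = type;
    s->id = id;
    s->found = 0;
    s->size = 0;
    return 0;
}

/* Called for every chunk met while scanning: 1 = copied, 0 = ignored. */
static int offer_chunk(struct PropTable *t, uint32_t type, uint32_t id,
                       const void *data, size_t size)
{
    struct PropSlot *s = lookup(t, type, id);
    if (s == NULL)
        return 0;                            /* not declared: skip it */
    if (size > MAX_PROP_BYTES)
        return -1;                           /* too large for the sketch */
    if (size > 0)
        memcpy(s->data, data, size);         /* a later copy replaces an earlier one */
    s->size = size;
    s->found = 1;
    return 1;
}

/* Return the stored copy, or NULL if the chunk was never encountered. */
static const struct PropSlot *find_prop(struct PropTable *t,
                                        uint32_t type, uint32_t id)
{
    struct PropSlot *s = lookup(t, type, id);
    return (s != NULL && s->found) ? s : NULL;
}
```

The point of the model is the ordering: declarations happen before the scan, copies are made during it, and lookups can happen at any later time in whatever order suits your code.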

When you’re ready to examine the contents of the chunk, you use the function FindProp():

sp = IIFFParse->FindProp(iff, ID_ILBM, ID_BMHD);

FindProp() returns a pointer to a struct StoredProperty, which contains the chunk size and data. If the chunk was never encountered, NULL is returned. This permits you to process the property chunks in any order you wish, regardless of how they appeared in the file. This provides much better control of data interpretation and also reduces headaches. The following fragment shows how ILBM BitMapHeader data could be accessed after using ParseIFF() with PropChunk(iff, ID_ILBM, ID_BMHD):

struct StoredProperty *sp;      /* defined in iffparse.h */
struct BitMapHeader *bmhd;      /* defined in IFF spec   */

/* FindProp() returns NULL if no BMHD chunk was encountered */
if ((sp = IIFFParse->FindProp(iff, ID_ILBM, ID_BMHD)) != NULL)
        {
        /* If property is BMHD, sp->sp_Data is ptr to data in BMHD */
        bmhd = (struct BitMapHeader *)sp->sp_Data;
        /* PageWidth is a 16-bit field; widen it to match %ld */
        IDOS->Printf("BMHD: PageWidth      = %ld\n", (LONG)bmhd->PageWidth);
        }
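The cast in the fragment above relies on the stored bytes matching the in-memory layout of the structure. As a complementary illustration, the following portable C sketch decodes the same data field by field and checks the size first. It assumes the 20-byte BMHD layout from the ILBM specification; `struct BmhdFields` and `decode_bmhd()` are hypothetical names, and the `data` and `size` arguments stand in for what a StoredProperty would supply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BMHD_SIZE 20   /* size of a BMHD chunk per the ILBM spec */

/* Decoded header fields (hypothetical struct for this sketch). */
struct BmhdFields {
    uint16_t width, height;          /* raster size in pixels */
    int16_t  x, y;                   /* position of the image */
    uint8_t  planes;                 /* number of bitplanes */
    uint8_t  masking;
    uint8_t  compression;
    uint16_t transparent_color;
    uint8_t  x_aspect, y_aspect;
    int16_t  page_width, page_height;
};

/* Read a 16-bit big-endian value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode a BMHD chunk; returns 1 on success, 0 if the data is too short. */
static int decode_bmhd(const uint8_t *data, size_t size, struct BmhdFields *out)
{
    assert(out != NULL);

    if (data == NULL || size < BMHD_SIZE)
        return 0;

    out->width             = read_be16(data + 0);
    out->height            = read_be16(data + 2);
    out->x                 = (int16_t)read_be16(data + 4);
    out->y                 = (int16_t)read_be16(data + 6);
    out->planes            = data[8];
    out->masking           = data[9];
    out->compression       = data[10];
    /* data[11] is an unused pad byte */
    out->transparent_color = read_be16(data + 12);
    out->x_aspect          = data[14];
    out->y_aspect          = data[15];
    out->page_width        = (int16_t)read_be16(data + 16);
    out->page_height       = (int16_t)read_be16(data + 18);
    return 1;
}
```

Checking the size before touching the data is worth copying into real code as well: a damaged file may contain a BMHD chunk that is shorter than the structure you cast it to.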

Putting it Together

With just StopChunk(), PropChunk(), and ParseIFF(), you can write a viable ILBM display program. Since IFFParse knows all about IFF structure and scoping, such a display program would have the added ability to parse complex FORMs, LISTs, and CATs and attempt to find imagery buried within.

Such an ILBM reader might appear as follows:

iff = IIFFParse->AllocIFF();
iff->iff_Stream = IDOS->Open ("shuttle dog", MODE_OLDFILE);
IIFFParse->InitIFFasDOS (iff);
IIFFParse->OpenIFF (iff, IFFF_READ);
 
IIFFParse->PropChunk (iff, ID_ILBM, ID_BMHD);
IIFFParse->PropChunk (iff, ID_ILBM, ID_CMAP);
IIFFParse->PropChunk (iff, ID_ILBM, ID_CAMG);
IIFFParse->StopChunk (iff, ID_ILBM, ID_BODY);
IIFFParse->ParseIFF (iff, IFFPARSE_SCAN);
 
 
if (bmhdprop = IIFFParse->FindProp (iff, ID_ILBM, ID_BMHD))
    configurescreen (bmhdprop);
else
    bye ("No BMHD, no picture.");
 
 
if (cmapprop = IIFFParse->FindProp (iff, ID_ILBM, ID_CMAP))
    setcolors (cmapprop);
else
    usedefaultcolors ();
 
 
if (camgprop = IIFFParse->FindProp (iff, ID_ILBM, ID_CAMG))
    setdisplaymodes (camgprop);
 
 
decodebody (iff);
showpicture ();
IIFFParse->CloseIFF (iff);
IDOS->Close (iff->iff_Stream);
IIFFParse->FreeIFF (iff);

Open the Library

Application programs must always open iffparse.library before using the functions outlined above.

Only Example Programs Skip Error Checking

Error checking is not used in the example above for the sake of clarity. A real application should always check for errors.

Other Chunk Management Functions

Several other functions are available for controlling the parser.

CollectionChunk() and FindCollection()

PropChunk() keeps only one copy of the declared chunk (the one currently in scope). CollectionChunk() collects and keeps all instances of a specified chunk. This is useful for chunks such as the ILBM CRNG chunk, which can appear multiple times in a FORM, and which don’t override previous instances of themselves. CollectionChunk() is called identically to PropChunk():

error = IIFFParse->CollectionChunk(iff, type, id);

When you wish to find the collected chunks currently in scope, you use the function FindCollection():

ci = IIFFParse->FindCollection(iff, type, id);

FindCollection() returns a pointer to a CollectionItem, which is the head of a singly-linked list of all copies of the specified chunk collected so far that are currently in scope.

struct CollectionItem {
        struct CollectionItem   *ci_Next;
        LONG                    ci_Size;
        UBYTE                   *ci_Data;
};

The size of this copy of the chunk is in the CollectionItem’s ci_Size field. The ci_Data field points to the chunk data itself. The ci_Next field points to the next CollectionItem in the list. The last element in the list has ci_Next set to NULL.

The most recently-encountered instance of the chunk will be first in the list, followed by earlier chunks. Some might consider this ordering backwards.

If NULL is returned, the specified chunk was never encountered.
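If you would rather process the collected chunks in the order they appeared in the file, you can walk the list once and fill an array from the back. The sketch below is portable C using a stand-in structure with the same shape as CollectionItem; `struct Item` and `items_in_file_order()` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same shape as struct CollectionItem. */
struct Item {
    struct Item   *next;   /* next (earlier) item, NULL at the end */
    int32_t        size;   /* size of this copy of the chunk */
    const uint8_t *data;   /* the chunk data itself */
};

/*
 * Store pointers to the list's items in out[], oldest first.
 * Returns the number of items in the list. out[] is filled only
 * when that number fits in max, so compare the result with max.
 */
static size_t items_in_file_order(const struct Item *head,
                                  const struct Item **out, size_t max)
{
    size_t n = 0;

    assert(out != NULL || max == 0);

    /* First pass: count the items. */
    for (const struct Item *it = head; it != NULL; it = it->next)
        n++;

    if (n > max)
        return n;          /* caller's array is too small */

    /* Second pass: the list head is the newest item, so fill backwards. */
    size_t i = n;
    for (const struct Item *it = head; it != NULL; it = it->next)
        out[--i] = it;

    return n;
}
```

With a list returned by FindCollection(), the same two passes would follow ci_Next in place of `next`.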

StopOnExit()

Whereas StopChunk() will stop the parser just as it enters the declared chunk, StopOnExit() will stop just before it leaves the chunk. This is useful for finding the end of FORMs, which would indicate that you’ve collected all possible data in this FORM and may now act on it.

/* Ask ParseIFF() to stop with IFFERR_EOC when leaving a FORM ILBM */
IIFFParse->StopOnExit(iff, ID_ILBM, ID_FORM);

EntryHandler()

This is used to install your own custom chunk entry handler. See the “Custom Chunk Handlers” section below for more information. StopChunk(), PropChunk(), and CollectionChunk() are internally built on top of this.

ExitHandler()

This is used to install your own custom chunk exit handler. See the “Custom Chunk Handlers” section below for more information. StopOnExit() is internally built on top of this.

Reading Chunk Data

To read data from a chunk, use the functions ReadChunkBytes() and ReadChunkRecords(). Both calls truncate attempts to read past the end of a chunk. For odd-length chunks, the parser will skip over the pad bytes for you.

Remember that for chunks which have been gathered using PropChunk() (or CollectionChunk()), you may reference the data directly by using FindProp() (or FindCollection()) to get a pointer to it.

ReadChunkBytes() is commonly used when loading and decompressing bitmap and sound sample data, or when sequentially reading data chunks such as FTXT CHRS text chunks. See the code listing “ClipFTXT.c” for an example usage of ReadChunkBytes().
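The truncating behavior means a read loop can simply keep asking for a full buffer until it gets less than it asked for. The following portable C sketch models that behavior over a chunk held in memory. It is not the library routine: `struct ChunkCursor` and `read_chunk_sketch()` are hypothetical names, and unlike the real calls the sketch has no error return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read position inside one chunk's data (hypothetical struct). */
struct ChunkCursor {
    const uint8_t *data;   /* start of the chunk's data */
    size_t         size;   /* chunk size, excluding any pad byte */
    size_t         scan;   /* bytes consumed so far */
};

/*
 * Copy up to 'want' bytes into dst, never reading past the end of
 * the chunk. Returns the number of bytes actually copied, which is
 * 0 once the chunk is exhausted.
 */
static size_t read_chunk_sketch(struct ChunkCursor *c, void *dst, size_t want)
{
    assert(c != NULL && c->scan <= c->size);

    size_t left = c->size - c->scan;
    size_t n = (want < left) ? want : left;   /* truncate the request */

    if (n > 0) {
        memcpy(dst, c->data + c->scan, n);
        c->scan += n;
    }
    return n;
}
```

A short count therefore signals the end of the chunk rather than a failure, so a loop that decompresses a BODY in fixed-size pieces can stop as soon as the count drops below the buffer size.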

Other Parsing Modes

In addition to the mode IFFPARSE_SCAN, there are the modes IFFPARSE_STEP and IFFPARSE_RAWSTEP.

IFFPARSE_RAWSTEP

This mode causes the parser to progress through the stream step by step, rather than in the automated fashion provided by IFFPARSE_SCAN. In this mode, ParseIFF() will return upon every entry to and departure from a context.

When the parser enters a context, ParseIFF() will return zero. CurrentChunk() will report the type and ID of the chunk just entered, and the stream will be positioned to read the first byte in the chunk. When entering a FORM, LIST, CAT or PROP chunk, the parser has already read the longword containing the type (e.g., ILBM or FTXT), so in this case the stream will be positioned to read the byte immediately following the type.

When the parser leaves a context, ParseIFF() will return the value IFFERR_EOC. This is not strictly an error, but an indication that you are about to leave the current context. CurrentChunk() will report the type and ID of the chunk you are about to leave. The stream is not positioned predictably within the chunk.

The parser does not call any installed chunk handlers when using this mode (e.g., property chunks declared with PropChunk() will not be collected).

See the example program, “Sift.c”, for a demonstration of IFFPARSE_RAWSTEP.
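The sequence of returns you get in this mode can be pictured with a small model. The portable C sketch below walks an IFF image held in memory and records a '(' wherever ParseIFF() would return zero (entering a context) and a ')' wherever it would return IFFERR_EOC (leaving one). It does not call iffparse.library; `struct Trace`, `emit()`, `is_group()` and `step_chunks()` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Collected event trace; the caller zero-initializes it. */
struct Trace {
    char   text[64];
    size_t len;
};

static void emit(struct Trace *t, char c)
{
    if (t->len + 1 < sizeof t->text) {
        t->text[t->len++] = c;
        t->text[t->len] = '\0';
    }
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Group chunks carry a type longword and contain further chunks. */
static int is_group(const uint8_t *id)
{
    return memcmp(id, "FORM", 4) == 0 || memcmp(id, "LIST", 4) == 0 ||
           memcmp(id, "CAT ", 4) == 0 || memcmp(id, "PROP", 4) == 0;
}

/* Walk the chunks in buf[pos, end); returns 0, or -1 if malformed. */
static int step_chunks(const uint8_t *buf, size_t pos, size_t end,
                       struct Trace *t)
{
    assert(buf != NULL && t != NULL);

    while (pos < end) {
        if (end - pos < 8)
            return -1;                       /* truncated header */
        uint32_t size = read_be32(buf + pos + 4);
        if (size > end - pos - 8)
            return -1;                       /* chunk overruns its parent */

        emit(t, '(');                        /* entering: ParseIFF() returns 0 */
        if (is_group(buf + pos)) {
            if (size < 4)
                return -1;                   /* no room for the type */
            /* Contents start after ID, size and type (12 bytes). */
            if (step_chunks(buf, pos + 12, pos + 8 + size, t) != 0)
                return -1;
        }
        emit(t, ')');                        /* leaving: IFFERR_EOC */

        pos += 8 + (size_t)size + (size & 1u);   /* skip any pad byte */
    }
    return 0;
}
```

For a FORM containing two data chunks the trace reads `(()())`: one entry for the FORM, an entry and exit for each inner chunk, then the exit from the FORM.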

IFFPARSE_STEP

This mode is identical to IFFPARSE_RAWSTEP, except that, before control returns to your code, the chunk handler (if any) for the chunk is invoked.