Copyright (c) Hyperion Entertainment and contributors.

Difference between revisions of "DCFS and LNFS Low Level Data Structures"

From AmigaOS Documentation Wiki
Jump to navigation Jump to search
 
(8 intermediate revisions by the same user not shown)
Line 26: Line 26:
 
== Data structure changes and new data structures ==
 
== Data structure changes and new data structures ==
   
The DCFS mode reuses the Amiga file system data structures which are utilized
+
The DCFS mode reuses the Amiga file system data structures which are utilized in the original file system (OFS) and the Fast File System (FFS) and builds upon them by adding directory lists blocks. The directory list blocks are attached to the root directory and all user directories through a field which was, up to 1992, unused.
in the original file system (OFS) and the Fast File System (FFS) and builds
 
upon them by adding directory lists blocks. The directory list blocks are
 
attached to the root directory and all user directories through a field
 
which was, up to 1992, unused.
 
   
 
Standard root and user directory blocks (512 bytes) for OFS and FFS would look like this:
 
Standard root and user directory blocks (512 bytes) for OFS and FFS would look like this:
Line 166: Line 162:
 
== How directory lists are constructed ==
 
== How directory lists are constructed ==
   
The root directory and each user directory must contain a valid directory list.
+
The root directory and each user directory must contain a valid directory list. The RootBlock.Reserved4 and the UserDirectoryBlock.Reserved8 fields, respectively, contain the block number of the first directory list block for the root block/directory. This block number must never be 0.
The RootBlock.Reserved4 and the UserDirectoryBlock.Reserved8 fields, respectively,
 
contain the block number of the first directory list block for the
 
root block/directory. This block number must never be 0.
 
   
The FileListBlock.Type field must be set to TYPE_DIRLIST_REV1 (33). While the Amiga
+
The FileListBlock.Type field must be set to TYPE_DIRLIST_REV1 (33). While the Amiga file system for Kickstart 3.0 was under development, Commodore briefly supported a different type of directory list which can be identified as type TYPE_DIRLIST_REV0 (32). Details of the differences between TYPE_DIRLIST_REV1 and TYPE_DIRLIST_REV0 will be described later in the context of the directory list entry format.
file system for Kickstart 3.0 was under development, Commodore briefly supported
 
a different type of directory list which can be identified as type
 
TYPE_DIRLIST_REV0 (32). Details of the differences between TYPE_DIRLIST_REV1
 
and TYPE_DIRLIST_REV0 will be described later in the context of the directory list
 
entry format.
 
   
Each directory list block is part of a chain, in which the FileListBlock.NextBlock
+
Each directory list block is part of a chain, in which the FileListBlock.NextBlock field contains the block number of the next following entry in this chain, or 0 to indicate that no other directory list block follows.
field contains the block number of the next following entry in this chain, or 0
 
to indicate that no other directory list block follows.
 
   
The number of entries in this directory list block is stored in
+
The number of entries in this directory list block is stored in FileListBlock.NumEntries. A value of 0 indicates that this block has no
FileListBlock.NumEntries. A value of 0 indicates that this block has no
 
 
directory entries stored.
 
directory entries stored.
   
 
== How directory list entries are constructed for TYPE_DIRLIST_REV1 (33) ==
 
== How directory list entries are constructed for TYPE_DIRLIST_REV1 (33) ==
   
If the FileListBlock.Type field is set to TYPE_DIRLIST_REV1 (33) then you
+
If the FileListBlock.Type field is set to TYPE_DIRLIST_REV1 (33) then you are dealing with the standard form of the directory list block. This is the only form which should be used in production code.
are dealing with the standard form of the directory list block. This is
 
the only form which should be used in production code.
 
   
 
Directory list entries directly follow the directory list block header.
 
Directory list entries directly follow the directory list block header.
   
  +
Each entry consists of a constant size portion (FileListEntry), followed by a name and comment string in BCPL format (i.e. a single length byte is followed by exactly as many characters as indicated by the contents of the length byte; there is no terminating NUL byte). If the total size of the constant part, the name and the comment string is an odd number, then a padding byte must be added to the entry. The padding byte should be set to 0.
Each entry consists of a constant size portion (FileListEntry), followed by a
 
name and comment string in BCPL format (i.e. a single length byte is followed
 
by exactly as many characters as indicated by the contents of the length byte;
 
there is no terminating NUL byte). If the total size of the constant part,
 
the name and the comment string is an odd number, then a padding byte must
 
be added to the entry. The padding byte should be set to 0.
 
   
  +
Because the total size of the FileListEntry is not stored within itself, it must be calculated by looking at how long the name and comment data is, including the length byte which introduces each string. If the number obtained by adding the size of the FileListEntry and the name and comment strings (in BCPL format) is odd, 1 must be added to make it an even number (accounting for the required padding byte).
Because the total size of the FileListEntry is not stored within itself,
 
it must be calculated by looking at how long the name and comment data is,
 
including the length byte which introduces each string. If the number
 
obtained by adding the size of the FileListEntry and the name and
 
comment strings (in BCPL format) is odd, 1 must be added to make it an
 
even number (accounting for the required padding byte).
 
   
The number of directory list entries must match the value of the
+
The number of directory list entries must match the value of the directory list block FileListBlock.NumEntries field.
directory list block FileListBlock.NumEntries field.
 
   
If there is enough spare room left in the directory list block,
+
If there is enough spare room left in the directory list block, then the list of directory list entries must be terminated by an ULONG value of 0. This looks like a partial FileListEntry with the Key field set to 0.
then the list of directory list entries must be terminated by an
 
ULONG value of 0. This looks like a partial FileListEntry with
 
the Key field set to 0.
 
   
The contents of FileListEntry.Created field are a truncated form of
+
The contents of FileListEntry.Created field are a truncated form of the 'struct DateStamp' which can be found, for example in the
  +
UserDirectoryBlock.Created field. The seconds and microseconds fit completely within the truncated form, but the number of days does not.
the 'struct DateStamp' which can be found, for example in the
 
  +
As such the date range is restricted to 179 years instead of 4 billion years.
UserDirectoryBlock.Created field. The seconds and microseconds fit
 
completely within the truncated form, but the number of days does not.
 
As such the date range is restricted to 179 years instead of 4 billion
 
years.
 
   
 
== How directory list entries are constructed for TYPE_DIRLIST_REV0 (32) ==
 
== How directory list entries are constructed for TYPE_DIRLIST_REV0 (32) ==
   
If the FileListBlock.Type field is set to TYPE_DIRLIST_REV0 (32) then you
+
If the FileListBlock.Type field is set to TYPE_DIRLIST_REV0 (32) then you are dealing with a non-standard form of the directory list block which was the precursor to the standard form. Never use the TYPE_DIRLIST_REV0 (32) form in production code. The following documentation is provided for the sake of completeness only.
are dealing with a non-standard form of the directory list block which was
 
the precursor to the standard form. Never use the TYPE_DIRLIST_REV0 (32)
 
form in production code. The following documentation is provided for the
 
sake of completeness only.
 
   
If the FileListBlock.Type field is set to TYPE_DIRLIST_REV0 then the format
+
If the FileListBlock.Type field is set to TYPE_DIRLIST_REV0 then the format of the directory list entries is slightly different:
of the directory list entries is slightly different:
 
   
 
<syntaxhighlight>
 
<syntaxhighlight>
Line 253: Line 214:
 
</syntaxhighlight>
 
</syntaxhighlight>
   
In revision 0 the OwnerXID field is omitted from the constant size
+
In revision 0 the OwnerXID field is omitted from the constant size directory list entry header.
directory list entry header.
 
   
 
== Limitations and drawbacks ==
 
== Limitations and drawbacks ==
   
Because a copy of the metadata of the directory contents is made, this copy needs
+
Because a copy of the metadata of the directory contents is made, this copy needs to be updated whenever the original changes. This alone makes DCFS mode slower than the comparable OFS/FFS variant when directory contents are modified.
to be updated whenever the original changes. This alone makes DCFS mode slower
 
than the comparable OFS/FFS variant when directory contents are modified.
 
   
  +
Additional complications may arise if a name or comment is changed so that the corresponding directory list entry becomes too large to fit into its directory list block. In this case additional storage space will need to be allocated for a new directory list block, the old block will have to be rewritten, the new block will have to be initialized and linked up. If no storage space is available, the rename/comment change operation will have to be undone. Both the successful case (new entry created) and the failing case (not enough store
Additional complications may arise if a name or comment is changed so that the
 
corresponding directory list entry becomes too large to fit into its
 
directory list block. In this case additional storage space will need to be
 
allocated for a new directory list block, the old block will have to be rewritten,
 
the new block will have to be initialized and linked up. If no storage space
 
is available, the rename/comment change operation will have to be undone. Both
 
the successful case (new entry created) and the failing case (not enough store
 
 
space, undoing the change) are making a directory update significantly more complex.
 
space, undoing the change) are making a directory update significantly more complex.
   
Maintaining the consistency of the directory cache is the responsibility of the
+
Maintaining the consistency of the directory cache is the responsibility of the file system validation process. If the volume needs to be rebuilt, two passes are necessary in order to first find out which data blocks are used, and then to rebuild the directory cache. This will easily double the total validation time.
file system validation process. If the volume needs to be rebuilt, two passes are
 
necessary in order to first find out which data blocks are used, and then to
 
rebuild the directory cache. This will easily double the total validation time.
 
   
  +
Due to the complexity of maintaining the directory cache, any update takes significantly longer than in the comparable OFS/FFS variants of the file system, and it modifies more data structures. During this operation the file system data structures are not entirely consistent at all times, and if the system should crash or the power should fail, then a validation may stall because of inconsistencies. The file system will have to be repaired. The DCFS mode makes this case more likely to appear and then adds insult to injury by making the validation process at least twice as long as with the comparable OFS/FFS variants of the file system.
Due to the complexity of maintaining the directory cache, any update takes
 
significantly longer than in the comparable OFS/FFS variants of the file
 
system, and it modifies more data structures. During this operation the file
 
system data structures are not entirely consistent at all times, and if the
 
system should crash or the power should fail, then a validation may stall
 
because of inconsistencies. The file system will have to be repaired. The
 
DCFS mode makes this case more likely to appear and then adds insult to
 
injury by making the validation process at least twice as long as with
 
the comparable OFS/FFS variants of the file system.
 
   
  +
There is no "garbage collection" process for the directory cache. If most of the entries are removed from a directory list block no merging of neighbouring directory list blocks takes place. If all entries are removed from a directory list block, that block is not removed from the directory list and its storage space is not reclaimed. Only deleting the empty directory which contains the empty directory list chain will free that memory.
There is no "garbage" collection process for the directory cache. If
 
most of the entries are removed from a directory list block no merging
 
of neighbouring directory list blocks takes place. If all entries are
 
removed from a directory list block, that block is not removed from
 
the directory list and its storage space is not reclaimed. Only deleting
 
the empty directory which contains the empty directory list chain will
 
free that memory.
 
   
 
= Long Name File System (LNFS) =
 
= Long Name File System (LNFS) =
Line 298: Line 233:
 
== History and purpose ==
 
== History and purpose ==
   
The LNFS mode was introduced early during development of the FFS reimplementation which eventually shipped with AmigaOS 4.0 and MorphOS. These changes were introduced by Olaf Barthel, who created the FFS reimplementation in 2001.
+
The LNFS mode was introduced early during development of the FFS reimplementation which eventually shipped with AmigaOS 4. These changes were introduced by Olaf Barthel, who created the FFS reimplementation in 2001.
   
The Amiga operating system has always supported long file names in the dos.library
+
The Amiga operating system has always supported long file names in the dos.library functions which retrieved the contents of directories (107 characters per entry). The Kickstart 3.0 ExAll() API function did not even place an upper limit on the length of a directory entry name. However, the default ROM file system only allowed for up to 30 characters per name to be used.
functions which retrieved the contents of directories (107 characters per entry).
 
The Kickstart 3.0 ExAll() API function did not even place an upper limit on the
 
length of a directory entry name. However, the default ROM file system only allowed
 
for up to 30 characters per name to be used.
 
   
The LNFS mode removes the 30 character limitation for the names of files, directories,
+
The LNFS mode removes the 30 character limitation for the names of files, directories, hard links and soft links. The root directory name is, however, still restricted to 30 characters.
hard links and soft links. The root directory name is, however, still restricted to
 
30 characters.
 
   
  +
There is an additional feature in LNFS which is intended to improve performance when an LNFS volume is mounted. The file system no longer requires that the bookkeeping data structure (the "bitmap") which tracks the blocks currently allocated for data to be scanned completely before a read or write access can be made. Reading the entire bitmap just for the purpose of counting the number of bits set is costly. LNFS stores the number of blocks used as a counter value in the root directory. This counter will be updated for each change made to the bitmap, and should a validation be required, then the counter will be updated to reflect the status of the bitmap after validation.
There is an additional feature in LNFS which is intended to improve performance when
 
an LNFS volume is mounted. The file system no longer requires that the bookkeeping
 
data structure (the "bitmap") which tracks the blocks currently allocated for data
 
to be scanned completely before a read or write access can be made. Reading the entire
 
bitmap just for the purpose of counting the number of bits set is costly. LNFS stores
 
the number of blocks used as a counter value in the root directory. This counter will
 
be updated for each change made to the bitmap, and should a validation be required,
 
then the counter will be updated to reflect the status of the bitmap after validation.
 
   
 
== Data structure changes and new data structures ==
 
== Data structure changes and new data structures ==
   
The long file name support was added by changing the layout of the
+
The long file name support was added by changing the layout of the canonical file/directory header block and by introducing a new block
  +
type. To illustrate, here is the default layout of a directory entry with 512 bytes per block; note the section enclosed in dashes:
canonical file/directory header block and by introducing a new block
 
type. To illustrate, here is the default layout of a directory entry
 
with 512 bytes per block; note the section enclosed in dashes:
 
   
 
<syntaxhighlight>
 
<syntaxhighlight>
Line 368: Line 288:
 
</syntaxhighlight>
 
</syntaxhighlight>
   
The section between the dashes is common to all directory entries,
+
The section between the dashes is common to all directory entries, i.e. directories, files, hard links and soft links. Each of these
  +
blocks contains a comment, a modification/creation date stamp and a name.
i.e. directories, files, hard links and soft links. Each of these
 
blocks contains a comment, a modification/creation date stamp and a
 
name.
 
   
A little research revealed that comments are used rather rarely with
+
A little research revealed that comments are used rather rarely with the Amiga file system. At best, special applications such as web
  +
browsers or FTP clients use the comment field to store additional information about cache entries or the origin of downloaded files.
the Amiga file system. At best, special applications such as web
 
  +
While the comment field is definitely useful, the space it occupies would be better put to use to allow for longer file names. Here is how
browsers or FTP clients use the comment field to store additional
 
information about cache entries or the origin of downloaded files.
 
While the comment field is definitely useful, the space it occupies
 
would be better put to use to allow for longer file names. Here is how
 
 
the layout of the block was changed to allow for longer file names:
 
the layout of the block was changed to allow for longer file names:
   
Line 421: Line 336:
 
</syntaxhighlight>
 
</syntaxhighlight>
   
As you can see, the name and comment fields were merged and the date
+
As you can see, the name and comment fields were merged and the date stamp pushed down a few long words.
stamp pushed down a few long words.
 
   
In the new layout the 'NaC' entry contains both the name and the
+
In the new layout the 'NaC' entry contains both the name and the comment, as two BCPL strings. The name and comment are stored that way
  +
only if both will fit into the 112 bytes available. For example, a directory by the name of "barney" with a comment "fred" will have the
comment, as two BCPL strings. The name and comment are stored that way
 
  +
following bytes stored in the 'NaC' field (the numbers in brackets are the respective BCPL string length bytes):
only if both will fit into the 112 bytes available. For example, a
 
directory by the name of "barney" with a comment "fred" will have the
 
following bytes stored in the 'NaC' field (the numbers in brackets are
 
the respective BCPL string length bytes):
 
   
 
[6] b a r n e y [4] f r e d
 
[6] b a r n e y [4] f r e d
   
By comparison, a directory by the name of "wilma" with no comment
+
By comparison, a directory by the name of "wilma" with no comment stored in the directory block would have the following bytes in the
stored in the directory block would have the following bytes in the
 
 
'NaC' field:
 
'NaC' field:
   
 
[5] w i l m a [0]
 
[5] w i l m a [0]
   
If the comment string stored in this character array is empty, it does
+
If the comment string stored in this character array is empty, it does not necessarily mean that the respective directory has no comment
not necessarily mean that the respective directory has no comment
+
attached. It might mean that the name and the comment do not both fit into the directory entry block and that the comment is stored
  +
separately in its own block. If the latter is the case, you will find the number of the comment block in the 'CommentBlock' field; if this
attached. It might mean that the name and the comment do not both fit
 
  +
number is zero, then there is no comment block associated with this directory entry. The comment block's layout looks like this:
into the directory entry block and that the comment is stored
 
separately in its own block. If the latter is the case, you will find
 
the number of the comment block in the 'CommentBlock' field; if this
 
number is zero, then there is no comment block associated with this
 
directory entry. The comment block's layout looks like this:
 
   
 
<syntaxhighlight>
 
<syntaxhighlight>
Line 466: Line 372:
 
</syntaxhighlight>
 
</syntaxhighlight>
   
There would be more room for the comment to be stored, but
+
There would be more room for the comment to be stored, but unfortunately, the AmigaDOS API allows only for up to 79 characters to
  +
be used for a comment string, as returned using the 'ExAll()' and 'Examine()/ExNext()' routines.
unfortunately, the AmigaDOS API allows only for up to 79 characters to
 
be used for a comment string, as returned using the 'ExAll()' and
 
'Examine()/ExNext()' routines.
 
   
Note that the root directory itself is unaffected by the changes
+
Note that the root directory itself is unaffected by the changes introduced to support long file names. Since there is no comment field
  +
in the root directory, the size of the volume name is still limited to 30 characters. However, one small enhancement was made to shorten the
introduced to support long file names. Since there is no comment field
 
  +
time it takes for the file system to initialize itself. Since the file system may have to report how many blocks are still available
in the root directory, the size of the volume name is still limited to
 
  +
for allocation, the typical path it takes involves counting all bits that are set in the bitmap blocks. This can take its time, especially
30 characters. However, one small enhancement was made to shorten the
 
  +
for larger volumes. This process is not strictly necessary as the information could be cached and later reread. This is what the long
time it takes for the file system to initialize itself. Since the
 
  +
name variant of the file system does. The number of allocated blocks is stored in the root directory. Also stored in the root directory is
file system may have to report how many blocks are still available
 
  +
the signature of the file system type. Here is how the root block would look like for a 512 byte block (note the sections enclosed in dashes):
for allocation, the typical path it takes involves counting all bits
 
that are set in the bitmap blocks. This can take its time, especially
 
for larger volumes. This process is not strictly necessary as the
 
information could be cached and later reread. This is what the long
 
name variant of the file system does. The number of allocated blocks is
 
stored in the root directory. Also stored in the root directory is
 
the signature of the file system type. Here is how the root block would
 
look like for a 512 byte block (note the sections enclosed in dashes):
 
   
 
<syntaxhighlight>
 
<syntaxhighlight>
Line 550: Line 447:
 
</syntaxhighlight>
 
</syntaxhighlight>
   
The 'NumBlocksUsed' field is valid only if the root block's 'BitmapFlag'
+
The 'NumBlocksUsed' field is valid only if the root block's 'BitmapFlag' field indicates that the bitmap is valid and consistent. Otherwise, it should be ignored. If the 'NumBlocksUsed' field happens to be zero, it also means that it should be ignored and that the number of allocated blocks will need to be recalculated from the bitmap on disk.
field indicates that the bitmap is valid and consistent. Otherwise, it
 
should be ignored. If the 'NumBlocksUsed' field happens to be zero, it
 
also means that it should be ignored and that the number of allocated
 
blocks will need to be recalculated from the bitmap on disk.
 
   
No change is made to the hashing algorithm. It is the same that is
+
No change is made to the hashing algorithm. It is the same that is being used for international mode, only that the names of the files
being used for international mode, only that the names of the files
 
 
can be longer than 30 characters.
 
can be longer than 30 characters.
   
In total, the number of changes made to support long file names are
+
In total, the number of changes made to support long file names are rather small. This was done in order to make the transition from the
  +
old Amiga file system types to the long file name mode smoother, and to reduce the risk of introducing errors into the design and implementation. In fact, one could write a conversion program that scans the file system and converts all directory entry blocks to the
rather small. This was done in order to make the transition from the
 
  +
new format without requiring the directory contents to be rehashed or additional blocks to be allocated.
old Amiga file system types to the long file name mode smoother, and
 
to reduce the risk of introducing errors into the design and
 
implementation. In fact, one could write a conversion program that
 
scans the file system and converts all directory entry blocks to the
 
new format without requiring the directory contents to be rehashed or
 
additional blocks to be allocated.
 
   
 
== Limitations and drawbacks ==
 
== Limitations and drawbacks ==
   
The LNFS mode exists in two variants only, this being the OFS mode and
+
The LNFS mode exists in two variants only, this being the OFS mode and the FFS mode. Except for the extensions to the name length these modes operate like the traditional OFS and FFS modes, respectively. There is no DCFS variant of the LNFS mode.
the FFS mode. Except for the extensions to the name length these modes
 
operate like the traditional OFS and FFS modes, respectively. There is
 
no DCFS variant of the LNFS mode.
 
   
The LNFS mode always uses the "international" directory entry name
+
The LNFS mode always uses the "international" directory entry name hashing operation.
hashing operation.
 
   
Because the changes required in order to support the long file names
+
Because the changes required in order to support the long file names of the LNFS mode are minimal, all the limitations of the original
of the LNFS mode are minimal, all the limitations of the original
 
 
Amiga OFS and FFS modes are still present.
 
Amiga OFS and FFS modes are still present.

Latest revision as of 20:05, 29 August 2016

Low level data structures used by the Amiga DCFS (1992) and LNFS (2001) by Olaf Barthel

Copyright (c) 2015 Olaf Barthel.

How the Amiga file system stores its data on disk is well-documented as far as the original file system (1985) and the Fast File System (1988) are concerned. The changes made for the Directory Caching File System (DCFS) and Long Name File System (LNFS) modes have never been described in detail. The term 'mode' refers to the fact that the respective changes are integrated with the existing file system data structures and algorithms and augment rather than replace them altogether.

This text attempts to describe in detail which purpose each respective file system mode serves, how the on-disk data structures look like, how they are integrated with the existing file system, and which drawbacks the respective modes have.

In this text familiarity with the 'C' programming language syntax for data structures is assumed, using the Amiga data types described in the <exec/types.h> header file. In each example a disk block size is 512 bytes in size.

Directory Caching File System (DCFS)

History and purpose

The DCFS mode was added to the Amiga ROM file system in Kickstart 3.0 (1992) by Commodore engineer Randell Jesup. The intended purpose was to make reading directory contents faster.

The Amiga file system stores directory contents in linked lists. There are 72 such lists in a 512 byte directory block, but not all of them may contain directory data. All entries whose names share the same hash value are stored in the same list. Reading the directory contents involves walking through each of these 72 lists in turn, skipping empty lists as needed.

Because the Amiga file system stores each entry of a directory in an individual block, reading a directory involves processing 512 bytes (or more; block sizes can be a large as 65536 bytes) while only a small part of the block data is relevant for directory scanning. Of interest are the type (directory, file, soft link, hard link), name, comment, modification time and protection flags. Taken together, this information accounts for up to 132 bytes. Because file names are typically not 30 characters long and comments are rare, the information retrieved from a directory entry block can be as short as 20-40 bytes.

The Amiga file system has no built-in controls to minimize fragmentation and to place directory contents close to one another. Reading the contents of a directory can involve visiting directory entry blocks which are widely scattered across the volume.

The DCFS mode attempts to reduce the amount of data which needs to be retrieved when a directory is being scanned, and also to minimize the number of disk accesses that have to take place. This is achieved by introducing a new file system block type which stores a compact form of the directory contents. This new file system block type is the "Directory list block", which implements a "cache" of the directory contents. This is what gives the Directory Caching File System its name.

Data structure changes and new data structures

The DCFS mode reuses the Amiga file system data structures which are utilized in the original file system (OFS) and the Fast File System (FFS) and builds upon them by adding directory lists blocks. The directory list blocks are attached to the root directory and all user directories through a field which was, up to 1992, unused.

Standard root and user directory blocks (512 bytes) for OFS and FFS would look like this:

   struct RootBlock {
        LONG    Type;           /* This is used to mark the type of this
                                   block and is set to TYPE_SHORT (2) */
        ULONG   OwnKey;         /* Not used by root, must be set to 0 */
        ULONG   SeqNum;         /* Not used by root, must be set to 0 */
        ULONG   HTSize;         /* Size of the hash table in longwords,
                                   must be set to 72 */
        ULONG   Reserved1;      /* reserved for future revs, must be 0 */
        LONG    Checksum;       /* balance to 0 checksum.  When all longs
                                   in the block are added (ignoring carry)
                                   the sum should be 0 */
        ULONG   HashTable[72];  /* hash table containing block numbers of
                                   files and directories in the root */
        LONG    BitmapFlag;     /* flag to say whether the bitmap is valid
                                   or not.  -1=valid. 0=invalid.  If a
                                   partition is mounted (or uninhibited)
                                   with BitmapFlag = 0 then the validator
                                   will kick in to rebuild the bitmap */
        ULONG   BitmapKeys[25]; /* An array of block numbers for the bitmap
                                   on this partition.  A block number of 0
                                   indicates the end of the list. */
        ULONG   BitmapExtend;   /* If all of the BitmapKeys have been used
                                   then this will contain the block number
                                   of a disk block containing more Bitmap-
                                   Keys.  0 means no extender block present.
                                   OLDFS has a bug where it assumes that
                                   the root has protection bits and attempts
                                   to set them.  Since they coincide with this
                                   field, this is why there is a 53Meg limit
                                   on partition size under OLDFS */
        ULONG   DirAltered[3];  /* A DOS DateStamp indicating the date when
                                   the root block was last modified or a
                                   file in the root block was modified */
        char    Name[40];       /* Name of this volume as a BCPL string
                                   with number of bytes as the first
                                   character.  Only 30 chars used */
        ULONG   DiskAltered[3]; /* A DOS DateStamp indicating the date when
                                   any file or section of the partition was
                                   modified.  (FFS has a bug which prevents
                                   it from updating this correctly, the
                                   DirAltered date gets changed instead) */
        ULONG   DiskMade[3];    /* A DOS DateStamp indicating the date when
                                   this partition was first formatted and
                                   therefore created */
        ULONG   Reserved2;      /* reserved for future revs, must be 0 */
        ULONG   Reserved3;      /* reserved for future revs, must be 0 */
        ULONG   Reserved4;      /* reserved for future revs, must be 0 */
        LONG    SecondaryType;  /* Qualifier to Type.  Will be set to
                                   ST_ROOT (1) */
   };
   struct UserDirectoryBlock {
        LONG    Type;           /* This is used to mark the type of this
                                   block and is set to TYPE_SHORT (2) */
        ULONG   OwnKey;         /* Set to directory's own block number */
        ULONG   Reserved1;      /* Not used, must be set to 0 */
        ULONG   Reserved2;      /* Not used, must be set to 0 */
        ULONG   Reserved3;      /* Not used, must be set to 0 */
        LONG    Checksum;       /* balance to 0 checksum.  When all longs
                                   in the block are added (ignoring carry)
                                   the sum should be 0 */
        ULONG   HashTable[72];  /* hash table containing block numbers of
                                   files and directories in this directory */
        LONG    Reserved4;      /* Not used, must be set to 0 */
        LONG    OwnerXID;       /* Owner UID/GID */
        ULONG   Protection;     /* Protection bits for this directory */
        LONG    Reserved6;      /* Not used, must be set to 0 */
        char    Comment[92];    /* Directory comment as a BCPL string, only
                                   80 characters can be used including the
                                   length byte at the beginning */
        ULONG   Created[3];     /* DOS DateStamp struct indicating when
                                   this directory was created or modified */
        char    DirName[36];    /* name of this directory as a BCPL string.
                                   only 30 characters are used */
        LONG    Reserved7[7];   /* Not used, must be set to 0 */
        ULONG   HashChain;      /* block number of the next file on the
                                   hashchain if there's a hashing collision.
                                   0 means no next file or directory. */
        ULONG   Parent;         /* Block number of the parent directory of
                                   this directory.  */
        ULONG   Reserved8;      /* Not used. must be set to 0 */
        LONG    SecondaryType;  /* Qualifier to Type.  Will be set to
                                   ST_USERDIR (2) */
 
   };

In DCFS mode the RootBlock.Reserved4 and the UserDirectoryBlock.Reserved8 fields, respectively, contain the block number of the first directory list block for the directory.

Each directory list block begins with a header consisting of six fields, which is then followed by a variable number of list entries. The directory list block header looks as follows:

   struct FileListBlock {
        LONG    Type;           /* This is used to mark the type of this
                                   block and is set to TYPE_DIRLIST_REV1 (33) */
        ULONG   OwnKey;         /* Set to file list block's own block number */
        ULONG   Parent;         /* Block number of the directory which this
                                   directory list block is associated with  */
        ULONG   NumEntries;     /* Number of directory entries in this block */
        ULONG   NextBlock;      /* Block number of next block in the list, or 0 */
        LONG    Checksum;       /* Balance to 0 checksum.  When all longs
                                   in the block are added (ignoring carry)
                                   the sum should be 0 */
   };

Each directory list entry begins with a header of constant size, which is followed by a variable number of bytes which contain the name and the comment text. The directory list entry is a "digest" of the directory/file/hard link/soft link block contents, which includes the name, comment and other metadata:

   struct FileListEntry {
        ULONG   Key;            /* Number of the block which this
                                   entry is a "digest" of; this would be a
                                   file/directory/hard link/soft link header
                                   block */
        ULONG   DataSize;       /* Number of bytes in this file; relevant only
                                   for files */
        ULONG   Protection;     /* Protection bits for this file/directory */
        LONG    OwnerXID;       /* Owner UID/GID */
        UWORD   Created[3];     /* Truncated DOS DateStamp struct indicating when
                                   this directory entry was created or modified */
        BYTE    Type;           /* Directory entry type, e.g. ST_USERDIR (2),
                                   ST_FILE (-3), etc. */
        UBYTE   Name[];         /* Name as variable length BCPL string */
        UBYTE   Comment[];      /* Name as variable length BCPL string */
        UBYTE   Pad;            /* Padding byte if needed */
   };

How directory lists are constructed

The root directory and each user directory must contain a valid directory list. The RootBlock.Reserved4 and the UserDirectoryBlock.Reserved8 fields, respectively, contain the block number of the first directory list block for the root block/directory. This block number must never be 0.

The FileListBlock.Type field must be set to TYPE_DIRLIST_REV1 (33). While the Amiga file system for Kickstart 3.0 was under development, Commodore briefly supported a different type of directory list which can be identified as type TYPE_DIRLIST_REV0 (32). Details of the differences between TYPE_DIRLIST_REV1 and TYPE_DIRLIST_REV0 will be described later in the context of the directory list entry format.

Each directory list block is part of a chain, in which the FileListBlock.NextBlock field contains the block number of the next following entry in this chain, or 0 to indicate that no other directory list block follows.

The number of entries in this directory list block is stored in FileListBlock.NumEntries. A value of 0 indicates that this block has no directory entries stored.

How directory list entries are constructed for TYPE_DIRLIST_REV1 (33)

If the FileListBlock.Type field is set to TYPE_DIRLIST_REV1 (33) then you are dealing with the standard form of the directory list block. This is the only form which should be used in production code.

Directory list entries directly follow the directory list block header.

Each entry consists of a constant size portion (FileListEntry), followed by a name and comment string in BCPL format (i.e. a single length byte is followed by exactly as many characters as indicated by the contents of the length byte; there is no terminating NUL byte). If the total size of the constant part, the name and the comment string is an odd number, then a padding byte must be added to the entry. The padding byte should be set to 0.

Because the total size of the FileListEntry is not stored within itself, it must be calculated by looking at how long the name and comment data is, including the length byte which introduces each string. If the number obtained by adding the size of the FileListEntry and the name and comment strings (in BCPL format) is odd, 1 must be added to make it an even number (accounting for the required padding byte).

The number of directory list entries must match the value of the directory list block FileListBlock.NumEntries field.

If there is enough spare room left in the directory list block, then the list of directory list entries must be terminated by an ULONG value of 0. This looks like a partial FileListEntry with the Key field set to 0.

The contents of FileListEntry.Created field are a truncated form of the 'struct DateStamp' which can be found, for example in the UserDirectoryBlock.Created field. The seconds and microseconds fit completely within the truncated form, but the number of days does not. As such the date range is restricted to 179 years instead of 4 billion years.

How directory list entries are constructed for TYPE_DIRLIST_REV0 (32)

If the FileListBlock.Type field is set to TYPE_DIRLIST_REV0 (32) then you are dealing with a non-standard form of the directory list block which was the precursor to the standard form. Never use the TYPE_DIRLIST_REV0 (32) form in production code. The following documentation is provided for the sake of completeness only.

If the FileListBlock.Type field is set to TYPE_DIRLIST_REV0 then the format of the directory list entries is slightly different:

   struct FileListEntry_Rev0 {
        ULONG   Key;            /* Number of the block which this
                                   entry is a "digest" of; this would be a
                                   file/directory/hard link/soft link header
                                   block */
        ULONG   DataSize;       /* Number of bytes in this file; relevant only
                                   for files */
        ULONG   Protection;     /* Protection bits for this file/directory */
        UWORD   Created[3];     /* Truncated DOS DateStamp struct indicating when
                                   this directory entry was created or modified */
        BYTE    Type;           /* Directory entry type, e.g. ST_USERDIR (2),
                                   ST_FILE (-3), etc. */
        UBYTE   Name[];         /* Name as variable length BCPL string */
        UBYTE   Comment[];      /* Name as variable length BCPL string */
        UBYTE   Pad;            /* Padding byte if needed */
   };

In revision 0 the OwnerXID field is omitted from the constant size directory list entry header.

Limitations and drawbacks

Because a copy of the metadata of the directory contents is made, this copy needs to be updated whenever the original changes. This alone makes DCFS mode slower than the comparable OFS/FFS variant when directory contents are modified.

Additional complications may arise if a name or comment is changed so that the corresponding directory list entry becomes too large to fit into its directory list block. In this case additional storage space will need to be allocated for a new directory list block, the old block will have to be rewritten, the new block will have to be initialized and linked up. If no storage space is available, the rename/comment change operation will have to be undone. Both the successful case (new entry created) and the failing case (not enough store space, undoing the change) are making a directory update significantly more complex.

Maintaining the consistency of the directory cache is the responsibility of the file system validation process. If the volume needs to be rebuilt, two passes are necessary in order to first find out which data blocks are used, and then to rebuild the directory cache. This will easily double the total validation time.

Due to the complexity of maintaining the directory cache, any update takes significantly longer than in the comparable OFS/FFS variants of the file system, and it modifies more data structures. During this operation the file system data structures are not entirely consistent at all times, and if the system should crash or the power should fail, then a validation may stall because of inconsistencies. The file system will have to be repaired. The DCFS mode makes this case more likely to appear and then adds insult to injury by making the validation process at least twice as long as with the comparable OFS/FFS variants of the file system.

There is no "garbage collection" process for the directory cache. If most of the entries are removed from a directory list block no merging of neighbouring directory list blocks takes place. If all entries are removed from a directory list block, that block is not removed from the directory list and its storage space is not reclaimed. Only deleting the empty directory which contains the empty directory list chain will free that memory.

Long Name File System (LNFS)

History and purpose

The LNFS mode was introduced early during development of the FFS reimplementation which eventually shipped with AmigaOS 4. These changes were introduced by Olaf Barthel, who created the FFS reimplementation in 2001.

The Amiga operating system has always supported long file names in the dos.library functions which retrieved the contents of directories (107 characters per entry). The Kickstart 3.0 ExAll() API function did not even place an upper limit on the length of a directory entry name. However, the default ROM file system only allowed for up to 30 characters per name to be used.

The LNFS mode removes the 30 character limitation for the names of files, directories, hard links and soft links. The root directory name is, however, still restricted to 30 characters.

There is an additional feature in LNFS which is intended to improve performance when an LNFS volume is mounted. The file system no longer requires that the bookkeeping data structure (the "bitmap") which tracks the blocks currently allocated for data to be scanned completely before a read or write access can be made. Reading the entire bitmap just for the purpose of counting the number of bits set is costly. LNFS stores the number of blocks used as a counter value in the root directory. This counter will be updated for each change made to the bitmap, and should a validation be required, then the counter will be updated to reflect the status of the bitmap after validation.

Data structure changes and new data structures

The long file name support was added by changing the layout of the canonical file/directory header block and by introducing a new block type. To illustrate, here is the default layout of a directory entry with 512 bytes per block; note the section enclosed in dashes:

   struct UserDirectoryBlock {
      LONG  Type;          /* This is used to mark the type of this
                              block and is set to TYPE_SHORT (2) */
      ULONG OwnKey;        /* Set to directory's own block number */
      ULONG Spare1[3];     /* Not used, must be set to 0 */
      LONG  Checksum;      /* balance to 0 checksum.  When all longs
                              in the block are added (ignoring carry)
                              the sum should be 0 */
      ULONG HashTable[72]; /* hash table containing block numbers of
                              files and directories in this directory */
      LONG  Spare2;        /* Not used, must be set to 0 */
      UWORD OwnerID;       /* ID of owner that may access this directory */
      UWORD GroupID;       /* ID of group that may access this directory */
      ULONG Protection;    /* Protection bits for this directory */
      LONG  Spare3;        /* Not used, must be set to 0 */
----------------------------------------------------------------------
      char  Comment[80];   /* Directory comment as a BCPL string, only
                              80 characters can be used including the
                              length byte at the beginning */
      LONG  Spare4[3];     /* Not used, must be set to 0 */
      ULONG Created[3];    /* DOS DateStamp struct indicating when
                              this directory was created or modified */
      char  DirName[32];   /* name of this directory as a BCPL string.
                              only 30 characters are used */
----------------------------------------------------------------------
      LONG  Spare5[2];     /* Not used, must be set to 0 */
      ULONG FirstLink;     /* Number of first hard link block that points
                            * to this directory */
      LONG  Spare6[5];     /* Not used, must be set to 0 */
      ULONG HashChain;     /* block number of the next file on the
                              hashchain if there's a hashing collision.
                              0 means no next file or directory. */
      ULONG Parent;        /* Block number of the parent directory of
                              this directory. */
      ULONG DirList;       /* Number of associated directory cache
                              block list. */
      LONG  SecondaryType; /* Qualifier to Type.  Will be set to
                              ST_USERDIR (2) */
   };

The section between the dashes is common to all directory entries, i.e. directories, files, hard links and soft links. Each of these blocks contains a comment, a modification/creation date stamp and a name.

A little research revealed that comments are used rather rarely with the Amiga file system. At best, special applications such as web browsers or FTP clients use the comment field to store additional information about cache entries or the origin of downloaded files. While the comment field is definitely useful, the space it occupies would be better put to use to allow for longer file names. Here is how the layout of the block was changed to allow for longer file names:

   struct LongNameUserDirectoryBlock {
      LONG  Type;          /* This is used to mark the type of this
                              block and is set to TYPE_SHORT (2) */
      ULONG OwnKey;        /* Set to directory's own block number */
      ULONG Spare1[3];     /* Not used, must be set to 0 */
      LONG  Checksum;      /* balance to 0 checksum.  When all longs
                              in the block are added (ignoring carry)
                              the sum should be 0 */
      ULONG HashTable[72]; /* hash table containing block numbers of
                              files and directories in this directory */
      LONG  Spare2;        /* Not used, must be set to 0 */
      UWORD OwnerID;       /* ID of owner that may access this directory */
      UWORD GroupID;       /* ID of group that may access this directory */
      ULONG Protection;    /* Protection bits for this directory */
      LONG  Spare3;        /* Not used, must be set to 0 */
----------------------------------------------------------------------
      char  NaC[112];      /* Merged name and comment */
      ULONG CommentBlock;  /* Number of the block which holds the
                              associated comment string */
      LONG  Spare4[2];     /* Not used, must be set to 0 */
      ULONG Created[3];    /* DOS DateStamp struct indicating when
                              this directory was created or modified */
----------------------------------------------------------------------
      LONG  Spare5[2];     /* Not used, must be set to 0 */
      ULONG FirstLink;     /* Number of first hard link block that points
                            * to this directory */
      LONG  Spare6[5];     /* Not used, must be set to 0 */
      ULONG HashChain;     /* block number of the next file on the
                              hashchain if there's a hashing collision.
                              0 means no next file or directory. */
      ULONG Parent;        /* Block number of the parent directory of
                              this directory. */
      ULONG DirList;       /* Number of associated directory cache
                              block list. */
      LONG  SecondaryType; /* Qualifier to Type.  Will be set to
                              ST_USERDIR (2) */
   };

As you can see, the name and comment fields were merged and the date stamp pushed down a few long words.

In the new layout the 'NaC' entry contains both the name and the comment, as two BCPL strings. The name and comment are stored that way only if both will fit into the 112 bytes available. For example, a directory by the name of "barney" with a comment "fred" will have the following bytes stored in the 'NaC' field (the numbers in brackets are the respective BCPL string length bytes):

  [6] b a r n e y [4] f r e d

By comparison, a directory by the name of "wilma" with no comment stored in the directory block would have the following bytes in the 'NaC' field:

  [5] w i l m a [0]

If the comment string stored in this character array is empty, it does not necessarily mean that the respective directory has no comment attached. It might mean that the name and the comment do not both fit into the directory entry block and that the comment is stored separately in its own block. If the latter is the case, you will find the number of the comment block in the 'CommentBlock' field; if this number is zero, then there is no comment block associated with this directory entry. The comment block's layout looks like this:

   struct CommentBlock {
      LONG  Type;        /* This is used to mark the type of this
                            block and is set to TYPE_COMMENT (64) */
      ULONG OwnKey;      /* Set to comment block's own block number */
      ULONG HeaderKey;   /* The number of the directory entry block
                            which the comment is associated with */
      ULONG Spare1[2];   /* Not used, must be set to 0 */
      LONG  Checksum;    /* balance to 0 checksum.  When all longs
                            in the block are added (ignoring carry)
                            the sum should be 0 */
      char  Comment[80]; /* Directory comment as a BCPL string, only
                            80 characters can be used including the
                            length byte at the beginning */
      ULONG Spare2[102]; /* Not used, must be set to 0 */
   };

There would be more room for the comment to be stored, but unfortunately, the AmigaDOS API allows only for up to 79 characters to be used for a comment string, as returned using the 'ExAll()' and 'Examine()/ExNext()' routines.

Note that the root directory itself is unaffected by the changes introduced to support long file names. Since there is no comment field in the root directory, the size of the volume name is still limited to 30 characters. However, one small enhancement was made to shorten the time it takes for the file system to initialize itself. Since the file system may have to report how many blocks are still available for allocation, the typical path it takes involves counting all bits that are set in the bitmap blocks. This can take its time, especially for larger volumes. This process is not strictly necessary as the information could be cached and later reread. This is what the long name variant of the file system does. The number of allocated blocks is stored in the root directory. Also stored in the root directory is the signature of the file system type. Here is how the root block would look like for a 512 byte block (note the sections enclosed in dashes):

   struct RootBlock {
      LONG  Type;           /* This is used to mark the type of this
                               block and is set to TYPE_SHORT (2) */
      ULONG OwnKey;         /* Not used by root, must be set to 0 */
      ULONG SeqNum;         /* Not used by root, must be set to 0 */
      ULONG HTSize;         /* Size of the hash table in longwords,
                               must be set to 72 */
      ULONG Reserved0;      /* reserved for future revs, must be 0 */
      LONG  Checksum;       /* balance to 0 checksum.  When all longs
                               in the block are added (ignoring carry)
                               the sum should be 0 */
      ULONG HashTable[72];  /* hash table containing block numbers of
                               files and directories in the root */
      LONG  BitmapFlag;     /* flag to say whether the bitmap is valid
                               or not.  -1=valid. 0=invalid.  If a
                               partition is mounted (or uninhibited)
                               with BitmapFlag = 0 then the validator
                               will kick in to rebuild the bitmap */
      ULONG BitmapKeys[25]; /* An array of block numbers for the bitmap
                               on this partition.  A block number of 0
                               indicates the end of the list. */
      ULONG BitmapExtend;   /* If all of the BitmapKeys have been used
                               then this will contain the block number
                               of a disk block containing more Bitmap-
                               Keys.  0 means no extender block present.
                               OLDFS has a bug where it assumes that
                               the root has protection bits and attempts
                               to set them.  Since they coincide with this
                               field, this is why there is a 53Meg limit
                               on partition size under OLDFS */
      ULONG DirAltered[3];  /* A DOS DateStamp indicating the date when
                               the root block was last modified or a
                               file in the root block was modified */
      char  Name[32];       /* Name of this volume as a BCPL string
                               with number of bytes as the first
                               character.  Only 30 chars used */
      ULONG Reserved1;      /* reserved for future revs, must be 0 */
----------------------------------------------------------------------
      ULONG NumBlocksUsed;  /* Total number of blocks allocated
                               by bitmap */
----------------------------------------------------------------------
      ULONG DiskAltered[3]; /* A DOS DateStamp indicating the date when
                               any file or section of the partition was
                               modified. */
      ULONG DiskMade[3];    /* A DOS DateStamp indicating the date when
                               this partition was first formatted and
                               therefore created */
----------------------------------------------------------------------
      ULONG FileSystemType; /* The file system signature for this
                               partition, e.g. DOS\6 or DOS\7. This
                               repeats the signature of the first
                               partition block. Note that this entry
                               is zero for all other file system
                               types */
----------------------------------------------------------------------
      ULONG Reserved2;      /* reserved for future revs, must be 0 */
      ULONG DirList;        /* Number of associated directory cache
                               block list. */
      LONG  SecondaryType;  /* Qualifier to Type.  Will be set to
                               ST_ROOT (1) */
   };

The 'NumBlocksUsed' field is valid only if the root block's 'BitmapFlag' field indicates that the bitmap is valid and consistent. Otherwise, it should be ignored. If the 'NumBlocksUsed' field happens to be zero, it also means that it should be ignored and that the number of allocated blocks will need to be recalculated from the bitmap on disk.

No change is made to the hashing algorithm. It is the same that is being used for international mode, only that the names of the files can be longer than 30 characters.

In total, the number of changes made to support long file names are rather small. This was done in order to make the transition from the old Amiga file system types to the long file name mode smoother, and to reduce the risk of introducing errors into the design and implementation. In fact, one could write a conversion program that scans the file system and converts all directory entry blocks to the new format without requiring the directory contents to be rehashed or additional blocks to be allocated.

Limitations and drawbacks

The LNFS mode exists in two variants only, this being the OFS mode and the FFS mode. Except for the extensions to the name length these modes operate like the traditional OFS and FFS modes, respectively. There is no DCFS variant of the LNFS mode.

The LNFS mode always uses the "international" directory entry name hashing operation.

Because the changes required in order to support the long file names of the LNFS mode are minimal, all the limitations of the original Amiga OFS and FFS modes are still present.