Most device drivers for disk controllers provide an interface based on
reading and writing sectors. The I/O modules
abstract these sectors into blocks. There are NASD_OD_SECTORS_PER_BLK
sectors in each block. Blocks contain NASD_OD_BASIC_BLOCKSIZE bytes.
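The sector-to-block relationship can be sketched in C as follows. This is only an illustration: the EX_ constants are hypothetical stand-ins (the real values come from the NASD headers and may differ), and the helper names are not part of NASD.

```c
#include <stdint.h>

/* Hypothetical values for illustration only; the real constants are
 * defined by the NASD headers and may differ. */
#define EX_SECTOR_SIZE      512u  /* assumed bytes per sector */
#define EX_SECTORS_PER_BLK  16u   /* stands in for NASD_OD_SECTORS_PER_BLK */
#define EX_BASIC_BLOCKSIZE  (EX_SECTOR_SIZE * EX_SECTORS_PER_BLK)

/* First sector covered by a block, relative to the start of the region. */
static uint64_t ex_block_to_first_sector(uint64_t blk)
{
    return blk * EX_SECTORS_PER_BLK;
}

/* Block that contains a given sector. */
static uint64_t ex_sector_to_block(uint64_t sector)
{
    return sector / EX_SECTORS_PER_BLK;
}

/* Number of blocks needed to hold len bytes, rounded up. */
static uint64_t ex_blocks_for_bytes(uint64_t len)
{
    return (len + EX_BASIC_BLOCKSIZE - 1) / EX_BASIC_BLOCKSIZE;
}
```

With these assumed values a block holds 8192 bytes, so block 3 begins at sector 48 and an 8193-byte object needs two blocks.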
The NASD drive may be thought of as a simple flat filesystem, in which filenames
are replaced with NASD identifiers. Each file is represented by an inode
(type nasd_od_node_t).
The inode contains key metadata for each object, including timestamps, logical
length, filesystem-specific data, and other information. If the object length
is less than or equal to NASD_ND_ATOMIC_SIZE bytes, then the entirety
of the object data may be stored in the inode. This condition is indicated by
setting a status bit in the inode. The inode also contains pointers to direct
and indirect blocks which correspond to the data contents of the object. Any
such pointer at any level of indirection may be set to block number zero, which
means that the range of logical object space covered by that pointer
is zero-filled.
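A minimal sketch of how a reader might treat a zero block pointer, assuming a simplified inode with only direct pointers. The structure, field names, constants, and read callback are hypothetical and do not reflect the actual layout of nasd_od_node_t.

```c
#include <stdint.h>
#include <string.h>

#define EX_BLOCKSIZE 8192u          /* assumed block size in bytes */
#define EX_NDIRECT   4u             /* hypothetical number of direct pointers */

typedef uint32_t ex_blkno_t;        /* stands in for nasd_blkno_t */

typedef struct {
    ex_blkno_t direct[EX_NDIRECT];  /* 0 means the range reads as zeros */
} ex_node_t;

/* Hypothetical callback that reads one physical block into buf. */
typedef int (*ex_read_blk_fn)(ex_blkno_t blk, unsigned char *buf);

/* Read logical block lblk of an object; returns 0 on success, -1 on error. */
static int ex_read_logical(const ex_node_t *node, unsigned lblk,
                           ex_read_blk_fn read_blk, unsigned char *buf)
{
    if (lblk >= EX_NDIRECT)
        return -1;                     /* indirect blocks omitted in this sketch */
    if (node->direct[lblk] == 0) {
        memset(buf, 0, EX_BLOCKSIZE);  /* zero pointer: no disk I/O needed */
        return 0;
    }
    return read_blk(node->direct[lblk], buf);
}
```

The same check would apply at each level of indirection: a zero pointer to an indirect block would mean that the whole range it covers reads as zeros.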
The majority of the disk is what is known as the data region. This region
is addressed by physical block numbers (which have type
nasd_blkno_t). All inodes, indirect blocks, and direct blocks
are located in this region.
It is worth noting that the partitions implemented by the drive are soft partitions. That is, although the storage allocated to objects within a partition may never exceed the size of the partition, there is no physical region of the drive bound to a partition. The inode, data, and indirect blocks of objects in different partitions may freely intermingle. Only the layout module is concerned with where these blocks are placed.
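The idea of a soft partition can be illustrated as pure accounting: a partition tracks how many blocks it has been charged, with no bound on where those blocks live. The sketch below assumes hypothetical field and function names and is not the drive's actual allocation code.

```c
#include <stdint.h>

/* Hypothetical per-partition accounting record. */
typedef struct {
    uint64_t part_size;         /* maximum blocks the partition may hold */
    uint64_t blocks_allocated;  /* blocks currently charged to it */
} ex_partition_t;

/* Charge nblocks to a partition. Returns 0 on success, or -1 if the
 * allocation would exceed the partition size. Assumes the invariant
 * blocks_allocated <= part_size holds on entry. */
static int ex_part_charge(ex_partition_t *p, uint64_t nblocks)
{
    if (nblocks > p->part_size - p->blocks_allocated)
        return -1;
    p->blocks_allocated += nblocks;
    return 0;
}
```

Nothing in this check refers to a physical block number, which is the point: placement is left entirely to the layout module.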
Each disk has a header structure of type nasd_od_disk_t, which
contains key information such as blocks allocated to various partitions, the
total number of unallocated blocks, the drive-global keys, the layout type
the drive is using, and the location of the inode hash table. This structure
occupies a single physical sector. Two copies of this structure are maintained,
one at each end of the disk (the first and last sectors). The structures contain
timestamps, which allow recovery operations to determine which version
is newer (and allow more efficient updates). Inode locations are recorded
in the inode hash table. The inode number forms an index into this table,
which is recorded on disk, as described below.
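As an illustration of how recovery might choose between the two header copies, the following sketch compares timestamps. The structure and its fields are hypothetical (in particular, the validity flag is an assumption added for the example) and do not reflect the real contents of nasd_od_disk_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a disk header copy. */
typedef struct {
    uint64_t mod_time;  /* assumed timestamp of the last header update */
    int      valid;     /* assumed flag: nonzero if the copy read back intact */
} ex_disk_hdr_t;

/* Pick the header copy to trust: the newer of the two valid copies,
 * the only valid copy, or NULL if neither is usable. */
static const ex_disk_hdr_t *ex_pick_header(const ex_disk_hdr_t *first,
                                           const ex_disk_hdr_t *last)
{
    if (!first->valid && !last->valid)
        return NULL;
    if (!first->valid)
        return last;
    if (!last->valid)
        return first;
    return (last->mod_time > first->mod_time) ? last : first;
}
```

On a tie the sketch prefers the copy in the first sector; that choice is arbitrary.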
Following the disk header is a region of reference count blocks. These blocks maintain reference counts on the blocks in the data region, as well as the blocks in the inode hash table. These reference counts are intended to support blockwise copy-on-write, but that is not yet implemented.
Two copies of the inode hash table are maintained; both are located after the reference
count blocks. The reference count on an inode hash table block represents
the number of inodes that actually exist in the block. Thus, the number
of empty slots in an inode hash table block may be determined without
reading in the block by examining its reference count. This is an optimization
to reduce the amount of I/O needed to create a new object.
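The optimization can be sketched as a scan over in-memory reference counts that finds a hash table block with room for a new inode without reading any hash table block from disk. The slot count per block and all names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_SLOTS_PER_HASH_BLK 64u  /* hypothetical inode slots per block */

/* Empty slots in a hash table block, derived from its reference count. */
static unsigned ex_free_slots(uint32_t refcnt)
{
    return (refcnt >= EX_SLOTS_PER_HASH_BLK)
               ? 0u
               : EX_SLOTS_PER_HASH_BLK - refcnt;
}

/* Index of the first hash table block with an empty slot, or -1 if
 * every block is full. Only the reference counts are consulted. */
static long ex_find_block_with_space(const uint32_t *refcnts, size_t nblocks)
{
    size_t i;
    for (i = 0; i < nblocks; i++) {
        if (ex_free_slots(refcnts[i]) > 0)
            return (long)i;
    }
    return -1;
}
```

Only the block that is finally chosen needs to be read in order to locate the specific free slot.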