NASD Enabling Technologies

[ NASD Home |
Work at CMU |
Related Work |
Technology Transfer ]
[ DARPA Highlights
| Recent Talks
| Publications
| NASD Code
Downloads ]
Storage architecture is ready to change
as a result of the synergy between five overriding factors:
Traditional distributed filesystem workloads
are dominated by small random accesses to small files whose sizes are slowly
growing with time. In contrast, new workloads are much more I/O-bound,
including data types such as video and audio, and applications such as
data mining of retail transactions, medical records, or telecommunication
call records.
Disk bandwidth is increasing at rate
of 40% per year. High transfer rates have increased pressure on the physical
and electrical design of drive busses, dramatically reducing maximum bus
length. At the same time, people are building systems of clustered computers
with shared storage. Therefore, the storage industry is beginning to encapsulate
drive communication over Fibrechannel, a serial, switched, packet-based
peripheral network that supports long cable lengths, more ports, and more
bandwidth. NASD will evolve the SCSI command set that is currently being
encapsulated over Fibrechannel to take full advantage of the switched-network
technology for both higher bandwidth and increased flexibility.
The increasing transistor density in
inexpensive ASIC technology has lowered disk drive cost and increased performance
by integrating sophisticated special-purpose functional units into a small
number of chips. Figure 1 shows the block diagram for the ASIC at the heart
of Quantum’s Trident drive. When drive ASIC technology advances from 0.68
micron CMOS to 0.35 micron CMOS, they will be able to insert a 200 MHz
StrongARM microcontroller, leaving space equivalent to 100,000 gates for
functions such as onchip DRAM or cryptographic support. While this may
seem like a major jump, Siemen’s TriCore integrated microcontroller and
ASIC architecture promises to deliver a 100 MHz, 3-way issue, 32-bit datapath
with up to 2 MB of onchip DRAM and customer defined logic in 1998.
Figure 1: Quantum’s
Trident disk drive features the ASIC on the left (a). Integrated onto this
chip in four independent clock domains are 10 function units with a total
of about 110,000 logic gates and a 3 KB SRAM: a disk formatter, a SCSI
controller, ECC detection, ECC correction, spindle motor control, a servo
signal processor and its SRAM, a servo data formatter (spoke), a DRAM controller,
and a microprocessor port connected to a Motorola 68000 class processor.
By advancing to the next higher ASIC density, this same die area could
also accommodate a 200 MHz StrongARM microcontroller and still have space
left over for DRAM or additional functional units such as cryptographic
or network accelerators.
Scalable computing is increasingly based
on clusters of workstations. In contrast to the special-purpose, highly
reliable, low-latency interconnects of massively parallel processors such
as the SP2, Paragon, and Cosmic Cube, clusters typically use Internet protocols
over commodity LAN routers and switches. To make clusters effective, low-latency
network protocols and user-level access to network adapters have been proposed,
and a new adapter card interface, the Virtual Interface Architecture, is
being standardized. These developments continue to narrow the gap between the
channel properties of peripheral interconnects and the network properties of
client interconnects and make Fibrechannel and Gigabit Ethernet look more alike than
different.
In high performance distributed filesystems,
there is a high cost overhead associated with the server machine that manages
filesystem semantics and bridges traffic between the storage network and
the client network. Figure 2 illustrates this problem for bandwidth-intensive
applications. Based on cost and peak performance estimates, we compare
the expected overhead cost of a storage server as a fraction of the raw
storage cost. Servers built from high-end components have an overhead that
starts at 1,300% for one server-attached disk! Assuming dual 64-bit PCI
busses that deliver every byte into and out of memory once, the high-end
server saturates with 14 disks, 2 network interfaces, and 4 disk interfaces
with a 115% overhead cost. The low cost server is more cost effective.
One disk suffers a 380% cost overhead and, with a 32-bit PCI bus limit,
a six disk system still suffers an 80% cost overhead. While we can not
accurately anticipate the marginal increase in the cost of a NASD over
current disks, we estimate that the disk industry would be happy to charge
10% more. This bound would mean a reduction in server overhead costs of
at least a factor of 10 and in total storage system cost (neglecting the
network infrastructure) of over 50%.
Figure 2: Cost
model for the traditional server architecture. In this simple model, a
machine serves a set of disks to clients using a set of disk (wide Ultra
and Ultra2 SCSI) and network (Fast and Gigabit Ethernet) interfaces. Using
peak bandwidths and neglecting host CPU and memory bottlenecks, we estimate
the server cost overhead at maximum bandwidth as the sum of the machine
cost and the costs of sufficient numbers of interfaces to transfer the
disks’ aggregate bandwidth divided by the total cost of the disks. While
the prices are probably already out of date, the basic problem of a high
server overhead is likely to remain. We report pairs of costs and bandwidth
estimates. On the left, we show values for a low cost system built from
high-volume components. On the right, we show values for a high-performance
reliable system built from components recommended for mid-range and enterprise
servers. |