Re: TCP RDMA option to accelerate NFS, CIFS, SCSI, etc.

To: ips@ece.cmu.edu, tcp-impl@grc.nasa.gov
Subject: Re: TCP RDMA option to accelerate NFS, CIFS, SCSI, etc.
From: Vernon Schryver <vjs@calcite.rhyolite.com>
Date: Sun, 27 Feb 2000 17:44:12 -0700 (MST)
Delivery-Date: Sun Feb 27 19:44:49 2000
Sender: owner-ips@ece.cmu.edu

> From: "David R. Cheriton" <cheriton@cisco.com>

> ...
>   Clearly, data is being received from hardware and software does not
> get to touch it until it has been stored to some memory.  My 
> assumption is that the storage system memory is arranged in fixed
> size pages of disk/file pages.  Without hardware RDMA to the storage
> level, I believe one requires an extra copy, from whatever the
> hardware delivers to what the storage system expects.  Either
> you use twice the bandwidth in the storage system memory system or
> else you have a separate memory system for the network, and
> have software/processor power adequate to copy between at wire
> speed (with all the associated support facilities for this processor.)
>  Unless there is something wrong with this reasoning,
> it seems like a cost issue of providing the above hardware resources
> vs. providing a NIC chip that can RDMA.  

Depending on how you are counting copies, that reasoning has been wrong
in commercial UNIX systems for more than 10 years.
Do you use the RDMA bits before the IP checksum, the TCP checksum, and the
medium FCS or checksum have been checked?  If not, and you receive the
entire link layer frame into some kind of temporary buffer or FIFO,
probably in the "network interface card/controller," to check the trailing
FCS before using the RDMA bits, then commercial UNIX systems have been
doing as you say to save copies since the late 1980s.  As I said before,
such systems were a part of what killed Protocol Engines Inc.

If you do use the RDMA bits in the TCP header after 50-60 bytes of
the frame have arrived, but before the frame FCS, aren't you worried
about bit rot in the RDMA bits?
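
To make the ordering concern concrete, here is a minimal sketch of the
"check the trailing FCS first" approach: the whole frame sits in a staging
buffer, and the placement hint is read only after the trailing CRC-32
matches.  The frame layout (a 4-byte hint, the payload, then a 4-byte
CRC-32 stored little-endian) and the names build_frame and accept_frame are
assumptions made for illustration, not the format of any real link layer or
of the proposed TCP option.

```python
import zlib

HINT_LEN = 4  # hypothetical placement hint at the front of the frame
FCS_LEN = 4   # trailing CRC-32 over everything before it


def build_frame(hint: int, payload: bytes) -> bytes:
    """Sender side of the toy format: hint + payload + CRC-32 trailer."""
    body = hint.to_bytes(HINT_LEN, "big") + payload
    return body + zlib.crc32(body).to_bytes(FCS_LEN, "little")


def accept_frame(frame: bytes):
    """Return (hint, payload) only if the trailing CRC matches, else None.

    The hint is not even parsed until the whole frame has been validated,
    so a corrupted hint can never steer data to the wrong place.
    """
    if len(frame) < HINT_LEN + FCS_LEN:
        return None
    body, fcs = frame[:-FCS_LEN], frame[-FCS_LEN:]
    if zlib.crc32(body).to_bytes(FCS_LEN, "little") != fcs:
        return None
    return int.from_bytes(body[:HINT_LEN], "big"), body[HINT_LEN:]
```

A frame with a single flipped bit in the hint is rejected as a whole.  The
cost is that the frame must be staged somewhere before placement, which is
the store-and-forward case described above.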

> My guesstimate is that the software-only approach would be easily
> 10 times more expensive here at the higher speed rates, of 10 Gbps.
> If there is serious doubt about the merits of real hardware support,
> we should try to quantify costs further at these speed ranges, IMHO.

By "expensive," are you talking about dollars or bits/second?

Regardless, if you look at the number of CPU cycles or gates in custom
silicon required to support incoming page flipping in old, existing
implementations, I bet you'll find that they are less "expensive" than
any likely RDMA implementation.  Power-of-2 modular arithmetic is awfully
cheap compared to parsing and validating TCP options.
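
As a rough illustration of how little arithmetic the page-flipping decision
needs, the following sketch tests alignment with a single AND against a
power-of-2 mask.  The 4096-byte page size and the function names are
assumptions for the example; real implementations did this in driver code
or silicon, not in Python.

```python
PAGE_SIZE = 4096           # assumed page size; must be a power of two
PAGE_MASK = PAGE_SIZE - 1  # low bits select the offset within a page


def is_page_aligned(value: int) -> bool:
    """Same result as value % PAGE_SIZE == 0, but with one AND."""
    return (value & PAGE_MASK) == 0


def can_page_flip(buf_addr: int, length: int) -> bool:
    """A payload is a flip candidate when it starts on a page boundary
    and covers a whole, nonzero number of pages."""
    return length > 0 and is_page_aligned(buf_addr) and is_page_aligned(length)


def whole_pages(length: int) -> int:
    """Complete pages in length, using a shift instead of a divide."""
    return length >> (PAGE_SIZE.bit_length() - 1)
```

Each check is a mask or a shift on values the receiver already has, with no
per-packet option parsing involved.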


> ...
> It would help me to have a more careful definition of the types of
> attacks you have in mind.  In an unsecure network with intruders,
> presumably I can end up with bad data in the right buffer
> or right data in the wrong buffer without using RDMA.
> Do you view we have made things worse, and if so, how?
> or are you objecting to us not making things better?

Is it possible for a bad guy to use RDMA to put bad data into memory
that is not a buffer?

If the RID does no more than choose from a safe list of buffers, then how
does RDMA usefully differ from the old FDDI, ATM, and HIPPI implementations
that put incoming page-flippable data in buffers that get into user space
with the data having been seen on the system bus only once, the absolute
minimum for any scheme, including RDMA?
Systems I've worked on have done mbuf allocation in the network interface
hardware, including putting page-flippable payloads into page-mbufs that
can eventually be flipped into user space.  And of course, they take care
of the TCP or UDP checksum.
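
The "safe list of buffers" case can be sketched as follows, assuming the
RID is nothing more than an index into a table of buffers the receiver has
set aside in advance.  The class name BufferTable and its methods are
hypothetical; the point is only that an unknown RID or an out-of-range
offset is refused, so incoming data can land nowhere except in a listed
buffer.

```python
class BufferTable:
    """Toy receiver-side table of pre-registered, fixed-size buffers."""

    def __init__(self, count: int, size: int):
        self._buffers = [bytearray(size) for _ in range(count)]

    def place(self, rid: int, offset: int, data: bytes) -> bool:
        """Copy data into buffer rid at offset; refuse anything out of range."""
        if not 0 <= rid < len(self._buffers):
            return False  # RID names no registered buffer
        buf = self._buffers[rid]
        if offset < 0 or offset + len(data) > len(buf):
            return False  # write would spill past the buffer
        buf[offset:offset + len(data)] = data
        return True

    def contents(self, rid: int) -> bytes:
        return bytes(self._buffers[rid])
```

With that restriction in place, the identifier only selects among buffers
the receiver already chose, which is the situation the question above
compares with the older FDDI, ATM, and HIPPI implementations.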

Given the recently described extensions to readv(), absolutely
all data received by a system like that would be page-flippable,
and without needing the silicon or CPU cycles to parse RDMA options
or requiring the sender to send RDMA options or even know that the
receiver is being fast.


Vernon Schryver    vjs@rhyolite.com

Follow-Ups:
- NFS Header/data parsing and RDMA
  - From: Costa Sapuntzakis <csapuntz@cisco.com>
