
    Re: A question on Zero Copy


    • To: ips@ece.cmu.edu
    • Subject: Re: A question on Zero Copy
    • From: Pierre Labat <pierre_labat@hp.com>
    • Date: Wed, 06 Dec 2000 00:48:43 -0800
    • Content-Type: multipart/mixed;boundary="------------D2367FF3E2D47EEB51057A8C"
    • Organization: Hewlett Packard ATM-SISL
    • Sender: owner-ips@ece.cmu.edu

    Hello,
    
    
    
    I would like to take the opportunity of this thread
    to say that the iSCSI protocol needs alignment.
    
    Why?
    
    The goal is to save (a lot of) CPU cycles in the "software implementations"
    when they process iSCSI PDUs coming in from the network.
    By software implementations I mean the ones that use a legacy Ethernet
    adapter or a generic TOE (TCP offload engine) adapter.
    
    For these implementations the incoming data cannot be placed
    directly at its final location. The CPU has to copy the data
    from the anonymous buffers that receive the incoming
    iSCSI traffic to the final location.
    
    To ease this painful task, iSCSI should allow the CPU
    to copy the maximum amount of data in each cycle. That is 8 bytes
    on current CPUs.
    
    Efforts have been made to keep the header size a multiple
    of 8 bytes. However, the PDU headers and data may still end up
    not aligned. Even if most of the time the data transferred is a
    multiple of the block size and the header size is a multiple of
    8 bytes, this does not guarantee the alignment. The alignment can be
    lost because of a command with parameters, sense data, ...
    We actually saw that in prototypes.
    The problem is that once the alignment is lost, the data flow
    can stay unaligned forever in the worst case.
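    
    As an illustration of how the alignment is lost and never recovered,
    here is a small Python sketch. The sizes are assumptions made up for
    the example (a 48-byte header, 512-byte blocks, 18 bytes of sense
    data); they are not taken from the draft. Offsets are counted from
    the first payload byte received on the connection.

```python
HEADER = 48   # header size, a multiple of 8 bytes
BLOCK = 512   # block size, also a multiple of 8 bytes
SENSE = 18    # assumed sense data length, NOT a multiple of 8 (hypothetical)

def start_offsets(pdu_lengths):
    """Offset of each PDU in the TCP payload when PDUs are sent back to back."""
    offsets, pos = [], 0
    for length in pdu_lengths:
        offsets.append(pos)
        pos += length
    return offsets

# A data PDU, a response carrying sense data, then two more data PDUs.
lengths = [HEADER + BLOCK, HEADER + SENSE, HEADER + BLOCK, HEADER + BLOCK]
misalignment = [off % 8 for off in start_offsets(lengths)]
print(misalignment)
```

    This prints [0, 0, 2, 2]: the single odd-sized PDU shifts every
    following PDU by 2 bytes, and nothing in the stream ever brings
    the alignment back.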
    
    It is an important penalty for:
    - the initiators (hosts) using a TOE card or a regular Ethernet card,
      as the CPU has to copy all the inbound data to the
      final location.
    - a less important penalty is that the headers must be copied too,
      because fields in the header have to be accessed as multibyte integers,
      and the CPU requires integers to be aligned on a multiple of
      the integer size (else panic).
    
    It can impact the targets too, depending on how they are implemented.
    For example if the target uses a CPU to copy data from
    the "recirculation buffer" to the cache it will be impacted.
    
    
    To get the best performance, the copy must use instructions that
    copy 8 bytes at a time (double word).
    That way you use 8 times fewer CPU cycles than with a byte copy.
    
    These instructions require that
    the 8-byte source be aligned on an address
    that is a multiple of the double word size (8 bytes).
    The same holds for the 8-byte destination.
    
    Even assuming that the copy is optimized (it checks the
    alignment to limit the number of instructions during the copy:
    use copy double if possible,
    then copy word, then copy halfword, and finally copy byte)
    and assuming that the alignment is uniformly distributed,
    the average size that can be copied in one CPU cycle
    is only:
    
    8*1/8 + 4*1/8 + 2*1/4 + 1*1/2 = 2.5 bytes.
    
    If the double word alignment is guaranteed, the
    size copied in each cycle would be 8 bytes.
    Hence the performance penalty is that about 3 times
    more CPU cycles (8 / 2.5 = 3.2) are needed to copy the data to
    the final location compared to what is necessary.
    To that we have to add the penalty of checking the alignment
    (negligible) and the penalty of copying the headers
    when they are not aligned.
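    
    The arithmetic above can be checked with a short Python sketch. It
    only models the assumption already stated (source alignment uniformly
    distributed over the 8 possible offsets, one copy instruction per
    cycle); the function name is made up for the example.

```python
from fractions import Fraction

def widest_copy(src_offset):
    """Widest copy (8, 4, 2 or 1 bytes) a source at this offset is aligned for."""
    for width in (8, 4, 2):
        if src_offset % width == 0:
            return width
    return 1

# Uniform distribution: each source offset modulo 8 is equally likely.
widths = [widest_copy(off) for off in range(8)]
average = Fraction(sum(widths), len(widths))   # bytes copied per cycle
penalty = Fraction(8) / average                # versus guaranteed alignment

print(widths)
print(float(average), float(penalty))
```

    The widths come out as [8, 1, 2, 1, 4, 1, 2, 1], which gives the
    2.5 bytes per cycle above and a cycle ratio of 3.2.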
    
    For the destination address alignment there is no problem.
    The buffers receiving the data (final location) are aligned
    at least on a double word boundary.
    
    The problem comes from the incoming TCP byte stream,
    where the iSCSI headers and data are not aligned on a multiple
    of 8 bytes.
    [8 bytes] alignment here means: assuming that the first byte
    of the first PDU received on the connection is numbered 0, being
    "8 bytes aligned" means being (in the TCP payload) at an offset
    that is a multiple of 8 bytes from the first byte of payload
    received on the connection.
    
    If the iSCSI headers and data are 8 bytes aligned,
    the driver/adapter can be programmed so that the beginning
    of the iSCSI headers and data is DMAed into memory on a double word
    boundary. Hence the copy can run at 8 bytes per cycle.
    
    
    To get this alignment, the following rules must be added
    to the specification:
    
    - the first byte of each iSCSI PDU header must be aligned on
      a multiple of 8 bytes (counting from the first PDU). This guarantees
      that the headers are aligned and that most of the data is aligned,
      because the data PDU and the SCSI response have a fixed size
      (48 bytes) header.
    
    - In the case of immediate data, with a command/parameters > 16 bytes,
      the immediate data may not be aligned. Maybe we need another rule
      specifying that in this case the data must be aligned on a multiple
      of 8 bytes. I am not sure the gain is worth this rule.
    
    
    The first rule changes nothing in the specification,
    except that one line specifying the rule must be added.
    The transmitter only needs to pad (with anything) up to the
    next multiple of 8 bytes before sending the next PDU.
    The receiver, when at the end of a PDU, only needs to jump to the
    next multiple of 8 bytes before interpreting the byte stream.
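    
    A minimal Python sketch of the first rule, for both sides. The
    function names are made up, the PDU lengths are passed in directly
    (a real receiver would take them from the header), and this sketch
    pads after every PDU, including the last one.

```python
PDU_ALIGN = 8

def pad_len(pdu_len):
    """Pad bytes needed after a PDU so that the next one starts 8 bytes aligned."""
    return -pdu_len % PDU_ALIGN

def transmit(pdus):
    """Transmitter: concatenate PDUs, padding each one (with anything, here zeros)."""
    stream = bytearray()
    for pdu in pdus:
        stream += pdu
        stream += b"\x00" * pad_len(len(pdu))
    return bytes(stream)

def pdu_offsets(pdu_lengths):
    """Receiver: offset of each PDU, jumping over the pad at the end of each one."""
    offsets, pos = [], 0
    for length in pdu_lengths:
        offsets.append(pos)
        pos += length + pad_len(length)
    return offsets

# A 48-byte PDU, a 53-byte PDU (hypothetical odd size), then 48 + 512 bytes.
offsets = pdu_offsets([48, 53, 560])
print(offsets)
```

    This prints [0, 48, 104]: the 53-byte PDU costs 3 pad bytes, and
    every PDU that follows starts on a multiple of 8 bytes again.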
    
    
    This modification is cheap and saves a lot of cycles in the
    "software implementations".
    
    
    
    Regards,
    
    Pierre
    
    

