
    RE: ISCSI: Urgent Flag requirement violates TCP.



    Dave Black wrote:
    > It is not clear to me that the Urgent feature is "required for
    > interoperation or to limit behavior which has potential for
    > causing harm". I'm prepared to be convinced otherwise, and would
    > like to hear from implementers other than Matt on this subject,
    > and specifically comments on his statement that:
    >    "... high speed implementations will require framing in order
    > 	to prevent a massive amount of buffer resources to 'buffer up' TCP
    > 	segments that arrive after a dropped TCP segment."
    
    My apology for taking so long to respond to this request.
    
    I support Matt wholeheartedly.  A TCP-Offload-Engine (TOE), a
    hardware-assisted TCP implementation with zero-copy function, is essential
    for Gigabit-plus Ethernet, Fibre Channel, and InfiniBand adapters supporting
    iSCSI over TCP.  Using the urgent bit to identify the beginning of an iSCSI
    message enables the TOE adapter to parse an incoming TCP/IP segment quickly
    and deal with out-of-order and duplicated frames efficiently.  Most
    arguments against Matt's position were based on existing software TCP
    implementations.  While supporting TCP 100%, the TOE adapter does require
    some changes to the TCP implementation at installation.  The changes are
    necessary to enable the zero-copy function.  However, a TOE adapter with its
    hardware and software will inter-operate with any existing TCP
    implementation on any client or server by following the TCP spec.
    
    A TOE is a multi-function adapter that supports both TCP/IP and iSCSI.  The
    NFS implementation with UDP or TCP over IP can be supported by a
    scatter/gather DMA list which splits the TCP/IP header from the data payload
    such that the latter is copied directly from and to the NFS cache buffers.
    This is the essence of the zero-copy function. To deal with out-of-order and
    duplicated frames, the TOE adapter works with one IP packet at a time with a
    score card that tracks all incoming segments.  The maximum IP packet size is
    65K.  Some implementations prefer "jumbo" packets or frames.  Inside each IP
    packet, the TOE adapter finds the UDP/TCP headers.
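
    As a rough illustration of the score card idea above, here is a minimal
    Python sketch.  The class name ScoreCard and its methods are hypothetical,
    invented for this example; a real TOE adapter would do this in hardware,
    and nothing here describes any particular product.  The card records which
    byte ranges of one packet have arrived, so that duplicates contribute
    nothing and out-of-order arrivals are merged as the gaps fill in.

```python
class ScoreCard:
    """Hypothetical score card: tracks which byte ranges have arrived."""

    def __init__(self, total_len):
        self.total_len = total_len
        self.ranges = []  # sorted, non-overlapping (start, end) intervals

    def received(self):
        """Number of distinct bytes recorded so far."""
        return sum(end - start for start, end in self.ranges)

    def add(self, start, length):
        """Record one segment; return how many new bytes it contributed."""
        end = min(start + length, self.total_len)
        if start >= end:
            return 0
        before = self.received()
        merged = []
        new_start, new_end = start, end
        for s, e in self.ranges:
            if e < new_start or s > new_end:
                merged.append((s, e))          # disjoint: keep as is
            else:
                new_start = min(s, new_start)  # overlapping or adjacent:
                new_end = max(e, new_end)      # fold into the new interval
        merged.append((new_start, new_end))
        merged.sort()
        self.ranges = merged
        return self.received() - before

    def complete(self):
        """True once every byte from 0 to total_len has arrived."""
        return self.ranges == [(0, self.total_len)]
```

    A segment that arrives twice returns 0 from add(), which is how a
    duplicate would be recognized and dropped in this sketch.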
    
    For iSCSI support, the TOE adapter receives its SCSI requests directly from
    the iSCSI driver instead of from a TCP/IP driver.  To avoid duplicating the
    TCP run-time parameters, the "OPTIONAL" second connection will be used for
    SCSI commands while another TCP connection will be used for login, logout,
    and other task administrative matters.  Needless to say, there are many
    concurrent sessions to many different targets or initiators.  Hundreds or
    thousands of concurrent SCSI commands to different TCP endpoints are
    implemented by an exchange table, as I have discussed in earlier postings.
    The adapter must move the incoming or outgoing TCP segments at the speed of
    the media and generate ACKs (or SACKs) quickly.  As a target, the TOE
    adapter passes incoming SCSI commands directly to the waiting application
    software, such as RAID, tape, or JBOD storage devices.  For outgoing frames
    or packets, the TOE adapter creates TCP segments with IP headers.  It may
    bundle several iSCSI PDUs into one TCP/IP segment destined for the same
    target.  For incoming frames, the TOE adapter must know the iSCSI message
    boundary.  This is why the urgent bit is extremely useful.  Without it, the
    TOE adapter must buffer the whole IP packet before it can process an iSCSI
    header.  While a TOE adapter can deal with a 65K IP packet with ease, the
    "jumbo" frame places a large SRAM demand on the adapter.  Several incoming
    packets from different sources only add to the SRAM requirements.  For
    iSCSI, jumbo frames are very useful for clients or servers thousands of
    miles away.  The TOE adapter must deal with hundreds of jumbo frames in
    flight.  We are dealing with many gigantic TCP windows from many
    connections.
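
    To make the boundary argument concrete, here is a small Python sketch of
    how a receiver might locate a marked message boundary from the TCP header
    alone.  The function name and the marking convention are assumptions for
    illustration only: the sketch assumes the sender sets URG and places in the
    urgent pointer the payload offset of the first byte of an iSCSI header.
    Implementations have also differed by one byte on what the urgent pointer
    designates, which this sketch ignores.

```python
import struct

URG_FLAG = 0x20  # URG bit in the TCP flags field


def pdu_boundary_offset(segment):
    """Return the payload offset of the marked boundary, or None.

    ASSUMPTION (hypothetical convention): the urgent pointer holds the
    offset, within this segment's payload, of the first byte of an
    iSCSI header.
    """
    if len(segment) < 20:
        return None  # too short to hold a TCP header
    (_src, _dst, _seq, _ack, off_flags,
     _window, _checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (off_flags >> 12) * 4  # data offset counts 32-bit words
    if header_len < 20 or header_len > len(segment):
        return None  # malformed header
    if not off_flags & URG_FLAG:
        return None  # no boundary marked in this segment
    payload_len = len(segment) - header_len
    if urg_ptr >= payload_len:
        return None  # pointer refers past this segment
    return urg_ptr
```

    The point of the sketch is that only the fixed 20-byte TCP header is
    examined; the adapter does not need the preceding segments to find where
    the next iSCSI header starts.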
    
    Many objections to the word "MUST" were based on out-of-order delivery of
    SCSI commands.  For outgoing SCSI commands, the TOE adapter will deliver
    them in the same order received from its iSCSI driver.  There is no problem
    here.  For incoming SCSI commands, while honoring the TCP sequence numbers,
    a TOE adapter operates in the same manner as current SCSI, 1394, and Fibre
    Channel adapters, meaning there is no guarantee of in-order command
    execution.
    
    To illustrate this, let's use an example of command A being followed
    closely by command B.  For a SCSI adapter, if a target gets command A with
    a bus parity error, it would return check status for command A and proceed
    to accept B happily, even if there is a dependency between commands A and B.
    For an initiator device, the check status on A never blocks the delivery of
    B.  This out-of-order delivery is OK because if B depended on A, all file
    system software would hold command B until the completion of command A.
    For a 1394 adapter, commands A and B are stored in two ORBs.  A target 1394
    device will fetch the ORBs.  Again, after encountering an error in fetching
    the ORB for A, a target device will proceed to fetch the ORB for B.  For a
    Fibre Channel adapter, an initiator will send commands A and B in two
    separate FCP_CMD frames.  If frame A arrives with a bad CRC, a target
    device simply throws the frame away and proceeds with execution of command
    B, if it arrived with a good CRC.
    
    Therefore, as Matt has stated, it is OK to deliver command B even if the
    command A segment is still missing.  The urgent bit allows us to do that.
    One objection to out-of-order delivery uses the aborting of a non-existent
    command as an example.  But abort is never deterministic.  The command
    being aborted may either not exist or already be completed.  This
    situation is familiar to all adapters today.
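
    The delivery of B ahead of A can be sketched in Python as follows.
    Everything here is a toy under stated assumptions: the function name is
    hypothetical, the message start offsets are assumed to be already known
    (for example from urgent-pointer markers), and each message uses a made-up
    framing of a 2-byte big-endian length followed by the body, which is not
    the real iSCSI header.  A message is handed up as soon as all of its own
    bytes are present, regardless of holes earlier in the stream.

```python
def deliverable_pdus(stream_len, segments, boundaries):
    """Return [(start, pdu_bytes)] for every fully received message.

    segments:   list of (stream_offset, data) in any arrival order
    boundaries: known message start offsets (ASSUMED to be known)
    Framing is a TOY format: 2-byte big-endian body length, then body.
    """
    buf = bytearray(stream_len)
    have = [False] * stream_len
    for off, data in segments:
        data = data[:max(0, stream_len - off)]  # clip to the stream
        buf[off:off + len(data)] = data
        for i in range(off, off + len(data)):
            have[i] = True

    out = []
    for start in sorted(boundaries):
        if start + 2 > stream_len or not all(have[start:start + 2]):
            continue  # length field itself has not arrived
        total = 2 + int.from_bytes(buf[start:start + 2], "big")
        if start + total <= stream_len and all(have[start:start + total]):
            out.append((start, bytes(buf[start:start + total])))
    return out
```

    With A at offset 0 still missing and B at offset 5 fully received, only B
    is returned; once A's segment arrives, both are returned.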
    
    I do believe the iSCSI WG should facilitate the implementation of TOE
    adapters as well as accommodate traditional TCP implementations.
    However, in either case, there should be no TCP changes.  (Personally, I
    would like to see some changes.  But I learned my lesson earlier on
    this. :-))
    
    Y.P. Cheng, CTO, ConnectCom Solutions Corp.
    
    



Last updated: Tue Sep 04 01:06:26 2001