    RE: iSCSI: CmdSN and Retry



    Julo wrote:
    > Except for sending the status - an executing command holds up the LU queue
    > and makes the "local" recovery simpler than clearing the LU queue and
    > resending the commands.
    Santosh wrote:
    > This is correct, in the case where the target does NOT implement
    > data/status recovery. i.e. Assume ordering is required, 2 commands, say,
    > 1 & 2, executed in order at the target. Now, if 1 encountered a digest
    > error or format error at the initiator, and was re-sent with the "retry"
    > bit AND the target were to NOT implement data/status recovery, it would
    > result in target executing 1, 2, 1. This may be a problem and cannot be
    > addressed, unless iSCSI mandates data/status recovery.
    
    I certainly understand the need for data/status recovery and the argument
    that "local" recovery is simpler.  However, it comes at a heavy cost in
    performance when a pipelined design demands 100,000 IOs per second on a
    network with long delay.  For services whose requests and responses occur a
    few times per second, it is fine to hold on to resources until we are
    certain the ACK has been returned.  However, in the example above, if the
    target cannot start command 2 after completing command 1 until the status
    for 1 is ACK'ed, the wait can be 100 milliseconds on a long-delay network.
    That wait makes it impossible to keep a large number of IOs in the
    pipeline.  By mandating data/status recovery in iSCSI, we turn pipelined
    command execution into interlocked handshakes.  As I said in my previous
    email, an initiator will never send a command that depends on the success
    of a previous command.  This fact is what makes pipelined execution in a
    target possible.
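
    To put rough numbers on the argument above, here is a minimal sketch in
    Python.  It assumes a single strictly ordered command stream in which the
    target waits one full 100 ms round trip for each status ACK; the function
    names are illustrative only and do not come from the iSCSI draft.

```python
def interlocked_iops(ack_wait_s: float) -> float:
    """Upper bound on IOs/s if each command must wait for the previous
    command's status ACK (assumption: one ordered stream, no overlap)."""
    return 1.0 / ack_wait_s


def required_outstanding(target_iops: float, latency_s: float) -> float:
    """Little's law: commands in flight = throughput * latency."""
    return target_iops * latency_s


if __name__ == "__main__":
    wait = 0.100  # the 100 ms wait quoted in the text
    print("interlocked ceiling (IOs/s):", interlocked_iops(wait))
    print("commands in flight needed for 100,000 IOs/s:",
          required_outstanding(100_000, wait))
```

    Under these assumptions the interlocked stream tops out at about 10 IOs
    per second, while sustaining 100,000 IOs per second over the same delay
    needs on the order of 10,000 commands in flight.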
    
    On a separate note, I really respect Santosh's fine-toothed analysis of the
    iSCSI draft.  But his arguments largely ignore the fact that SCSI has been
    functional for the last 20 years.  The CmdSN, DataSN, and StatSN allow
    iSCSI to detect missing PDUs and to quickly ask for a retransmit.  They
    should not be used to enforce sequentiality and slow things down.  SCSI
    already has ordered-execution semantics, which need the help of CmdSN when
    multiple TCP connections are used.  However, using StatSN to mandate
    data/status retry carries a great performance price.  Both overlapped and
    out-of-order data transfers are allowed in SCSI (check out the Modify Data
    Pointer extended message).  SCSI works fine without mandating
    non-overlapping transfers or data/status recovery.  Retry can be done in a
    simple and clean manner without introducing complicated semantics for
    CmdSN, DataSN, and StatSN.  Note that if we must retry more than once in a
    million IOs, something is wrong with the infrastructure.  Therefore, let
    the pipeline flow quickly and don't optimize the retry.
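
    As an illustration of using a sequence number only to detect missing PDUs,
    the sketch below reports which numbers were skipped when a PDU arrives
    ahead of the expected one.  It assumes 32-bit counters that wrap around;
    the helper name and its interface are hypothetical, not taken from the
    draft.

```python
from typing import List

MOD = 1 << 32        # assumption: 32-bit sequence numbers that wrap
HALF = 1 << 31       # differences of half the space or more are treated as "behind"


def missing_sns(expected: int, received: int) -> List[int]:
    """Return the sequence numbers skipped between `expected` and `received`.

    An empty list means the PDU is the expected one, or an old/duplicate one.
    """
    gap = (received - expected) % MOD
    if gap >= HALF:
        return []  # received is behind expected: duplicate or stale PDU
    return [(expected + i) % MOD for i in range(gap)]


if __name__ == "__main__":
    # Expecting 5 but 8 arrived: 5, 6 and 7 should be requested again.
    print(missing_sns(5, 8))
```

    Nothing here stalls the pipeline: the receiver keeps accepting PDUs and
    only asks for the listed numbers to be retransmitted.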
    
    As long as we separate the TCP, iSCSI, and SCSI ULP layers cleanly -- for
    which this WG has done a good job -- SCSI will continue to work.  Rather
    than use more list bandwidth on this subject, I am willing to discuss the
    SCSI retry implementations with anyone offline.
    
    


Last updated: Tue Sep 04 01:05:36 2001