
    RE: iSCSI: CmdSN and Retry



    Y.P.
    
    Any TCP bandwidth is limited to roughly .93 MSS/(RTT * sqrt(loss_rate)), so
    your assumption of one in a million at 100 milli-seconds limits throughput
    to about Fast Ethernet rates.  There will be significant overhead on long
    pipes dealing with TCP
    handshakes and any additional handshake will not add significantly to this
    resource overhead.  The more immediate these handshakes, the less resources
    consumed with respect to any needed replays to satisfy the transport.  I do
    not agree that procrastination, which depends on rediscovery of indexes
    unrelated to these handshakes, is more effective in allowing pipe-line
    execution.
    Unless the transport maintains order, there will be a requirement to wait
    for each command to complete to assure the sequence.  This makes compliance
    to SCSI impossible and not a means of improving the pipe-line.  An immediate
    handshake within the transport layer dealing with digest errors is the best
    means of improving performance.  Reliance on the SCSI tag to recover from a
    failure deduced from a dropped transport sequence is not a clean separation
    of layers.  The amount of resources saved by deleting this positive
    relationship is dwarfed by the window size required for such a fat pipe.
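
    As a rough check of the bound quoted at the top of this message, here is a
    minimal sketch of the arithmetic.  The 1460-byte MSS is an assumption (a
    typical Ethernet value, not stated in this thread), and the function name
    is made up for illustration.

```python
import math

def tcp_bandwidth_limit_bps(mss_bytes, rtt_s, loss_rate, c=0.93):
    """Approximate TCP throughput ceiling: c * MSS / (RTT * sqrt(loss_rate)).

    Returns bits per second. c=0.93 is the constant quoted in the message.
    """
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

# Assumed MSS of 1460 bytes; RTT and loss rate are taken from the thread.
limit = tcp_bandwidth_limit_bps(mss_bytes=1460, rtt_s=0.100, loss_rate=1e-6)
print(f"{limit / 1e6:.1f} Mbit/s")
```

    With these inputs the ceiling comes out slightly above 100 Mbit/s, which is
    the "Fast Ethernet rates" figure.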
    
    Doug
    
    Should each IO represent
    
    > Julo wrote:
    > > Except for sending the status - an executing command holds up the LU
    > > queue and makes the "local" recovery simpler than clearing the LU queue
    > > and resending the commands.
    > Santosh wrote:
    > > This is correct, in the case where the target does NOT implement
    > > data/status recovery, i.e. assume ordering is required and 2 commands,
    > > say 1 & 2, are executed in order at the target. Now, if 1 encountered a
    > > digest error or format error at the initiator, and was re-sent with the
    > > "retry" bit AND the target were to NOT implement data/status recovery,
    > > it would result in the target executing 1, 2, 1. This may be a problem
    > > and cannot be addressed unless iSCSI mandates data/status recovery.
    >
    > I certainly understand the need of doing data/status recovery and the
    > argument of "local" recovery being simpler.  However, this comes with a
    > heavy cost in performance when a pipelined design demands 100,000 IOs per
    > second on a network with long delay.  For services and responses
    > happening a few times per second, it is OK to hold on to the resources
    > until we are certain the ACK is returned.  However, in the example above,
    > after completing command 1, if a target can't start command 2 until the
    > status for 1 is ACK'ed, the wait can be 100 milliseconds on a network
    > with long delay.  The wait makes it impossible to have a large number of
    > IOs in the pipeline.  By mandating data/status recovery in iSCSI, we
    > change the pipelined command execution to interlocked handshakes.  As I
    > have said in the previous email, an initiator will never send a command
    > which depends on the success of a previous command.  This fact makes
    > pipelined execution in a target possible.
    >
    > On a separate note, I really respect Santosh's fine-tooth analysis of the
    > iSCSI draft.  But his arguments badly ignore the fact that SCSI has been
    > functional for the last 20 years.  The CmdSN, DataSN, and StatSN allow
    > iSCSI to detect missing PDUs and to quickly ask for a retransmit.  They
    > should not be used to enforce sequentiality to slow things down.  SCSI
    > already has the semantics of ordered execution, which requires the help
    > of CmdSN when multiple TCP connections are used.  However, using StatSN
    > to mandate data/status retry pays a great performance price.  Both
    > overlapped and out-of-order data transfers are allowed in SCSI (check out
    > the Modify Data Pointer extended message).  SCSI works fine without
    > mandating non-overlapping transfers or data/status recovery.  Retry can
    > be done in a simple and clean manner without introducing complicated
    > semantics for CmdSN, DataSN, and StatSN.  Note, if we must retry more
    > than once in a million IOs, something is wrong with the infrastructure.
    > Therefore, let the pipeline flow quickly and don't optimize the retry.
    >
    > As long as we separate the TCP, iSCSI, and SCSI ULP layers cleanly -- for
    > which this WG has done a good job -- SCSI will continue to work.  Without
    > wasting more bandwidth on this subject, I will be willing to discuss the
    > SCSI retry implementations with anyone offline.
    >
    >
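
    The pipelining numbers in the quoted message (100,000 IOs per second, a
    100 millisecond wait) can be worked through as below.  This is only a
    sketch: it ignores command service time, treats the 100 ms figure as the
    full wait per interlocked command on a single ordered queue, and the
    function names are hypothetical.

```python
def ios_in_flight(target_iops, delay_s):
    """IOs that must be outstanding to sustain target_iops across delay_s."""
    return target_iops * delay_s

def interlocked_iops(delay_s):
    """Upper bound on IOs per second if each command waits delay_s for an ACK."""
    return 1.0 / delay_s

in_flight = ios_in_flight(target_iops=100_000, delay_s=0.100)
serial_rate = interlocked_iops(delay_s=0.100)
print(f"pipelined: about {round(in_flight)} IOs in flight")
print(f"interlocked: at most about {round(serial_rate)} IOs per second")
```

    Under those assumptions a pipelined design needs on the order of ten
    thousand IOs outstanding, while a strictly interlocked queue is held to
    about ten IOs per second.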
    
    


Last updated: Tue Sep 04 01:05:36 2001
6315 messages in chronological order