
    Re: iSCSI: DAP Retry comments



    Dave,
    
    I completely agree with you that a well-engineered network goes a long 
    way in mitigating the error recovery needs, but let me add a couple of comments.
    
    In your original comment, you said: "the semantics of Retry remain broken 
    rendering it useless for tape operation".  Your latest note doesn't address
    why you think it's so.  It sounds like you are concerned about a possible lack of
    enough time for completing a successful retry at the iSCSI level, but I fail to
    see how that's "broken semantics" for the iSCSI protocol definition.  IMHO,
    that's an implementation issue (FYI, HP-UX drivers have been successfully
    dealing with an identical issue in the FC land for the last several years).
    
    You are also assuming that a header digest error is the only event that causes
    command loss; I'd like to add that an immediate data digest error is the other.
    
    BTW, it's not true that allegiance reassignment is useful only for multi-connection
    sessions.  As I had earlier commented on the list, the ability to continue tasks across 
    TCP connection failures is just as useful for single-connection sessions.
    
    Thanks.
    --
    Mallikarjun
    
    Mallikarjun Chadalapaka
    Networked Storage Architecture
    Network Storage Solutions
    Hewlett-Packard MS 5668 
    Roseville CA 95747
    cbm@rose.hp.com
    
    
    ----- Original Message ----- 
    From: "Dave Peterson" <dap@cisco.com>
    To: <Black_David@emc.com>; <ips@ece.cmu.edu>
    Sent: Tuesday, July 09, 2002 9:22 AM
    Subject: RE: iSCSI: DAP Retry comments
    
    
    > Regarding Retry, it's not about the command executing twice.
    > Below is a rehash from previous emails of the issues with Retry:
    > *****
    > Example scenario:
    > 1. tape locate command is issued with a 10 second timer
    > 2. tape command is dropped at the target due to a header digest error
    > 3. having seen no response for the command after an iSCSI initiator
    > determined timeout value, the initiator decides to retry the command
    > 
    > This example leaves a 2 second window for the response to the second command
    > to arrive before a ULP abort is sent.
    > 
    > A mechanism to determine whether or not the command arrived at the target
    > would be beneficial for the retry functionality to be useful.
    > The mechanism should be initiated early enough in the ULP timeout window to
    > allow the iSCSI retried command the opportunity to complete.
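
The timing argument in the scenario above can be sketched as simple arithmetic. This is only an illustration: the function names are hypothetical, and the 8 second iSCSI-level timeout is inferred from the 10 second timer and the 2 second window, not stated in the message.

```python
def retry_window(ulp_timeout_s, iscsi_retry_after_s):
    """Seconds left before the ULP abort once the iSCSI-level retry is sent."""
    return max(0.0, ulp_timeout_s - iscsi_retry_after_s)


def retry_is_worthwhile(ulp_timeout_s, iscsi_retry_after_s, expected_completion_s):
    """True if the retried command could plausibly finish before the ULP abort.

    expected_completion_s is a device-specific estimate (an assumption here),
    which is exactly the device-type dependence the message worries about.
    """
    return retry_window(ulp_timeout_s, iscsi_retry_after_s) >= expected_completion_s


# Scenario from the message: 10 s ULP timer; retry assumed to go out after 8 s.
print(retry_window(10, 8))            # seconds remaining for the retried command
print(retry_is_worthwhile(10, 8, 5))  # a locate needing about 5 s would not fit
```

The earlier the loss is detected, the larger the window, which is the motivation for a loss-detection mechanism that fires early in the ULP timeout.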
    > 
    > Some options:
    > 
    > A. If the initiator issues a NOP-Out(immed=1), the target can send back the
    > expected CmdSN.
    > 
    > B. The target, upon a header digest error, sends back a reject or NOP-In
    > with the expected CmdSN. This would provide an indication to the initiator
    > that something happened and trigger the command retry, still hopefully
    > within the ULP timeout to allow for successful command completion.
    > 
    > C. Text stating that an iSCSI initiator should (only) perform a command
    > retry when sufficient time remains for the command to complete. But, this
    > leads one down the path of device type specific behavior.
    > 
    > D. Remove the command retry functionality. I have yet to see it actually
    > being used. The use of command retry is really only applicable when header
    > digests are being used. Given TCP, the probability of header digest usage
    > and occurrence, and existing ULP tools for error detection and recovery,
    > my preference is to remove it.
    > ****
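
Options A and B both rely on the initiator learning the target's expected CmdSN and comparing it with the CmdSN of the outstanding command. A minimal sketch of that comparison follows, assuming CmdSN values are compared with 32-bit serial number arithmetic in the style of RFC 1982; the function names are hypothetical and not taken from the draft.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # sequence numbers wrap at 2**32
HALF = 1 << (SERIAL_BITS - 1)   # half the number space


def serial_lt(a, b):
    """True if a precedes b under serial number arithmetic (wrap-aware)."""
    return a != b and ((b - a) % MOD) < HALF


def command_reached_target(cmd_sn, exp_cmd_sn):
    """The target has accepted cmd_sn once its ExpCmdSN has moved past it."""
    return serial_lt(cmd_sn, exp_cmd_sn)


def should_retry(cmd_sn, exp_cmd_sn):
    """Retry only if the target is still waiting for this CmdSN or an earlier one."""
    return not command_reached_target(cmd_sn, exp_cmd_sn)


# Target reports ExpCmdSN = 7 while command 7 is outstanding: it never arrived.
print(should_retry(7, 7))
# Target reports ExpCmdSN = 8: command 7 was received, so no retry is needed.
print(should_retry(7, 8))
```

In this sketch, the NOP-Out/NOP-In exchange of option A or the reject of option B is just the means of obtaining `exp_cmd_sn` early enough in the ULP timeout window.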
    > 
    > My point is that the following need to occur for the Retry functionality to work:
    > 1. header digest must be enabled
    > 2. a header digest error must be detected
    > 3. the retried command must be issued early enough in the ULP timeout window
    > to allow completion
    > 
    > I don't see the Retry functionality being useful for disk devices, and it
    > is questionable for tape.
    > Using a well-engineered (TCP/IP) network along with hopefully available SCSI
    > level tools is a better solution.
    > 
    > But I don't want to hold up the spec over this matter either, we just won't
    > use it.
    > 
    > Regarding SNACK, I don't really have a problem with it other than again a
    > well-engineered network should mitigate the need.
    > Regarding connection allegiance reassignment, the functionality is a bit
    > more useful (if one supports more than 1 connection per session), provided
    > the reassignment completes in time to allow the command to complete within
    > the ULP timeout.
    > 
    > Regarding legacy tape devices, I expect them to be front ended by a gateway.
    > These devices will most likely not support any type of transport level error
    > detection and recovery, making them incompatible/problem children with respect
    > to the various iSCSI error recovery mechanisms.
    > 
    > Bottom line is engineer your network well and leave the error detection and
    > recovery to the SCSI level and above...dap
    > 
    > > -----Original Message-----
    > > From: Black_David@emc.com [mailto:Black_David@emc.com]
    > > Sent: Tuesday, July 09, 2002 1:51 AM
    > > To: dap@cisco.com; ips@ece.cmu.edu
    > > Subject: iSCSI: DAP Retry comments
    > >
    > >
    > > > T p 103 6.1.1 Usage of Retry and 6.7 SCSI Timeouts: the semantics of Retry
    > > > remain broken rendering it useless for tape operation. SCSI level error
    > > > detection and recovery is the preferred mechanism. Refer to previous emails
    > > > sent via the IPS reflector regarding this matter.
    > >
    > > Can you provide more information?  Command retry *never* results in
    > > the command executing twice - both the original command and the retry
    > > have the same CmdSN, so the second one is dropped as a duplicate if
    > > the first one was received correctly.  6.1.1 is very clear that retry
    > > MUST NOT be used if the command was received successfully (acknowledged
    > > by ExpCmdSN), and if it is used, the retried command PDU is silently
    > > dropped.
    > > iSCSI's ordered delivery requirement avoids the situation in which a
    > > dropped command causes subsequent commands to mis-execute - if none
    > > of the commands are marked for immediate delivery, iSCSI will stop
    > > at the "hole" created by the dropped command, and wait for the retry
    > > to plug the hole.
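
The duplicate-drop and "hole" behaviour described above can be illustrated with a toy model of the target side. This is a sketch under stated assumptions, not the draft's algorithm: the class and method names are hypothetical, only non-immediate commands are modelled, and CmdSN wraparound is ignored for brevity.

```python
class TargetCmdQueue:
    """Toy model of in-order delivery of non-immediate commands at a target."""

    def __init__(self, exp_cmd_sn=1):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN the target will deliver
        self.held = set()             # received, but waiting behind a hole
        self.delivered = []           # CmdSNs handed to the SCSI layer, in order

    def receive(self, cmd_sn):
        # A CmdSN already delivered or already held is a duplicate: drop it.
        if cmd_sn < self.exp_cmd_sn or cmd_sn in self.held:
            return "dropped duplicate"
        self.held.add(cmd_sn)
        # Deliver every command that is now contiguous with ExpCmdSN.
        while self.exp_cmd_sn in self.held:
            self.held.remove(self.exp_cmd_sn)
            self.delivered.append(self.exp_cmd_sn)
            self.exp_cmd_sn += 1
        return "delivered" if cmd_sn < self.exp_cmd_sn else "held"


q = TargetCmdQueue(exp_cmd_sn=1)
print(q.receive(1))   # arrives in order
print(q.receive(3))   # command 2 was lost, so 3 waits at the hole
print(q.receive(2))   # the retry plugs the hole; 2 and then 3 are delivered
print(q.receive(2))   # a second copy of 2 is a duplicate
print(q.delivered)
```

Because a retry reuses the original CmdSN, it either fills the hole or is discarded, so in this model a command can never be handed to the SCSI layer twice.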
    > >
    > > > T p 128 8.6 Considerations for State-dependent devices: last paragraph:
    > > > don't agree with the statement that error recovery at the iSCSI level
    > > > (specifically Retry in its current state) is advisable. Retry at the SCSI
    > > > level is feasible and is not difficult (i.e., READ POSITION and LOCATE
    > > > commands). This paragraph should be removed.
    > >
    > > Two questions:
    > > - What about the SNACK and allegiance change mechanisms?
    > > - What about the "legacy" tape devices (e.g., as discussed in London)
    > > that presumably don't implement those commands?  I believe this
    > > text was originally intended to address this class of devices.
    > >
    > > Thanks,
    > > --David
    > > ---------------------------------------------------
    > > David L. Black, Senior Technologist
    > > EMC Corporation, 42 South St., Hopkinton, MA  01748
    > > +1 (508) 249-6449            FAX: +1 (508) 497-8018
    > > black_david@emc.com       Mobile: +1 (978) 394-7754
    > > ---------------------------------------------------
    > 
    > 
    
    

