Re: iSCSI: DAP Retry comments

To: "Dave Peterson" <dap@cisco.com>, <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: Re: iSCSI: DAP Retry comments
From: "Mallikarjun C." <cbm@rose.hp.com>
Date: Tue, 9 Jul 2002 17:07:43 -0700
References: <EDEKKDKNBFCABNBAAOBBAEKPELAA.dap@cisco.com>
Sender: owner-ips@ece.cmu.edu

Dave,

I completely agree with you that a well-engineered network goes a long 
way in mitigating the error recovery needs, but let me add a couple of comments.

In your original comment, you said: "the semantics of Retry remain broken 
rendering it useless for tape operation".  Your latest note doesn't address
why you think it's so.  It sounds like you are concerned about a possible lack of
enough time for completing a successful retry at the iSCSI level, but I fail to
see how that's "broken semantics" for the iSCSI protocol definition.  IMHO,
that's an implementation issue (FYI, HP-UX drivers have been successfully
dealing with an identical issue in the FC land for the last several years).
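
A minimal sketch of the timing question, for illustration only.  The function
names are hypothetical, and the 8-second iSCSI-level timeout is an assumption
inferred from the quoted example below (a 10-second ULP timer that leaves a
2-second window); neither comes from the draft.

```python
def retry_window(ulp_timeout_s, elapsed_s):
    """Seconds left before the ULP (e.g. the SCSI layer) aborts the command."""
    return max(0, ulp_timeout_s - elapsed_s)


def retry_is_worthwhile(ulp_timeout_s, elapsed_s, expected_completion_s):
    """True if a retried command could still complete inside the ULP timeout.

    expected_completion_s is an implementation-chosen estimate (hypothetical);
    the protocol itself does not define it.
    """
    return retry_window(ulp_timeout_s, elapsed_s) >= expected_completion_s


# Assumed numbers: 10 s ULP timer, iSCSI-level timeout fires after 8 s.
window = retry_window(10, 8)
fits_fast_command = retry_is_worthwhile(10, 8, expected_completion_s=1)
fits_slow_locate = retry_is_worthwhile(10, 8, expected_completion_s=5)
```

Under these assumed numbers the decision depends entirely on the
implementation's choice of iSCSI-level timeout relative to the ULP timer,
which is why it reads as an implementation matter rather than a protocol one.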

You are also assuming that a header digest error is the only event that causes
command loss; I'd like to add that an immediate data digest error is the other.

BTW, it's not true that allegiance reassignment is useful only for multi-connection
sessions.  As I had earlier commented on the list, the ability to continue tasks across 
TCP connection failures is just as useful for single-connection sessions.

Thanks.
--
Mallikarjun

Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions
Hewlett-Packard MS 5668 
Roseville CA 95747
cbm@rose.hp.com


----- Original Message ----- 
From: "Dave Peterson" <dap@cisco.com>
To: <Black_David@emc.com>; <ips@ece.cmu.edu>
Sent: Tuesday, July 09, 2002 9:22 AM
Subject: RE: iSCSI: DAP Retry comments


> Regarding Retry, it's not about the command executing twice.
> Below is a rehash from previous emails of the issues with Retry:
> *****
> Example scenario:
> 1. tape locate command is issued with a 10 second timer
> 2. tape command is dropped at the target due to a header digest error
> 3. having seen no response for the command after an iSCSI initiator
> determined timeout value, the initiator decides to retry the command
> 
> This example leaves a 2 second window for the response to the second command
> to arrive before a ULP abort is sent.
> 
> A mechanism to determine whether or not the command arrived at the target
> would be beneficial for the retry functionality to be useful.
> The mechanism should be initiated early enough in the ULP timeout window to
> allow the iSCSI retried command the opportunity to complete.
> 
> Some options:
> 
> A. If the initiator issues a NOP-Out(immed=1), the target can send back the
> expected CmdSN.
> 
> B. The target, upon a header digest error, sends back a reject or NOP-In
> with the expected CmdSN. This would provide an indication to the initiator that
> something happened and trigger the command retry, still hopefully within the
> ULP timeout to allow for successful command completion.
> 
> C. Text stating that an iSCSI initiator should (only) perform a command
> retry when sufficient time remains for the command to complete. But this
> leads one down the path of device-type-specific behavior.
> 
> D. Remove the command retry functionality. I have yet to see it actually
> being used. The use of command retry is really only applicable when header
> digests are being used. Given TCP, probability of header digest usage and
> occurrence, and existing ULP tools for error detection and recovery, my
> preference is to remove it.
> ****
> 
> My point is that the following need to occur for the Retry functionality to work:
> 1. header digest must be enabled
> 2. a header digest error must be detected
> 3. the retried command must be issued early enough in the ULP timeout window
> to allow completion
> 
> I don't see the Retry functionality being useful for disk devices, and it is
> questionable for tape.
> Using a well-engineered (TCP/IP) network along with hopefully available SCSI
> level tools is a better solution.
> 
> But I don't want to hold up the spec over this matter either; we just won't
> use it.
> 
> Regarding SNACK, I don't really have a problem with it other than again a
> well-engineered network should mitigate the need.
> Regarding connection allegiance reassignment, the functionality is a bit
> more useful (if one supports more than 1 connection per session), provided
> the reassignment completes in time to allow the command to complete within
> the ULP timeout.
> 
> Regarding legacy tape devices, I expect them to be front-ended by a gateway.
> These devices will most likely not support any type of transport level error
> detection and recovery, making them incompatible/problem children with respect
> to the various iSCSI error recovery mechanisms.
> 
> Bottom line is engineer your network well and leave the error detection and
> recovery to the SCSI level and above...dap
> 
> > -----Original Message-----
> > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > Sent: Tuesday, July 09, 2002 1:51 AM
> > To: dap@cisco.com; ips@ece.cmu.edu
> > Subject: iSCSI: DAP Retry comments
> >
> >
> > > T p 103 6.1.1 Usage of Retry and 6.7 SCSI Timeouts: the semantics of
> > > Retry remain broken rendering it useless for tape operation. SCSI level
> > > error detection and recovery is the preferred mechanism. Refer to previous
> > > emails sent via the IPS reflector regarding this matter.
> >
> > Can you provide more information?  Command retry *never* results in
> > the command executing twice - both the original command and the retry
> > have the same CmdSN, so the second one is dropped as a duplicate if
> > the first one was received correctly.  6.1.1 is very clear that retry
> > MUST NOT be used if the command was received successfully (acknowledged
> > by ExpCmdSN), and if it is used, the retried command PDU is silently
> > dropped.
> > iSCSI's ordered delivery requirement avoids the situation in which a
> > dropped command causes subsequent commands to mis-execute - if none
> > of the commands are marked for immediate delivery, iSCSI will stop
> > at the "hole" created by the dropped command, and wait for the retry
> > to plug the hole.
> >
> > > T p 128 8.6 Considerations for State-dependent devices: last paragraph:
> > > don't agree with the statement that error recovery at the iSCSI level
> > > (specifically Retry in its current state) is advisable. Retry at the SCSI
> > > level is feasible and is not difficult (i.e., READ POSITION and LOCATE
> > > commands). This paragraph should be removed.
> >
> > Two questions:
> > - What about the SNACK and allegiance change mechanisms?
> > - What about the "legacy" tape devices (e.g., as discussed in London)
> > that presumably don't implement those commands?  I believe this
> > text was originally intended to address this class of devices.
> >
> > Thanks,
> > --David
> > ---------------------------------------------------
> > David L. Black, Senior Technologist
> > EMC Corporation, 42 South St., Hopkinton, MA  01748
> > +1 (508) 249-6449            FAX: +1 (508) 497-8018
> > black_david@emc.com       Mobile: +1 (978) 394-7754
> > ---------------------------------------------------
> 
>
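
A toy sketch of the CmdSN behavior described in David Black's quoted note
(a retried command carrying an already-received CmdSN is dropped as a
duplicate, and non-immediate commands wait at a "hole" until the retry plugs
it).  The class and method names are hypothetical, and 32-bit serial-number
wraparound and immediate commands are deliberately ignored; this is not the
draft's algorithm, only an illustration of the idea.

```python
class TargetCmdWindow:
    """Toy model of a target's in-order handling of non-immediate commands."""

    def __init__(self, exp_cmd_sn=1):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN the target will execute
        self.held = set()             # CmdSNs received beyond a hole
        self.executed = []            # CmdSNs in the order they were executed

    def receive(self, cmd_sn):
        # Already executed, or already queued behind a hole: treat as duplicate.
        if cmd_sn < self.exp_cmd_sn or cmd_sn in self.held:
            return "dropped-duplicate"
        self.held.add(cmd_sn)
        # Execute every command that is now contiguous with ExpCmdSN.
        while self.exp_cmd_sn in self.held:
            self.held.remove(self.exp_cmd_sn)
            self.executed.append(self.exp_cmd_sn)
            self.exp_cmd_sn += 1
        return "executed" if cmd_sn < self.exp_cmd_sn else "held"


target = TargetCmdWindow(exp_cmd_sn=1)
results = [
    target.receive(1),  # arrives in order
    target.receive(3),  # CmdSN 2 was lost, so 3 waits at the hole
    target.receive(2),  # the retry plugs the hole; 2 and 3 both run
    target.receive(2),  # a second retry of 2 is dropped
]
```

In this sketch the retried command never executes twice, and the command
behind the hole runs only after the retry arrives.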
