Re: iSCSI: DLB-T.30 (using SNACK w/ PDU size changes)

To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: Re: iSCSI: DLB-T.30 (using SNACK w/ PDU size changes)
From: "Mallikarjun C." <cbm@rose.hp.com>
Date: Wed, 10 Jul 2002 11:41:42 -0700
References: <277DD60FB639D511AC0400B0D068B71E05D42461@CORPMX14>
Sender: owner-ips@ece.cmu.edu

David,

Sorry, I will not be attending the Yokohama meetings, so let me 
attempt to describe my thoughts here.

> This appears to require receiving the status
> PDU, processing it, and retroactively dropping it when discovering
> that not all the data has arrived.  That doesn't strike me as a
> good design (e.g., it prohibits an acknowledge via ExpStatSN before
> the ExpDataSN processing).

Your description of the sequence of events is correct, but it is not 
something my proposal newly introduces.  A SNACK-capable iSCSI layer 
must parse the status (including the ExpDataSN) before deciding to ack 
the status.  You *cannot* issue a data SNACK for a task whose status you 
have already ack'ed!  All my proposal says is: if there's a need to 
issue a data SNACK for this task, drop the status PDU as well if there 
was a MaxRecvDataSegmentLength change during the life of the task.
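
To make the ordering concrete, here is a minimal Python sketch of the
initiator-side decision described above.  The names (ReadTask,
on_status_pdu, the three return strings) are purely illustrative and not
from the draft.  It assumes the initiator tracks received DataSNs per
task and that, absent a size change, the status is simply held un-acked
while the data SNACK completes.

```python
from dataclasses import dataclass, field


@dataclass
class ReadTask:
    """Initiator-side state for one read task (hypothetical structure)."""
    received_data_sns: set = field(default_factory=set)
    mrdsl_changed: bool = False   # MaxRecvDataSegmentLength changed during the task
    status_acked: bool = False    # ExpStatSN already advanced past this status


def on_status_pdu(task, exp_data_sn):
    """Parse the status first, then decide whether it may be ack'ed.

    Returns one of (illustrative labels only):
      "ack"            - all Data-In PDUs present; safe to advance ExpStatSN
      "hold_and_snack" - data missing; keep the status un-acked, issue a data SNACK
      "drop_and_snack" - data missing and the size changed; discard the status,
                         issue a data SNACK, then a follow-on status SNACK
    """
    missing = [sn for sn in range(exp_data_sn)
               if sn not in task.received_data_sns]
    if not missing:
        task.status_acked = True
        return "ack"
    if task.mrdsl_changed:
        return "drop_and_snack"
    return "hold_and_snack"


def may_issue_data_snack(task):
    """A data SNACK is only legal while the status is still un-acked."""
    return not task.status_acked
```

The point of the sketch is only that the ack decision comes after the
ExpDataSN check, never before it.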

[ Julian, I don't see the "SNACK only before status ack" idea stated in 
  the draft.  I'd suggest doing so in 9.16. ]

> There's also at least one weird corner case in here - suppose
> the initiator issues the Data SNACK and then changes
> MaxRecvDataSegmentLength before all the data from the SNACK has
> shown up.  Now the initiator would have to retroactively drop a status
> that it already acknowledged via ExpStatSN. 

This whole description again assumes that the initiator can ack the
status, and then issue a data SNACK at its convenience for the task.
That is incorrect.

A MaxRecvDataSegmentLength change during Data-In PDU arrival can always 
be handled identically, whether the data burst is the result of a prior 
data SNACK or is a "regular" burst in response to a read command.  The 
first status PDU must always be dropped after a 
MaxRecvDataSegmentLength change, if a data SNACK is ever employed for 
the task.
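
As a worked example of why the first status goes stale, here is a small
Python sketch of the DataSN arithmetic.  The sizes (a 64 KB read, 8192
shrinking to 4096, a SNACK from DataSN 3) are made-up numbers, and it
assumes every original Data-In PDU was full-sized and that ExpDataSN in
the status equals the number of Data-In PDUs sent for the command.

```python
def data_in_count(length, mrdsl):
    """Number of Data-In PDUs needed to carry `length` bytes (ceiling division)."""
    return -(-length // mrdsl)


read_len = 64 * 1024                 # hypothetical read size in bytes
old_mrdsl, new_mrdsl = 8192, 4096    # hypothetical MaxRecvDataSegmentLength values

# ExpDataSN carried in the first status PDU (original segmentation).
original_exp_data_sn = data_in_count(read_len, old_mrdsl)

# Data SNACK asks for everything from DataSN 3 onward (RunLength 0).
beg_run = 3
bytes_already_held = beg_run * old_mrdsl

# Resent PDUs use the new size; DataSN restarts at beg_run and rises by 1.
resent_pdus = data_in_count(read_len - bytes_already_held, new_mrdsl)
new_exp_data_sn = beg_run + resent_pdus

print(original_exp_data_sn, new_exp_data_sn)   # prints "8 13"
```

The first status says ExpDataSN=8 while the resegmented burst ends at
DataSN 12, so that status cannot be used to validate the task and has to
be replaced.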

While I agree that last-minute changes could cause havoc, I remember 
discussing this issue with Julian several months ago.  This feature in 
general is described from rev09 onwards.  (I know that it's not an 
excuse for getting it wrong, but let's hope to fix it during the Last 
Call.)

To summarize, I still don't see the need for a new type of Data SNACK.

Thanks.
--
Mallikarjun

Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions
Hewlett-Packard MS 5668 
Roseville CA 95747
cbm@rose.hp.com


----- Original Message ----- 
From: <Black_David@emc.com>
To: <cbm@rose.hp.com>; <ips@ece.cmu.edu>
Sent: Wednesday, July 10, 2002 12:56 AM
Subject: RE: iSCSI: DLB-T.30 (using SNACK w/ PDU size changes)


> > You're right that there's a duplicate StatSN issue with the rev14 text.
> > 
> > I believe that can be addressed by mandating that the initiators must 
> > discard the first status PDU always for the task in question, and then
> > issue a follow-on status SNACK.  It may occasionally lead to discarding
> > a good status (with the right ExpDataSN that reflects the re-segmented
> > DataSN count), but that would anyway be recovered with the explicit
> > follow-on status SNACK.
> > 
> > I don't believe we need to make BegRun=0 for these cases, nor do I believe
> > that we need a new type of data SNACK.
> 
> Mallikarjun,
> 
> I don't think your proposal works very well.  One problem is that
> the initiator may not know how many data PDUs it should have received
> (and hence know whether it's missing some and needs to issue a Data SNACK)
> until it processes the status PDU (e.g., suppose the last Data-In PDU
> doesn't show up).  This appears to require receiving the status
> PDU, processing it, and retroactively dropping it when discovering
> that not all the data has arrived.  That doesn't strike me as a
> good design (e.g., it prohibits an acknowledge via ExpStatSN before
> the ExpDataSN processing).
> 
> There's also at least one weird corner case in here - suppose
> the initiator issues the Data SNACK and then changes
> MaxRecvDataSegmentLength before all the data from the SNACK has
> shown up.  Now the initiator would have to retroactively drop a status
> that it already acknowledged via ExpStatSN.  This is bad, and leads
> to one of several poor outcomes:
> 
> - Target fails to retransmit the response because the reused
> StatSN is less than ExpStatSN.  The initiator can't recover
> because it can't issue a status SNACK for StatSN < ExpStatSN.
> - Target ignores ExpStatSN and sends the response anyway.  If
> the response gets corrupted and has to be dropped, the
> initiator again can't issue a status SNACK to recover.
> - Initiator sends a new value of ExpStatSN to the target that is
> less than the target's current ExpStatSN.  That's a
> serious protocol error.
> 
> This is still an ugly corner case for a resegmenting Data SNACK
> because the size change results in the original Data SNACK stopping
> its transmission, and the initiator has to figure out that it needs
> to send a resegmenting Data SNACK - not unreasonable, as the initiator
> did change MaxRecvDataSegmentLength.  The alternative of the
> initiator not realizing the consequences of the size change and
> hence having the extra response show up and complete the wrong
> task courtesy of a reused Initiator Task Tag seems far worse.
> 
> I'm also reminded of John Hufferd's warning that features inserted
> late in a design tend to be the most error prone.  This one (Data
> SNACK in the face of MaxRecvDataSegmentLength change) seems to have
> survived two attempts to fix it, so isolating it to its own separate
> type of Data SNACK is making more and more sense, lest it survive
> more attempts to fix it ... that way a failed fix won't
> break ordinary Data SNACKs.  The fact that there are potentially
> subtle problems here was also behind my suggestion that the only
> acceptable resegmenting Data SNACK should be "Send Everything" -
> this should be a relatively rare case, and hence a brute force
> simple robust design is in order, even if it's inefficient.
> 
> Thanks,
> --David
> 
> 
> > -----Original Message-----
> > From: Mallikarjun C. [mailto:cbm@rose.hp.com]
> > Sent: Tuesday, July 09, 2002 6:49 PM
> > To: ips@ece.cmu.edu
> > Subject: iSCSI: DLB-T.30 (using SNACK w/ PDU size changes)
> > 
> > 
> > David,
> > 
> > You're right that there's a duplicate StatSN issue with the rev14 text.
> > 
> > I believe that can be addressed by mandating that the initiators must 
> > discard the first status PDU always for the task in question, and then
> > issue a follow-on status SNACK.  It may occasionally lead to discarding
> > a good status (with the right ExpDataSN that reflects the re-segmented
> > DataSN count), but that would anyway be recovered with the explicit
> > follow-on status SNACK.
> > 
> > I don't believe we need to make BegRun=0 for these cases, nor do I believe
> > that we need a new type of data SNACK.   
> > 
> > Mallikarjun
> > 
> > > [T.30] 9.16   SNACK Request
> > > 
> > >    If the initiator MaxRecvDataSegmentLength changed, Data-In PDUs 
> > >    requested with RunLength 0 (meaning all PDUs after this number) may 
> > >    be different from the ones originally sent, in order to reflect 
> > >    changes in MaxRecvDataSegmentLength. Their DataSN starts with the 
> > >    requested number and is increased by 1 for each resent Data-In PDU.
> > >    If DataSN numbers change and a SCSI-Response PDU was sent reflecting 
> > >    the DataSN before retransmission it MUST be resent to reflect the new
> > >    numbers.
> > > 
> > > This was discussed on the list, but there are still some problems here:
> > > (1) If the MaxRecvDataSegmentLength has changed, the only valid Data
> > > SNACK is BegRun=0, RunLength=0 (i.e., resend everything).  Attempts
> > > to be more clever than this are an invitation to miscount Data-In
> > > PDUs and cause problems in the initiator.  Targets MUST reject
> > > all other Data SNACK requests in this situation.
> > > (2) The new SCSI-Response PDU needs a new StatSN to avoid the initiator
> > > discarding it as a duplicate.  Section 2.2.2.2 is silent on duplicate
> > > detection for StatSN, but discarding duplicates would be a reasonable
> > > thing for an initiator to do.
> > > (3) The initiator needs some way to know that a new response is coming,
> > > and specifically whether to expect one or two responses.  If it
> > > only expects one and two show up, the initiator could reuse the
> > > Task Tag once all the data arrives causing a race in which the
> > > new response could incorrectly complete an unrelated command
> > > (unlikely, but potentially nasty).
> > > This suggests calling out the <BegRun=0, RunLength=0> Data SNACK as
> > > having special behavior:
> > > - It may resegment Data-In PDUs to deal with MaxRecvDataSegmentLength.
> > > All other Data SNACK requests MUST NOT resegment.
> > > - It *always* generates a new SCSI Response due to the possibility
> > > of resegmentation.
> > > That's not a great solution, because if one ever sets <BegRun=0,
> > > RunLength=0> in a Data SNACK, the resulting behavior change is dramatic 
> > > and unexpected.
> > > This leads to the final proposal:
> > > - Specify a new SNACK type code (3) for Resegmenting Data SNACK.  SNACK
> > > Data-In resegmentation is allowed only when this is used.  If
> > > resegmentation would be necessary for a Data SNACK (type 1),
> > > that SNACK MUST be rejected.
> > > - Both BegRun and RunLength MUST be zero for a Resegmenting Data
> > > SNACK, and (unlike reserved fields) these MUST be checked by
> > > the receiver (target).
> > > - A new SCSI Response is always generated as a result of a Resegmenting
> > > Data-In SNACK, and it has its own StatSN number to deal with the
> > > fact that the number of Data-In PDUs may have changed, causing
> > > a change to the ExpDataSN value.  This new response also needs
> > > to be marked to distinguish it from a response that may have
> > > been generated earlier (so the initiator knows to wait for the
> > > new response) - using a bit in the flags field for this seems
> > > wrong, so specifying a new Response code value (0x02 - see 9.4.3)
> > > seems like a reasonable way to accomplish this.
> > > - Data SNACK (type 1) now has consistent behavior - it MUST NOT
> > > resegment and MUST NOT generate a new SCSI response, ever.
> > > This approach also has the potentially useful property of making it easy
> > > to yank out the Resegmenting Data SNACK wart if we ever put restrictions
> > > on the interaction of MaxRecvDataSegmentLength and Data SNACKs (yes,
> > > that's a hint ... this has gotten messy enough that forbidding Data
> > > SNACKs when MaxRecvDataSegmentLength has changed needs to be considered
> > > as a possible alternative).
> > > 
> > > This issue also affects some text in 9.16.3.
> > 
> >   
> > 
>