RE: iSCSI: need for new data SNACK code?

To: cbm@rose.hp.com
Subject: RE: iSCSI: need for new data SNACK code?
From: Black_David@emc.com
Date: Thu, 11 Jul 2002 23:42:04 -0400
Cc: ips@ece.cmu.edu
Content-Type: text/plain;charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Mallikarjun,

> > The new code
> > provides the initiator with a more robust way to detect resegmentation
> > by requiring the initiator to explicitly ask for it.  The initiator
> > can take a simple approach of always starting with the existing
> > Data SNACK code that does not resegment and only using the new code
> > when the non-resegmenting SNACK doesn't work.  
> 
> That's one approach.  So, let's note that you're expecting the target
> to maintain a "PDU-size-changed" flag for every active task, and 
> expecting it to fail the "regular" data SNACK if the flag is set.

The flag per task is not needed - I'd expect the Target to look
at the Data PDUs it would have to resend, check them against the max
Data PDU size for this connection and fail the regular SNACK if any
PDU is too large.  This could even be lazily evaluated at the time
each PDU is to be sent because the odds are that if resegmenting
is necessary, the first Data PDU to be resent is going to need it.
The initiator still has to time out waiting for all of the Data
PDUs to arrive in order to deal with header corruption (unfortunately,
the F bit doesn't help).  When it times out the "regular" Data SNACK,
it issues a "resegmenting" one; this also deals with resegmenting that
becomes necessary after the first Data PDU for a Data SNACK has
been sent.
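A minimal sketch of the lazy check described above, in Python.  The names
(DataPdu, SnackRejected, resend_without_resegmenting) are made up for
illustration and are not from the draft; the sketch assumes the target
still has the original PDU boundaries of the data it would resend.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class DataPdu:
    """Hypothetical record of one previously sent Data PDU."""
    data_sn: int
    offset: int
    length: int


class SnackRejected(Exception):
    """Raised when a non-resegmenting Data SNACK cannot be honored."""


def resend_without_resegmenting(
    pdus: Iterable[DataPdu], max_data_len: int
) -> Iterator[DataPdu]:
    """Yield PDUs to resend, checking each one only when it is about to go out.

    No per-task flag is kept: the first PDU that exceeds the current
    max Data PDU size for the connection fails the regular SNACK.
    """
    for pdu in pdus:
        if pdu.length > max_data_len:
            raise SnackRejected(
                f"DataSN {pdu.data_sn} is {pdu.length} bytes, limit is {max_data_len}"
            )
        yield pdu
```

Because the check is lazy, PDUs ahead of the oversized one may already
have been resent when the failure is raised, which is why the initiator
still needs its timeout and the follow-up resegmenting SNACK.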

> Some issues -
>     a) would it really cover the (impossible, IMHO) case you're attempting
>         to cover, in the face of multiple PDU size changes?

Should work - permission to resegment includes permission to
re-resegment or worse.  Independent of how many size changes
happen, the status SNACK at the end returns a new status with
an ExpDataSN that reflects the right number of Data PDUs sent.
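
As a rough illustration of why the count still comes out right, here is
a Python sketch in which the size limit may change between PDUs.  The
function name and the limit_for_pdu callback are hypothetical, and
numbering the resent run from 0 is an assumption of the sketch rather
than something taken from the draft.

```python
from typing import Callable, List, Tuple


def resend_resegmented(
    total_len: int, limit_for_pdu: Callable[[int], int]
) -> Tuple[List[Tuple[int, int, int]], int]:
    """Cut total_len bytes into (data_sn, offset, length) PDUs.

    limit_for_pdu(n) returns the max Data PDU size in force when the
    n-th PDU is built, so the limit can change any number of times.
    The second return value is the PDU count a new status would
    report as ExpDataSN in this model.
    """
    pdus: List[Tuple[int, int, int]] = []
    offset = 0
    while offset < total_len:
        limit = limit_for_pdu(len(pdus))
        if limit <= 0:
            raise ValueError("max Data PDU size must be positive")
        length = min(limit, total_len - offset)
        pdus.append((len(pdus), offset, length))
        offset += length
    return pdus, len(pdus)
```

However often the limit changes, the count is taken after the fact from
the PDUs actually built, so it cannot drift from what was sent.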
 
>     b) assuming that there indeed is a disconnect b/n the two,
>         what should the target do when a resegmenting data SNACK
>         is received, but there's no PDU size change?  I hope you
>         aren't mandating the specific approach.

Resegmenting SNACK is "permission to resegment", and the target need
not use that permission.  If the permission is not used, the Initiator's
status SNACK is not needed but does no harm.
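
One way to picture "permission, not obligation" is the following Python
sketch.  plan_resend and its parameters are invented for illustration;
it works on PDU lengths only and assumes the target knows the lengths it
originally used.

```python
from typing import List, Optional


def plan_resend(
    lengths: List[int], max_data_len: int, may_resegment: bool
) -> Optional[List[int]]:
    """Return the PDU lengths to resend, or None if the SNACK must fail.

    may_resegment is True for the resegmenting SNACK.  PDUs that already
    fit are sent unchanged, so the permission is simply not used when no
    size change has made a PDU too large.
    """
    if max_data_len <= 0:
        raise ValueError("max Data PDU size must be positive")
    plan: List[int] = []
    for length in lengths:
        if length <= max_data_len:
            plan.append(length)          # fits as-is, nothing to do
        elif may_resegment:
            while length > 0:            # split the oversized PDU
                chunk = min(length, max_data_len)
                plan.append(chunk)
                length -= chunk
        else:
            return None                  # regular SNACK cannot be honored
    return plan
```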

>     c) this approach costs one additional round-trip delay, where
>         none is necessary (as argued below).

I make no apologies for spending a round trip to remove a data
corruption risk.  This is a rare case with a possibly nasty failure
mode - I'm much more interested in this working right than fast.

>     d) seems like it would need new Reject code(s) to distinguish
>         a "regular" reject from that of the PDU size change ones.

Could be useful, but is not strictly necessary.

> >If the target makes
> > its own choice to resegment, and the initiator doesn't think the
> > target resegmented, 
> 
> Now this is beginning to feel more like the option A vs B vs C debate
> we had a while ago.  If the protocol works correctly, both sides would
> be *completely synchronized* on the fact of PDU size change.

As the complexity of a protocol increases, that synchronized
state machine assumption becomes more prone to failure.  The
whole discussion of default values for text keys and the resulting
"if in doubt, negotiate it" maxim was one example.  The alternative
of relying on every default key value to be what was expected was
significantly less robust.

> There are two options for initiators to deal with this - 
> 
> a) don't issue any data SNACKs while any text negotiation is 
>     in progress - wait till the text response is received successfully.

That strikes me as a productive direction that I could see enforcing
with some "MUST"s / "MUST NOT"s - the initiator is causing this mess
by changing the max Data PDU size on the connection.  This is not a
friendly thing to do, and it is unrealistic for an initiator to expect
to be able to do this with uninterrupted high performance ... so imposing
costs on the initiator for making this disruptive size change makes sense.

Suppose we went back to the old approach where Data SNACKs *never*
resegment and required that:
- Initiators MUST NOT issue Data SNACKs that could require
	resegmentation.
- Targets MUST reject or ignore Data SNACKs that require
	resegmentation.
- If resegmentation becomes necessary during retransmission
	of Data PDUs for a Data SNACK, PDU retransmission
	MUST cease for that Data SNACK.
An initiator that wants to be able to issue a Data SNACK for
some or all of its commands then has to ensure that no such
commands are outstanding when/while it changes (in particular
reduces) the max Data PDU size.  In the limit, the initiator
has to wait for all of its commands on the connection to
complete before changing the max Data PDU size, and not
start any new ones until the size change is complete.
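
The limiting case can be sketched on the initiator side roughly as
follows (Python; the Connection class and its method names are
hypothetical, and the sketch takes the strict route of treating every
outstanding command as one that might need a Data SNACK):

```python
class Connection:
    """Hypothetical initiator-side bookkeeping for one connection."""

    def __init__(self, max_data_len: int) -> None:
        self.max_data_len = max_data_len
        self.outstanding = set()   # task tags of commands in flight
        self.resizing = False      # True while the size negotiation runs

    def start_command(self, tag: int) -> None:
        if self.resizing:
            raise RuntimeError("size change in progress, hold new commands")
        self.outstanding.add(tag)

    def complete_command(self, tag: int) -> None:
        self.outstanding.discard(tag)

    def begin_resize(self) -> None:
        # The initiator pays the cost: it must drain the connection first.
        if self.outstanding:
            raise RuntimeError("commands outstanding, cannot change size yet")
        self.resizing = True

    def finish_resize(self, new_max_data_len: int) -> None:
        if not self.resizing:
            raise RuntimeError("no size change in progress")
        self.max_data_len = new_max_data_len
        self.resizing = False
```

With this discipline no command ever spans a size change, so no Data
SNACK issued on the connection can require resegmentation.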

This is simpler and more robust than any of the options under
discussion and has the right sort of incentives in discouraging
initiators from changing the max Data PDU size.

Can you accept this? I would expect widespread support for
the resulting removal of target resegmentation from iSCSI.

I will however answer one more question ...

> > This requires additional Initiator
> > state per command for something that almost never happens, and if it
> > gets one of these markings wrong, 
> 
> Sorry, how is it different from the target getting wrong one of its
> aforementioned "PDU-size-changed" flags for tasks?

(1) The per-task flags aren't needed - see above.
(2) The failure is harmless - if the target fails to resegment and
	sends a PDU that is too large, the initiator discards it,
	and then decides what to do about the broken target.  There's
	no possibility of completing a READ command without all of its
	data.
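
To make point (2) concrete, a small Python sketch of the initiator's
receive side.  read_completes is an invented name, and the sketch
assumes the PDUs it is given do not overlap.

```python
from typing import Iterable, Tuple


def read_completes(
    pdus: Iterable[Tuple[int, int]], max_data_len: int, expected_len: int
) -> bool:
    """Return True only if every expected byte of the READ arrived.

    Each PDU is an (offset, length) pair.  A PDU larger than the
    negotiated limit is discarded, so its bytes are never counted and
    the command cannot complete on the strength of it.
    """
    received = 0
    for _offset, length in pdus:
        if length > max_data_len:
            continue  # oversized PDU from a target that failed to resegment
        received += length
    return received == expected_len
```

The discarded PDU leaves a hole in the data, which the initiator sees as
an incomplete READ rather than as silent success.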

Thanks,
--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 249-6449            FAX: +1 (508) 497-8018
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------


> -----Original Message-----
> From: Mallikarjun C. [mailto:cbm@rose.hp.com]
> Sent: Thursday, July 11, 2002 7:46 PM
> To: Black_David@emc.com
> Cc: ips@ece.cmu.edu
> Subject: Re: iSCSI: need for new data SNACK code?
> 
> 
> David, comments in text.
> 
> > I disagree with the "careful enough" characterization.  The new code
> > provides the initiator with a more robust way to detect resegmentation
> > by requiring the initiator to explicitly ask for it.  The initiator
> > can take a simple approach of always starting with the existing
> > Data SNACK code that does not resegment and only using the new code
> > when the non-resegmenting SNACK doesn't work.  
> 
> That's one approach.  So, let's note that you're expecting the target
> to maintain a "PDU-size-changed" flag for every active task, and
> expecting it to fail the "regular" data SNACK if the flag is set.
> 
> Some issues -
>     a) would it really cover the (impossible, IMHO) case you're attempting
>         to cover, in the face of multiple PDU size changes?
>     b) assuming that there indeed is a disconnect b/n the two,
>         what should the target do when a resegmenting data SNACK
>         is received, but there's no PDU size change?  I hope you
>         aren't mandating the specific approach.
>     c) this approach costs one additional round-trip delay, where
>         none is necessary (as argued below).
>     d) seems like it would need new Reject code(s) to distinguish
>         a "regular" reject from that of the PDU size change ones.
> 
> >If the target makes
> > its own choice to resegment, and the initiator doesn't think the
> > target resegmented, 
> 
> Now this is beginning to feel more like the option A vs B vs C debate
> we had a while ago.  If the protocol works correctly, both sides would
> be *completely synchronized* on the fact of PDU size change.  
> 
> There are two options for initiators to deal with this - 
> 
> a) don't issue any data SNACKs while any text negotiation is 
>     in progress - wait till the text response is received successfully.
> 
> OR
> 
> b) issue a data SNACK regardless, and if the text response (that
>     indicates a PDU size change) arrives before the data burst
>     completes, discard the status PDU, and ask for its retransmission.
> 
> Option a is what I suggest, and b is for the adventurous sort.
> 
> >there are error scenarios that combine this with
> > corrupt Data PDU headers to cause the initiator to successfully
> > complete a SCSI command that has not delivered all its data
> > (the resegmented PDUs caused the Data PDU count to match the
> > ExpDataSN value in the response that should have been discarded,
> > but wasn't).
> 
> Which is precisely why I'm suggesting that we mandate discarding the 
> status PDU.  What am I missing?
> 
> > While these should be rare, their consequences can be catastrophic.
> > 
> > It is conveying the Initiator's instructions that resegmentation is
> > permitted.  I am not comfortable with the last sentence above that
> > assumes that the Initiator and Target will always have identical
> > views of all of the effects of a full feature phase PDU size change -
> > (which is a rare event to begin with, and hence likely to involve
> > code that isn't well exercised/tested).
> 
> Obviously, I cannot guarantee the lack of bugs in any implementation.
> But again, let's not attempt to address implementation bugs by protocol
> means (that's why we picked option A in the A vs B vs C debate I
> referred to above - see the "reusing ISID for recovery" thread; it's
> for the same reason we removed the X-bit for connection reinstatement -
> see the "X-bit in Login" thread).
> 
> > 
> > > The only two changes from the rev14 text that I propose are that
> > > we add:
> > >
> > >    a) The first status PDU must always be dropped after a
> > > MaxRecvDataSegmentLength change, if ever a data SNACK is
> > > employed for the task.
> > 
> > When does this obligation to drop the first status PDU expire?  
> 
> As it says: when the first status PDU is dropped for the task - for
> each active task during a PDU size change, *and* for which a data
> SNACK is/was issued.
> 
> >I think
> > the Initiator has to mark all commands that are outstanding or become
> > outstanding between the time it starts the negotiation that changes
> > MaxRecvDataSegmentLength and the time that it gets the final Text
> > Response of that negotiation from the target.  This requires additional
> > Initiator state per command for something that almost never happens,
> > and if it gets one of these markings wrong, 
> 
> Sorry, how is it different from the target getting wrong one of its
> aforementioned "PDU-size-changed" flags for tasks?
> 
> I believe that the onus should be on the initiator to do what it takes
> to do the right recovery - as is the general error recovery philosophy
> everywhere.  Target cannot predict if the initiator would be interested
> in recovering a particular I/O (regardless of the operational
> ErrorRecoveryLevel).
> 
> >it's vulnerable to failing to deliver
> > all the data for a SCSI command in a compound error situation.  An
> > alternative with the new code could involve a single bit per connection
> > that records whether the PDU size was ever changed (if so, retry any
> > failed Data SNACK as a resegmenting Data SNACK). 
> 
> Or, use just "the Data SNACK", if we define only one.  I can't see why
> this optimization needs two data SNACK codes.
> 
> > 
> > > Initiator MUST issue a status SNACK to recover the
> > > status PDU (i.e. move the onus of retransmitting
> > > status from the target to the initiator).
> > >     b) A SNACK requesting an R2T, Data or Status PDU for a task
> > >           MUST be issued before the status for the task is acknowledged.
> > 
> > I have no problem with these two.
> > 
> > > I'll be glad to see any technical reasons that I am 
> > > overlooking, that require two codes.
> > 
> > See above.  This is somewhat analogous to the "if in doubt, negotiate
> > it" principle for login - telling the other side *exactly* what is
> > wanted is more robust than assuming that it will do what is wanted,
> > and in this resegmenting Data SNACK case, there are potentially nasty
> > consequences to an incorrect assumption.  Does this make any sense?
> 
> I see what you're trying to get at.  However, IMHO, there is no
> "assuming" involved here.  If the protocol works right, it should do
> the right thing.  Or else, we are in serious trouble despite this change.
> 
> Regards.
> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions
> Hewlett-Packard MS 5668 
> Roseville CA 95747
> cbm@rose.hp.com
> 
>