    Re: iSCSI: need for new data SNACK code?



    David, comments in text.
    
    > I disagree with the "careful enough" characterization.  The new code
    > provides the initiator with a more robust way to detect resegmentation
    > by requiring the initiator to explicitly ask for it.  The initiator
    > can take a simple approach of always starting with the existing
    > Data SNACK code that does not resegment and only using the new code
    > when the non-resegmenting SNACK doesn't work.  
    
    That's one approach.  So, let's note that you're expecting the target to maintain 
    a "PDU-size-changed" flag for every active task, and expecting it to fail the 
    "regular" data SNACK if the flag is set.
    
    Some issues -
        a) would it really cover the (impossible, IMHO) case you're attempting
            to cover, in the face of multiple PDU size changes? 
        b) assuming that there indeed is a disconnect between the two, what should the
            target do when a resegmenting data SNACK is received, but there's no
            PDU size change?  I hope you aren't mandating the specific approach.
        c) this approach costs one additional round-trip delay, where none is
            necessary (as argued below).
        d) seems like it would need new Reject code(s) to distinguish a "regular" reject
            from that of the PDU size change ones.
    
    > If the target makes
    > its own choice to resegment, and the initiator doesn't think the
    > target resegmented, 
    
    Now this is beginning to feel more like the option A vs B vs C debate
    we had a while ago.  If the protocol works correctly, both sides would
    be *completely synchronized* on the fact of PDU size change.  
    
    There are two options for initiators to deal with this - 
    
    a) don't issue any data SNACKs while any text negotiation is in progress - 
        wait till the text response is received successfully.
    
    OR
    
    b) issue a data SNACK regardless, and if the text response (that indicates 
        a PDU size change) arrives before the data burst completes, discard the 
        status PDU, and ask for its retransmission.
    
    Option a is what I suggest, and b is for the adventurous sort.
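
Option a can be sketched as follows. This is only an illustration under stated assumptions: the `Connection` class, its fields, and its method names are hypothetical and do not come from the iSCSI draft, and the sketch assumes the initiator may simply hold a data SNACK request back until the text negotiation finishes. What the initiator then asks for, given the new PDU size, is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Hypothetical initiator-side connection state (illustrative only)."""
    negotiation_in_progress: bool = False
    deferred_snacks: list = field(default_factory=list)  # task tags held back
    sent_snacks: list = field(default_factory=list)      # task tags SNACKed

    def start_text_negotiation(self):
        # From here on, a MaxRecvDataSegmentLength change may be in flight.
        self.negotiation_in_progress = True

    def request_data_snack(self, task_tag):
        """Send a data SNACK now, or defer it while negotiating (option a).

        Returns True if the SNACK was sent, False if it was deferred.
        """
        if self.negotiation_in_progress:
            self.deferred_snacks.append(task_tag)
            return False
        self.sent_snacks.append(task_tag)
        return True

    def on_final_text_response(self):
        """Negotiation finished: release the deferred SNACK requests."""
        self.negotiation_in_progress = False
        pending, self.deferred_snacks = self.deferred_snacks, []
        self.sent_snacks.extend(pending)
        return pending
```

With this shape, no data SNACK is ever on the wire while the two sides could disagree about the PDU size, which is the point of option a.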
    
    > there are error scenarios that combine this with
    > corrupt Data PDU headers to cause the initiator to successfully
    > complete a SCSI command that has not delivered all its data
    > (the resegmented PDUs caused the Data PDU count to match the ExpDataSN
    > value in the response that should have been discarded, but wasn't).
    
    Which is precisely why I'm suggesting that we mandate discarding the 
    status PDU.  What am I missing?
    
    > While these should be rare, their consequences can be catastrophic.
    > 
    > It is conveying the Initiator's instructions that resegmentation is
    > permitted.  I am not comfortable with the last sentence above that assumes
    > that the Initiator and Target will always have identical views of all of
    > the effects of a full feature phase PDU size change - (which is
    > a rare event to begin with, and hence likely to involve code that
    > isn't well exercised/tested).
    
    Obviously, I cannot guarantee the lack of bugs in any implementation.
    But again, let's not attempt to address implementation bugs by protocol
    means (that's why we picked option A in the A vs B vs C debate I 
    referred to above - see the "reusing ISID for recovery" thread; it's for the 
    same reason we removed the X-bit for connection reinstatement -
    see the "X-bit in Login" thread).
    
    > 
    > > The only two changes from the rev14 text that I propose are that we add:
    > >
    > >    a) The first status PDU must always be dropped after a
    > > MaxRecvDataSegmentLength change, if ever a data SNACK is
    > > employed for the task.
    > 
    > When does this obligation to drop the first status PDU expire?  
    
    As it says: when the first status PDU is dropped for the task - for each 
    active task during a PDU size change, *and* for which a data SNACK 
    is/was issued.
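
The expiry rule above can be sketched as a small piece of per-task initiator state. This is a hedged illustration, not draft text: the `Task` fields and the `handle_status_pdu` function are hypothetical names, and the sketch assumes a dropped status PDU is subsequently recovered with a status SNACK.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical per-task initiator state (illustrative only)."""
    active_during_size_change: bool = False  # task was active at a PDU size change
    data_snack_issued: bool = False          # a data SNACK is/was issued for it
    first_status_dropped: bool = False       # the obligation was already met


def handle_status_pdu(task):
    """Return "drop" if this status PDU must be discarded (and re-requested
    via a status SNACK), or "accept" otherwise."""
    must_drop = (
        task.active_during_size_change
        and task.data_snack_issued
        and not task.first_status_dropped
    )
    if must_drop:
        # The obligation expires once the first status PDU has been dropped.
        task.first_status_dropped = True
        return "drop"
    return "accept"
```

A task that meets both conditions drops exactly one status PDU; every other task accepts its status as usual.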
    
    > I think
    > the Initiator has to mark all commands that are outstanding or become
    > outstanding between the time it starts the negotiation that changes
    > MaxRecvDataSegmentLength and the time that it gets the final Text Response
    > of that negotiation from the target.  This requires additional Initiator
    > state per command for something that almost never happens, and if it
    > gets one of these markings wrong, 
    
    Sorry, how is that different from the target getting one of its aforementioned
    per-task "PDU-size-changed" flags wrong?
    
    I believe that the onus should be on the initiator to do what it takes to 
    do the right recovery - as is the general error recovery philosophy everywhere.
    The target cannot predict whether the initiator would be interested in recovering a
    particular I/O (regardless of the operational ErrorRecoveryLevel).
    
    > it's vulnerable to failing to deliver
    > all the data for a SCSI command in a compound error situation.  An
    > alternative with the new code could involve a single bit per connection
    > that records whether the PDU size was ever changed (if so, retry any
    > failed Data SNACK as a resegmenting Data SNACK). 
    
    Or, use just "the Data SNACK", if we define only one.  I can't see why this
    optimization needs two data SNACK codes.
    
    > 
    > > Initiator MUST issue a status SNACK to recover the
    > > status PDU (i.e. move the onus of retransmitting
    > > status from the target to the initiator).
    > >     b) A SNACK requesting an R2T, Data or Status PDU for a task MUST be 
    > >           issued before the status for the task is acknowledged.
    > 
    > I have no problem with these two.
    > 
    > > I'll be glad to see any technical reasons that I am 
    > > overlooking, that require two codes.
    > 
    > See above.  This is somewhat analogous to the "if in doubt, negotiate
    > it" principle for login - telling the other side *exactly* what is wanted
    > is more robust than assuming that it will do what is wanted, and in
    > this resegmenting Data SNACK case, there are potentially nasty
    > consequences to an incorrect assumption.  Does this make any sense?
    
    I see what you're trying to get at.  However, IMHO, there is no "assuming"
    involved here.  If the protocol works right, it should do the right thing.  Or else, 
    we are in serious trouble despite this change.
    
    Regards.
    --
    Mallikarjun
    
    Mallikarjun Chadalapaka
    Networked Storage Architecture
    Network Storage Solutions
    Hewlett-Packard MS 5668 
    Roseville CA 95747
    cbm@rose.hp.com
    
    
    

