
    Re: iSCSI : Digest Error Problems & CmdSN/ExpCmdSN window issues



    
    
    Santosh,
    
    The trouble with forbidding a certain behavior is that you have to enforce
    it (i.e.,  check and signal errors for units that do not behave).   Besides
    - the whole philosophy of the SCSI set of protocols is that the target is
    the master and the initiator should let the target decide how to fulfill
    the command.   That is why we chose not to impose restrictions above those
    imposed by SCSI.  The whole set of issues is raised only because we
    also provide for storage proxies - otherwise a stronger checksum at TCP
    level and recovery at TCP level would have done what we wanted, and recovery
    of the type we are now dealing with would have been done at TCP level.
    I am confident that we can reinstate DataSN as a simple means to sequence
    (not ack) data packets and considerably simplify recovery.
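
    As a rough illustration of DataSN used purely as a sequencer, the sketch
    below finds dropped data PDUs once status arrives. It assumes, for the
    sake of the example only, that DataSN starts at 0 for each command, that
    it increases by 1 per data PDU, and that the initiator learns the number
    of data PDUs sent at status time; the names missing_data_sns and
    data_pdus_sent are hypothetical and not taken from the draft.

```python
def missing_data_sns(received_sns, data_pdus_sent):
    """Return the DataSN values that never arrived for one command.

    Assumptions (illustrative, not from the draft): numbering starts at 0,
    increases by 1 per data PDU, and data_pdus_sent is known at status time.
    """
    received = set(received_sns)
    return [sn for sn in range(data_pdus_sent) if sn not in received]


# Four data PDUs were sent; the one numbered 2 was discarded on a digest error.
print(missing_data_sns([0, 1, 3], 4))
```

    An empty result means the sequence is complete; any other result keeps
    recovery within the confines of the one affected command.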
    
    And do not forget that raising the error up to ULP with a service response
    will make the recovery far more expensive (as Prasenjit has already stated)
    - far more than current wedge drivers do as these rarely consider commands
    in flight and the need to keep order in a target that is not yet aware that
    something went wrong.
    
    Julo
    
    Santosh Rao <santoshr@cup.hp.com> on 28/01/2001 00:07:08
    
    Please respond to Santosh Rao <santoshr@cup.hp.com>
    
    To:   Julian Satran/Haifa/IBM@IBMIL
    cc:   ips@ece.cmu.edu
    Subject:  Re: iSCSI : Digest Error Problems & CmdSN/ExpCmdSN window issues
    
    
    
    
    Julian,
    
    The missing Data PDU could be detected if the initiator were to
    perform a count check operation upon receiving SCSI Response PDU,
    along the lines of :
    
    no. of bytes xfer'ed =
         (Expected Data Xfer Length) - (Basic Residual Count)
    
    where,
    Expected Data Xfer Length -> as specified in SCSI Command PDU
    Basic Residual Count -> as specified in SCSI Response PDU
    
    However, this is currently not possible due to overlapped data
    transfers being allowed by iSCSI. If iSCSI were to dis-allow
    overlapping data xfer's and initiators used a count check
    [as is done in FC], this would also address the problem.
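
    The count check above can be sketched as follows. This is a minimal
    illustration, assuming data PDUs never overlap; the function names and
    the received_pdu_lengths parameter are hypothetical.

```python
def expected_bytes_at_status(expected_data_xfer_length, basic_residual_count):
    """Bytes that should have been transferred, per the formula above."""
    return expected_data_xfer_length - basic_residual_count


def count_check(expected_data_xfer_length, basic_residual_count,
                received_pdu_lengths):
    """Return True if the bytes counted match what the response implies.

    Only meaningful when data PDUs never overlap: with overlapping
    transfers the same byte range can be counted twice and mask a loss.
    """
    received = sum(received_pdu_lengths)
    return received == expected_bytes_at_status(
        expected_data_xfer_length, basic_residual_count
    )


# 8192 bytes requested, no residual reported.
print(count_check(8192, 0, [4096, 4096]))  # both Data PDUs arrived
print(count_check(8192, 0, [4096]))        # one Data PDU was lost
```

    If two PDUs of 4096 bytes both covered offset 0, the check would still
    pass although half of the data is missing, which is why overlapped
    transfers defeat it.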
    
    
    Regards,
    Santosh
    
    >
    >
    >
    > If the header is a data header we can hardly trust the ULP to recognize
    > the error (it might be unaware of a missing packet).  With data numbering
    > this situation could have been discovered at "status time".
    > The only thing we could do is restart all commands but this is equivalent
    > to a connection restart for all practical purposes.  Dropping data
    > numbering might have some more "side-effects" like this.
    > As the combination of values - tag, address, offset may still let some
    > implementations assume that they have
    > a correct task identifier I don't see a point in mandating a recovery
    > behavior and the implementer may choose to:
    >
    > -retry/restart command
    > -logout drop and rebuild connection login and restart/retry
    > -abort all task sets (practically reset the target!) and report for all
    > commands a "delivery system failure" (kick-in the ULP recovery) and if you
    > suspect the link quality rebuild it; this latter behavior means also that
    > you have to stop delivering anything on any link to the target to avoid
    > out of order execution until you have finished the cleanup - pretty drastic
    >
    > With data numbering recovery could have stayed within the confines of a
    > command even if a header was bad.
    > Perhaps we should leave the DataSN only as a sequencer so that at
    > status-time the initiator should be able to find if a data packet was
    > dropped (no ExpDataSN on a NOP).
    >
    > Regards,
    > Julo
    >
    >
    >
    >
    > Michael Krause <krause@cup.hp.com> on 27/01/2001 04:59:12
    >
    > Please respond to Michael Krause <krause@cup.hp.com>
    >
    > To:   Julian Satran/Haifa/IBM@IBMIL
    > cc:   ips@ece.cmu.edu
    > Subject:  Re: iSCSI : Digest Error Problems & CmdSN/ExpCmdSN window issues
    >
    >
    >
    >
    > At 07:40 PM 1/25/2001 +0200, julian_satran@il.ibm.com wrote:
    >
    >
    > >1) The initiator task tag cannot be trusted when a header digest error
    > >is seen. What does the phrase "provided it can recognize the initiator
    > >task tag" mean ?
    > >How can an initiator reliably claim that the initiator task tag is
    > >trustworthy ?
    > >
    > ><js> an initiator may choose to provide some redundancy in the tag itself
    > ></js>
    >
    > I'm aware of some techniques for inserting redundant information in tags
    > which limit the potential error exposure when a multi-bit error occurs;
    > however, these are not fail-safe, leading to potential incorrect operation -
    > perhaps benign in many cases; perhaps not in others. As such, if a header
    > digest error occurs, the PDU should be silently discarded and recovery
    > should be left to the ULP.  There is little to no value in having two
    > mechanisms to solve the same problem.
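
    To make the "not fail-safe" point concrete, here is one possible way to
    put redundancy in a tag. The layout is purely an assumption for
    illustration (a 24-bit index plus an 8-bit XOR check byte in a 32-bit
    tag); nothing in the thread specifies a scheme, and make_tag and
    tag_looks_valid are hypothetical names.

```python
def make_tag(index):
    """Build a 32-bit tag: 24-bit index in the high bits, XOR check byte low."""
    if not 0 <= index < (1 << 24):
        raise ValueError("index must fit in 24 bits")
    check = (index ^ (index >> 8) ^ (index >> 16)) & 0xFF
    return (index << 8) | check


def tag_looks_valid(tag):
    """Recompute the check byte from the index and compare with the tag."""
    index = (tag >> 8) & 0xFFFFFF
    return make_tag(index) == (tag & 0xFFFFFFFF)


tag = make_tag(0x123456)
print(tag_looks_valid(tag))            # undamaged tag
print(tag_looks_valid(tag ^ 0x00100))  # single-bit error in the index
print(tag_looks_valid(tag ^ 0x10100))  # two-bit error that cancels in the XOR
```

    The single-bit error is caught, but the two-bit error flips the same bit
    position in two index bytes, so the check byte is unchanged and a wrong
    tag is accepted as valid.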
    >
    > Mike
    >
    >
    >
    >
    >
    
    
    --
    #################################
    Santosh Rao
    Software Design Engineer,
    HP, Cupertino.
    email : santoshr@cup.hp.com
    Phone : 408-447-3751
    #################################
    
    
    
    



Last updated: Tue Sep 04 01:05:39 2001