
    Re: iSCSI: CmdSN and Retry



    YP,
    
    Further discussion below.
    
    Regards,
    Santosh
    
    Y P Cheng wrote:
    
    > A few points herein would help reduce the confusion of ordered delivery
    > using CmdSN and error retry.
    >
    > Suppose the initiator sends both commands A and B to a target and requires
    > B to follow A.  After completing A with an error condition, nothing prevents
    > the target from starting B immediately.  Therefore, what do we accomplish by
    > retrying A?
    
    The goal of the retry is to complete A. Not having a policy of retries will
    result in flaky I/Os with failures being propagated to the user application.
    Almost every storage stack does have SOME form of retries on failures, unless
    the failure is clearly a fatal problem and retry is known not to help.
    
    If your concern is the re-ordering of execution caused by the retry, then
    specific scenarios need to be discussed:
    a) Failure of A at the target [if A is part of an ordered set of commands being
    processed] should be accompanied by the target raising a CHECK CONDITION and
    entering the ACA state. This then requires complex initiator recovery to
    restore the order of I/Os being executed. None of this behaviour exists in
    most/all SCSI stacks today, and that is one of the reasons ordering is
    enforced by the layer above the SCSI ULP.
    
    b) If the failure occurred while the initiator was parsing the response sent
    for A [e.g. format error, digest error], this is NOT a problem IF the target
    implements data/status recovery. In other words, such a problem will not
    result in re-ordering of execution, since the original execution completed in
    the order desired and the "retry" only returns data from the iSCSI layer's
    buffers.
    
    However, if the failure was detected at the initiator AND the target does NOT
    implement data/status recovery, then the ordering is truly messed up.
    
    
    > In every SCSI implementation I know of, the initiator is always
    > ultimately responsible for understanding the ordered execution as well as
    > retry.
    
    When you say initiator, do you mean the SCSI transport layer, the SCSI ULP,
    or the layer above the ULP? Ordered execution is typically managed by the
    layer above the SCSI ULP. The retry policy is handled by the SCSI ULP and could
    be complemented by the SCSI transport if it does internal retries.
    
    > Therefore, the retry corner cases pointed out by Santosh simply will not
    > exist with a properly behaved initiator, unless an iSCSI initiator will
    > behave differently.
    
    Not sure which corner cases you are referring to. (Several have been pointed
    out.) Can you please state specific scenarios to illustrate which corner case
    is not likely to occur?
    
    > Since TCP ensures ordered delivery, for an iSCSI session with
    > multiple TCP connections, all we need to do is to ensure the sequentiality
    > of CmdSN from multiple connections.
    
    Ideally, if all the traffic used simple task tags, even the above would not
    be necessary. If the traffic DID contain ordered-tag I/Os [and the
    application may be assuming end-to-end ordering], the above by itself is not
    sufficient. If the ordering did get messed up due to a failure at either the
    target or the initiator, then complex recovery schemes must be resorted to,
    and in some cases even those may not help.
    
    
    > I am not aware of any SCSI target that allocates resources for the status
    > phase.  Once the status is sent, all resources are released.  If the
    > initiator times out the status, it retries the whole command.
    
    What is the definition of "sent"? The target cannot release its resources
    until TCP ACKs for all the octets of the Status PDU have been received. Since
    this mapping between TCP ACKs and an entire Status PDU is difficult to
    achieve, the target must hold onto its resources until the StatSN ACK is
    received.
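
The retention rule above can be sketched as follows. This is a minimal illustration, not target code from any real stack: the class and method names (`StatusRetainer`, `send_status`, `on_ack`, `resend`) are hypothetical, and it assumes the initiator acknowledges status by reporting the next StatSN it expects, so every StatSN below that value counts as acknowledged. Sequence-number wraparound is ignored.

```python
class StatusRetainer:
    """Toy target-side store that keeps each status until it is acknowledged."""

    def __init__(self):
        self.next_stat_sn = 1
        self.unacked = {}  # StatSN -> status retained for a possible re-send

    def send_status(self, status):
        """Assign the next StatSN and retain the status; 'sent' is not 'received'."""
        stat_sn = self.next_stat_sn
        self.next_stat_sn += 1
        self.unacked[stat_sn] = status
        return stat_sn

    def on_ack(self, exp_stat_sn):
        """Release every status with StatSN below the acknowledged value."""
        freed = sorted(sn for sn in self.unacked if sn < exp_stat_sn)
        for sn in freed:
            del self.unacked[sn]
        return freed

    def resend(self, stat_sn):
        """Return the retained status, or None if it was already released."""
        return self.unacked.get(stat_sn)


retainer = StatusRetainer()
a = retainer.send_status("GOOD for A")
b = retainer.send_status("GOOD for B")
print(retainer.resend(a))   # still held, so it can be re-sent without re-execution
print(retainer.on_ack(2))   # initiator has seen StatSN 1; only that one is freed
print(retainer.resend(a))   # None: resources for A are gone
```

Releasing on the sequence-number acknowledgement rather than on transmission is what lets a lost status be re-sent instead of forcing the whole command to be retried.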
    
    
    > A header digest error is the same as a missing PDU except it is detected by
    > iSCSI, not TCP.  Because TCP has delivered the segment, it is possible for
    > the receiver to quickly notify the sender to resend the erroneous header.
    
    How? The initiator cannot request a re-send without specifying the I.T.T. and
    DataSN in the request [plus, optionally, CmdSN and T.T.T.]. With a header
    digest error, the I.T.T. in the PDU itself is untrustworthy. Recovery in such
    cases cannot safely be done at a PDU level and must be performed at a command
    level.
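
To illustrate why the tag cannot be trusted, here is a small sketch under loud assumptions: the 8-byte header layout (I.T.T. followed by DataSN) is a toy stand-in, not the real iSCSI header, and the digest is assumed to be CRC32C. The function names are hypothetical. A single flipped bit in the tag field yields a different, perfectly plausible-looking tag, and the digest only says that something in the header is wrong, not which field.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def build_pdu(itt: int, data_sn: int) -> bytes:
    """Toy header (assumed layout): 4-byte I.T.T., 4-byte DataSN, 4-byte digest."""
    header = struct.pack(">II", itt, data_sn)
    return header + struct.pack(">I", crc32c(header))


def parse_pdu(pdu: bytes):
    """Return (itt, data_sn, digest_ok); the fields are suspect if digest_ok is False."""
    header = pdu[:8]
    (digest,) = struct.unpack(">I", pdu[8:12])
    itt, data_sn = struct.unpack(">II", header)
    return itt, data_sn, crc32c(header) == digest


good = build_pdu(0x10, 5)
damaged = bytearray(good)
damaged[3] ^= 0x01               # flip one bit inside the I.T.T. field

print(parse_pdu(good))            # (16, 5, True)
print(parse_pdu(bytes(damaged)))  # (17, 5, False): the tag now names another task
```

A re-send request built from the damaged header would ask for data belonging to task 17, which is why recovery has to fall back to the command level.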
    
    
    > In conclusion, if CmdSN is enforced, a target must take the TCP transport
    > delivery sequentially whether there is one or more TCP connections because
    > the missing one could just be an ordered-queue or head-of-queue. For a
    > header digest error, a target can't proceed until it gets the missing header
    > as long as CmdSN is non-zero.
    
    The above is obvious, since the missing command leaves a hole in the CmdSN
    sequence, which causes the target to stall all further processing of CmdSNs
    behind the missing one.
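
The stall can be sketched as a small reordering queue at the target. This is an illustrative model only: `CmdSNSequencer` and its fields are hypothetical names, commands may arrive on any connection of the session, and 32-bit sequence wraparound is ignored.

```python
class CmdSNSequencer:
    """Hands commands to the SCSI layer strictly in CmdSN order."""

    def __init__(self, first_cmd_sn=1):
        self.exp_cmd_sn = first_cmd_sn  # next CmdSN that may be delivered
        self.pending = {}               # CmdSN -> command that arrived early

    def receive(self, cmd_sn, command):
        """Accept a command from any connection; return what is now deliverable."""
        if cmd_sn < self.exp_cmd_sn or cmd_sn in self.pending:
            return []                   # duplicate: ignored in this sketch
        self.pending[cmd_sn] = command
        released = []
        # Drain the queue only while there is no hole in the sequence.
        while self.exp_cmd_sn in self.pending:
            released.append(self.pending.pop(self.exp_cmd_sn))
            self.exp_cmd_sn += 1
        return released


seq = CmdSNSequencer()
print(seq.receive(1, "A"))  # ['A']
print(seq.receive(3, "C"))  # []  (CmdSN 2 is missing, so C must wait)
print(seq.receive(4, "D"))  # []  (still stalled behind the hole)
print(seq.receive(2, "B"))  # ['B', 'C', 'D']
```

Everything behind the hole waits, regardless of which TCP connection carried it, until the missing CmdSN shows up.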
    
    
    >  Since a target will always move to next
    > command as soon as it completes one with or without error, all application
    > software and initiators should know better not to send a SCSI request which
    > depends on the success of a previous request.
    
    This is correct in the case where the target does NOT implement data/status
    recovery. That is, assume ordering is required and two commands, say 1 and 2,
    are executed in order at the target. Now, if 1 encountered a digest error or
    format error at the initiator and was re-sent with the "retry" bit, AND the
    target did NOT implement data/status recovery, the target would end up
    executing 1, 2, 1. This may be a problem and cannot be addressed unless iSCSI
    mandates data/status recovery.
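
The difference between the two target behaviours can be sketched as below. The `Target` class, its `handle` method, and the status strings are hypothetical and exist only to show the execution order; a real target would also bound how long it retains status.

```python
class Target:
    """Toy target that optionally replays a saved status instead of re-executing."""

    def __init__(self, status_recovery):
        self.status_recovery = status_recovery
        self.execution_log = []  # order in which commands were actually executed
        self.saved_status = {}   # task tag -> status kept for replay

    def handle(self, tag, retry=False):
        if retry and self.status_recovery and tag in self.saved_status:
            return self.saved_status[tag]  # replay from buffers, no re-execution
        self.execution_log.append(tag)     # the command runs (again)
        status = f"GOOD({tag})"
        if self.status_recovery:
            self.saved_status[tag] = status
        return status


def run(status_recovery):
    target = Target(status_recovery)
    target.handle(1)
    target.handle(2)
    target.handle(1, retry=True)  # initiator hit a digest error on the response to 1
    return target.execution_log


print(run(status_recovery=False))  # [1, 2, 1]
print(run(status_recovery=True))   # [1, 2]
```

Without recovery the retry re-executes command 1 after command 2; with recovery the retry is answered from retained state and the original order stands.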
    
    
    
    --
    Santosh Rao
    Software Design Engineer
    Hewlett Packard, Cupertino (SISL)
    

