
    Re: iSCSI: DataRN



    
    
    some comments in text - Julo
    
    Santosh Rao <santoshr@cup.hp.com> on 22/01/2001 19:20:24
    
    Please respond to Santosh Rao <santoshr@cup.hp.com>
    
    To:   Black_David@emc.com
    cc:   santoshr@cup.hp.com, Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
    Subject:  Re: iSCSI: DataRN
    
    
    
    
    >
    > > The iSCSI draft should refrain from advocating any retry policies at
    > > the iSCSI layer, as is currently done for connection failures and
    > > digest errors.
    >
    > If the above had said "at the SCSI layer", I'd agree.  The issue being
    > addressed here is hiding failure of a connection in an iSCSI session from
    > SCSI (i.e., transparently recover at the iSCSI layer from failure of an
    > iSCSI connection). The following discussion of retry is correct for SCSI,
    > but doesn't apply to iSCSI because iSCSI will deliver the retried command
    > to the SCSI layer at the device at most once, and hence the described
    > problem caused by the command being executed at the device twice can't
    > happen.
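
The at-most-once delivery described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the draft's mechanism: the thread does not say how a target recognizes a retried command, so the `task_tag` key, the `TargetSession` class, and the `scsi_execute` callback are all hypothetical names.

```python
class TargetSession:
    """Hypothetical iSCSI target session that hands each command to SCSI once."""

    def __init__(self, scsi_execute):
        # scsi_execute is an assumed callback standing in for the SCSI layer.
        self._scsi_execute = scsi_execute
        # Assumption: a retried command carries the same task_tag as the original.
        self._results = {}

    def receive_command(self, task_tag, cdb):
        # A retry of an already delivered command is answered from the saved
        # result; it is not passed to the SCSI layer a second time.
        if task_tag in self._results:
            return self._results[task_tag]
        result = self._scsi_execute(cdb)
        self._results[task_tag] = result
        return result
```

With this shape, a retry after a lost response gets the saved status back, and the device never sees the command twice.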
    
    David,
    
    The above is correct if the digest error or connection failure occurred on
    delivery of the command. If a digest error were to be detected by an
    initiator on the response PDU (by which time the target has already
    completed the operation and the TCP layer at the initiator has already
    sent the ACK), then the command is complete from the device's perspective
    and should not be retried.
    
    <js> How would that happen? </js>
    
    Similarly, if the command had completed at the target and the Response PDU
    in transit was affected by a connection failure, retries should not be
    performed by the initiator.
    
    <js> That is basically what StatSN is there to help with. The only thing the
    target has to do is resend the status if the ExpStatSN is not advancing.
    If there is no new command coming to the target, the latter may want to
    enquire through a NOP before getting rid of the status. </js>
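
A minimal sketch of the StatSN bookkeeping described in the comment above, under stated assumptions: the `TargetConnection` class and its method names are hypothetical, "not advancing" is taken to mean that the same ExpStatSN arrives twice in a row, and sequence-number wraparound is ignored for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TargetConnection:
    """Hypothetical target-side record of statuses awaiting acknowledgement."""

    next_stat_sn: int = 1
    last_exp_stat_sn: Optional[int] = None
    unacked: dict = field(default_factory=dict)  # StatSN -> status

    def send_status(self, status):
        # Number the status and keep it until the initiator acknowledges it.
        stat_sn = self.next_stat_sn
        self.unacked[stat_sn] = status
        self.next_stat_sn += 1
        return stat_sn

    def on_exp_stat_sn(self, exp_stat_sn):
        # Everything below ExpStatSN has been received by the initiator.
        for sn in [s for s in self.unacked if s < exp_stat_sn]:
            del self.unacked[sn]
        # Assumption: a repeated ExpStatSN means the initiator is stalled,
        # so the statuses still held are candidates for resending.
        stalled = exp_stat_sn == self.last_exp_stat_sn
        self.last_exp_stat_sn = exp_stat_sn
        return sorted(self.unacked.items()) if stalled else []
```

The returned list is what the target would resend. When no new command arrives to carry an ExpStatSN, a NOP exchange would play the same role before the target discards a status.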
    
    Rather than distinguish these corner cases in its retry policies,
    iSCSI should refrain from advocating retries at its layer.
    Connection failures can be considered gross errors, and recovery can
    be performed at the SCSI ULP.
    
    <js> That argument was heard far in the past and a consensus was reached
    that recovery at the iSCSI level for many of the transport errors would be
    valuable.  If you read the draft carefully you will also see that
    implementations have a large degree of freedom in implementing recovery.
    The only open question, IMHO, is whether we want to define classes (or
    profiles) of recovery - and I would say that, given that the recovery is
    so basic, classes are not worth the effort. </js>
    
    <js> On basic issues it will also help the discussion if you get some
    "history context" from one of your colleagues or read some of the archived
    mail - or both </js>
    
    Regards,
    Santosh
    >
    > > I/O retry policies are decided by the SCSI ULP based on the class of
    > > the target (disk, tape, changer, etc.).
    > > For a tape class of device, the SCSI ULP may not wish to retry on an
    > > error [without a prior rewind operation]. In such situations, iSCSI
    > > attempting to retry on connection failures or digest errors can result
    > > in problems with sequential access type of media.
    >
    > OTOH, I'm sympathetic to the argument that it's up to an iSCSI
    > implementation to decide how aggressively to recover a failed connection -
    > notice that SCSI works quite happily over Fibre Channel where any
    > individual command can be dropped without losing the session, and there's
    > no retry (although FCP-2 has had to do some things for tape).  The
    > mechanisms needed for transparent recovery should be documented, and then
    > we can figure out if they are MUST, SHOULD, or MAY implement.  Getting
    > back to the original point - the reason for dropping DataRN is that the
    > gain from the optimization it provides for this sort of recovery
    > situation doesn't seem to justify the added complexity.
    >
    > --David
    > ---------------------------------------------------
    > David L. Black, Senior Technologist
    > EMC Corporation, 42 South St., Hopkinton, MA  01748
    > +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
    > black_david@emc.com       Mobile: +1 (978) 394-7754
    > ---------------------------------------------------
    >
    >
    
    
    --
    #################################
    Santosh Rao
    Software Design Engineer,
    HP, Cupertino.
    email : santoshr@cup.hp.com
    Phone : 408-447-3751
    #################################
    
    
    
    



Last updated: Tue Sep 04 01:05:46 2001