
    Re: iSCSI : Initiators expected to fake CHECK CONDITIONS.



    
    
    Santosh,
    
    OK - you have convinced me (and I've found the CAM3 list of statuses).  I
    will specify in 04 the relevant values for the service response in line
    with CAM3.
    
    Julo
    
    Santosh Rao <santoshr@cup.hp.com> on 15/01/2001 00:34:36
    
    Please respond to Santosh Rao <santoshr@cup.hp.com>
    
    To:   Julian Satran/Haifa/IBM@IBMIL
    cc:   ips@ece.cmu.edu
    Subject:  Re: iSCSI : Initiators expected to fake CHECK CONDITIONS.
    
    
    
    
    > It would be simple if things were that clear-cut.
    >
    > What is a format error coming from a target?  IMHO it is a target error
    > and not a protocol failure.
    
    Julian,
    
    A SCSI Protocol(i.e. iSCSI) header format error constitutes a
    "Service Delivery or Target Failure" service response, since the failure
    occurs in the service delivery sub-system (SCSI Protocol). SAM-2 explains
    this service response as "The command has ended due to a service delivery
    failure or target device malfunction", which should be the
    service response to be returned on a header format error.
    
    Further, SAM-2 Section 5 also states that the application client
    (i.e. the SCSI upper layer driver) shall treat the SCSI Status to be
    un-defined if the command ends with a service response of
    "Service Delivery or Target Failure".
    This implies that the SCSI Upper Layer driver shall ignore the
    CHECK CONDITION SCSI status that iSCSI initiators are going to fake to
    their upper layers on a header format error, due to a service response of
    "Service Delivery or Target Failure".
    
    So, is iSCSI proposing to return a service response of "Task Complete" to
    its upper layers on a header format error? If not, and the intent is
    that a service response of "Service Delivery or Target Failure"
    should be returned, then the upper layer SCSI driver is going to ignore
    the CHECK CONDITION that iSCSI intends to fake.
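
    The point above can be made concrete with a small sketch of how an upper
    layer driver might consume the result of Execute Command. This is only an
    illustration under stated assumptions: the names CommandResult and
    upper_layer_action and the returned action strings are hypothetical and
    come from no real SCSI stack. Only the three service responses and the
    status byte values (GOOD = 0x00, CHECK CONDITION = 0x02) come from the
    SCSI standards.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ServiceResponse(Enum):
    """The three service responses SAM-2 defines for Execute Command."""
    TASK_COMPLETE = auto()
    LINKED_COMMAND_COMPLETE = auto()
    SERVICE_DELIVERY_OR_TARGET_FAILURE = auto()


GOOD = 0x00             # SCSI status byte: GOOD
CHECK_CONDITION = 0x02  # SCSI status byte: CHECK CONDITION


@dataclass
class CommandResult:
    """Hypothetical container for what the transport hands to the ULP."""
    service_response: ServiceResponse
    status: Optional[int] = None
    sense: Optional[bytes] = None


def upper_layer_action(result: CommandResult) -> str:
    """Decide what the upper layer does next (action names are made up)."""
    if result.service_response is ServiceResponse.SERVICE_DELIVERY_OR_TARGET_FAILURE:
        # The status is undefined here, so neither the status byte nor the
        # sense data is inspected, even if the transport filled them in.
        return "transport-failure"
    if result.status == CHECK_CONDITION:
        return "examine-sense"
    return "done"
```

    With this shape, a CHECK CONDITION faked by the initiator alongside a
    failure service response never reaches the sense-handling path, which is
    the conflict described above.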
    
    It would be interesting to hear what T10 has to say on this approach that
    iSCSI is currently proposing to adopt on a header format error.
    
    I see the following problems with the current approach being advocated :
    ------------------------------------------------------------------------
    
    1) It does NOT solve the problem for PDUs other than the SCSI Command
    PDU. I have yet to see a response to my questions on how header format
    errors for Login, Logout, Text, NOP-OUT, NOP-IN and SCSI Task Management
    Command PDUs are handled by the current proposal described in section 5.4
    of the 03 iSCSI draft.
    
    2) It is not in line with [violates ?] Section 5 of SAM-2.
    
    3) It can cause upper layers to initiate Auto Contingent Allegiance
    recovery such as CLEAR ACA due to the CHECK CONDITION returned.
    
    4) It can cause upper layers to over-react on seeing a HARDWARE ERROR by
    resorting to error recovery such as BDR to recover from a HARDWARE ERROR.
    
    5) It adds extra complexity to the iSCSI initiator drivers by making them
    SPC-2 aware and having to generate sense data on behalf of a target,
    not something that has been required in other SCSI Protocols such as
    parallel scsi and fibre channel.
    
    6) It causes a misleading error log of a
    HARDWARE ERROR, where a more specific error log of the
    header format error that occurred, based on the response data, would
    have been more useful for quick fault isolation.
    
    7) Last but not least, it is a violation of layering between the
    ULP and LLP, wherein an LLP Service Delivery Failure is being treated as
    a ULP SCSI error returned by the LUN.
    
    I would like to propose the following solution :
    ------------------------------------------------
    1) On discovery of header format error, a target MUST convey the
    specific type of format error that was discovered through use of
    mechanisms like the Response Data in a SCSI Response PDU,
    Response Field in a SCSI Task Management Response PDU
    and the Reason Code field in the REJECT PDU.
    
    2) On a header format error discovered at the initiator,
    a service response of "Service Delivery or Target Failure" MUST be
    returned to its upper layers [and the initiator may log the
    specific header format error that was discovered]. The draft need not
    attempt to elaborate on these service response definitions since these are
    defined by the SCSI stack of each O.S. as a set of return values exchanged
    between the LLP and the ULP.
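
    A minimal sketch of point 2, assuming a Python-like initiator: the
    specific format error is logged, a failure service response is returned,
    and no status byte or sense data is synthesized. The names PduType and
    on_header_format_error, the logger name and the free-form detail string
    are all hypothetical. How a real stack completes non-command PDUs such as
    Login or NOP-IN is left open here.

```python
import logging
from enum import Enum, auto
from typing import Optional, Tuple

log = logging.getLogger("iscsi.initiator")  # hypothetical logger name


class ServiceResponse(Enum):
    """The three service responses SAM-2 defines for Execute Command."""
    TASK_COMPLETE = auto()
    LINKED_COMMAND_COMPLETE = auto()
    SERVICE_DELIVERY_OR_TARGET_FAILURE = auto()


class PduType(Enum):
    """PDU kinds named in this thread; not an exhaustive iSCSI list."""
    SCSI_COMMAND = auto()
    TASK_MANAGEMENT = auto()
    LOGIN = auto()
    LOGOUT = auto()
    TEXT = auto()
    NOP_OUT = auto()
    NOP_IN = auto()


def on_header_format_error(
    pdu_type: PduType, detail: str
) -> Tuple[ServiceResponse, Optional[int], Optional[bytes]]:
    """Handle a header format error the same way for every PDU type."""
    # Log the specific error so fault isolation does not depend on a
    # generic HARDWARE ERROR sense key.
    log.error("header format error on %s PDU: %s", pdu_type.name, detail)
    # No SCSI status and no sense data: the protocol layer carries those
    # on behalf of the logical unit and does not create them.
    return ServiceResponse.SERVICE_DELIVERY_OR_TARGET_FAILURE, None, None
```

    The same path serves every PDU type, which is what lets the proposal
    cover Login, Logout, Text and the NOP PDUs as well as SCSI commands.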
    
    Benefits of the proposed solution :
    -----------------------------------
    1) It addresses the problem for header format errors on ALL types of PDUs.
    
    2) It is in line with [complies with ?] SAM-2 semantics of "Execute
    Command".
    
    3) It provides more value-added error logging based on the specific header
    format error that occurred rather than a more general HARDWARE ERROR,
    allowing for quicker isolation and root cause of the problem.
    
    4) It avoids ULPs resorting to inappropriate error recovery such as CLEAR
    ACA or a BDR on seeing CHECK CONDITION with HARDWARE ERROR.
    
    5) It retains layering semantics and differentiates between LLP service
    delivery errors and ULP SCSI errors.
    
    I'd be happy to learn what implications of this issue I've missed.
    
    Thanks & Regards,
    Santosh Rao
    
    
    
    
    
    
    > Should target errors be reported in the service-response?   Except for task
    > management that is common to all protocols I did not see any other thing
    > popping up in any SCSI driver.
    >
    > Preaching layering won't make the issue disappear.
    >
    > Julo
    >
    > Stephen Bailey <steph@cs.uchicago.edu> on 14/01/2001 17:10:56
    >
    > Please respond to Stephen Bailey <steph@cs.uchicago.edu>
    >
    > To:   ips@ece.cmu.edu
    > cc:
    > Subject:  Re: iSCSI : Initiators expected to fake CHECK CONDITIONS.
    >
    >
    >
    >
    > Julian,
    >
    > This seems like the zillionth time we've aired this same disagreement.
    >
    > I think we should try to reach a WG consensus on whether iSCSI should
    > use SCSI status as a means for reporting protocol-related errors, and
    > kill it once and for all.
    >
    > I'm strongly against it.
    >
    > > And an error in the iSCSI layer gets reported by the next layer - that is
    > > the regular layering technique (and BTW I am getting a bit uneasy about all
    > > this preaching on layering when it is not obvious that you understand all
    > > the implications of the point).
    >
    > Santosh seems to understand 100%.  I agree with Santosh.
    >
    > SAM defines three pieces of status returned by Execute Command()
    > (which is, in turn implemented by each SCSI protocol, e.g. iSCSI):
    >   1) Service Response: task complete, linked command complete, service
    >      delivery or target failure.
    >   2) SCSI status byte
    >   3) SCSI sense data
    >
    > SAM clearly suggests that a protocol's means for signalling
    > protocol-detected errors is the service response status when it says:
    >
    >   The actual protocol events corresponding to a response of TASK
    >   COMPLETE, LINKED COMMAND COMPLETE or SERVICE DELIVERY OR TARGET
    >   FAILURE shall be specified in each protocol standard.
    >
    > Note that defining protocol error events in terms of TC, LCC, SDF and
    > TF, remains an abstraction.  An actual implementation can choose to
    > signal these events by whatever means.  All SCSI implementations I've
    > seen do have a status return component which exactly corresponds to
    > SAM's service response status, but has more than just these four
    > alternatives.  Typically, that includes things like success, command
    > timeout, addressing failure (selection timeout in ||SCSI, bad AL_PA in
    > FCAL, etc.), command aborted, bus parity error, etc..  What iSCSI does
    > need to do is clearly define its error events AS protocol events,
    > which is what describing them in terms of the SAM specified set does.
    >
    > ||SCSI, FCP and the other SCSI protocol standards do this.
    >
    > Operationally, this means that in iSCSI:
    >
    >   o protocol-specific errors for a task detected by the target without
    >     a CLEARLY corresponding SCSI error return should be signalled
    >     using the iSCSI response mechanism.
    >
    >     The initiator will handle these errors by recording them in some
    >     appropriate way, and selecting an appropriate service response
    >     status value.
    >
    >   o errors for a task detected directly by the initiator are handled
    >     by recording them in some appropriate way, and selecting an
    >     appropriate service response status value.
    >
    > I don't know that there's a single sentence or section in SAM which
    > says this, but it clearly implies that the components of SCSI status
    > (status byte and sense) are data which are CARRIED (not created) by
    > SCSI protocols, for the use of the protocol-independent components.
    > For example, a disk peripheral driver reacts to SCSI status and sense
    > generated by disks for SCSI operations that it starts.
    >
    > SCSI status should be equivalent to the SCSI status returned by the
    > logical unit.
    >
    > SCSI protocols are not citizens of the SCSI status space, which means
    > that the real citizens (the command standards, and SAM), may define
    > semantics of SCSI error codes which conflict with iSCSI's selections.
    > The `hardware error' sense key might be defined to mean something very
    > specific within a particular SCSI peripheral command set, and your
    > choice of synthesizing SCSI status within the protocol could cause
    > this behavior to misfire.
    >
    > > [js] the error was generated by a faulty controller and I did not find
    > > any other SCSI sense fit for it[/js]
    >
    > That's because there IS no SCSI sense fit for it.  It is a
    > protocol-detected, protocol-unique error.  Therefore it should be
    > signalled using the service response status.
    >
    > Steph
    >
    >
    >
    >
    
    
    --
    #################################
    Santosh Rao
    Software Design Engineer,
    HP, Cupertino.
    email : santoshr@cup.hp.com
    Phone : 408-447-3751
    #################################
    
    
    
    



Last updated: Tue Sep 04 01:05:51 2001
6315 messages in chronological order