    Re: Comments on draft-satran-iscsi-01.txt



    
    
    
    Mallikarjun and others,
    
    I would like to voice my support for a logout mechanism.  The draft
    advises us to use TCP FINs, but a standard mechanism to close
    channels/sessions would be preferable (generating TCP FINs
    is non-trivial in some programming environments).  In our experience,
    the logout mechanism has been very useful in Fibre Channel, and it
    would be a good resource deallocation/error recovery mechanism in iSCSI.
    
    I had suggested a logout mechanism for iSCSI in the early stages
    of the draft (Feb timeframe), but for some reason it was not accepted.
    Perhaps this is a good time for a discussion on the issue.
    
    Prasenjit
    
       Prasenjit Sarkar
       Research Staff Member
       IBM Almaden Research
       San Jose
    
    
    "Mallikarjun C." <cbm@rose.hp.com>@ece.cmu.edu on 07/27/2000 11:23:51 AM
    
    Please respond to cbm@rose.hp.com
    
    Sent by:  owner-ips@ece.cmu.edu
    
    
    To:   ips@ece.cmu.edu
    cc:   randy_haagens@hp.com
    Subject:  Comments on draft-satran-iscsi-01.txt
    
    
    
    All:
    
    Please find enclosed my comments on the latest draft of iSCSI
    (draft-satran-iscsi-01.txt).  The first section has comments
    and the second section has additional enhancements I am requesting.
    
    Thanks.
    --
    Mallikarjun
    cbm@rose.hp.com
    Networked Storage Architecture
    HP Storage Organization, Roseville.
    
    
    
    
    
    Comments on draft-satran-iscsi-01.txt
    -------------------------------------
    
    o Section 2.1, page 4. Paraphrases SAM-2 by defining a task as "a linked
      set of SCSI commands".  Suggest rewording as "a SCSI command, or possibly
      a linked set of SCSI commands".  Wherever applicable, the iSCSI draft
      should also cite the corresponding SAM-2 clause numbers.
    
    o Section 2.2.2 on page 5. First sentence states "iSCSI supports ordered
      command delivery within a session.".  While I realize that ordered
      completion is not required by the iSCSI spec, I suggest rewording as
      "iSCSI supports ordered task initiation and completion within a
      session".
    
    o I have a general question about all iSCSI PDUs from the initiator to the
      target carrying the CmdRN.  Section 3.10.5 seems to leave it as an
      implementation choice, but why is it necessary?  Once we stipulated
      task allegiance to a TCP connection and given that the SCSI Data PDU
      has the `Buffer Offset' field in it, why would the CmdRN be necessary
      for iSCSI Data PDUs from the initiator to the target?
    
    o Section 2.2.2 on page 5.  Towards the end of the second para, "The target
      will reject any command outside this range or...." should be reworded
      to make it obvious as "The target will ignore any command outside this
      range or...".   We do not want network bandwidth to be consumed for
      something that cannot be valid.
    
    o Section 2.2.2 on page 5.  I suggest that the spec should spell out all
      the conditions under which the CmdRN is reset back to the initial value -
      target reset (currently forces session termination, but even if it
      doesn't in future), LUN reset, and re-establishment of session.
    
    o Section 2.2.2 on page 6, second para. There's a sentence - "iSCSI
      initiators are required to implement the numbering scheme if they
      support more than one connection."  Suggest adding "per session" at the
      end of the sentence, as per current status of the spec.  If and when
      session recovery mechanisms are defined, this statement may have to be
      modified to perhaps even mandate numbering with one TCP connection, and
      across sessions.
    
    o Section 2.2.2 on page 6, third para -  "iSCSI targets are not required to
      use the numbering scheme for ordered delivery".  I take it that the iSCSI
      protocol layer which provides the service delivery port abstraction to
      the device server is not required to deliver commands in the CmdRN order.
      It appears though as if the target iSCSI layer shall always keep track of
      the ExpCmdRN.  This should be stated explicitly.  Also, it appears
      that the StatRN may not be valid coming from the targets which do not
      re-order received commands by CmdRN.  It would help to state this as
      well.
    
    o This capability of a target to enforce the numbering scheme in full is
      something the initiators would be better off knowing.  I suggest adding
      a Login/Text key for this.
                EnforceOrdering: <yes | no>
                --> EnforceOrdering: <yes | no>
    
    o Section 2.2.4 on page 7, first para.  Contains a discussion about
      connection allegiance, which is limited to commands.  I suggest tasks
      be used instead.  I would also propose that the abort task management
      request have the same connection allegiance as the original command.
    
    o Section 3.2.1 on page 14.  Autosense is made optional through a bit
      setting in the SCSI Command PDU.  Why can't it be made mandatory?
      That would make life a lot easier for FC-iSCSI bridges,
      since FC mandates it.  Also, section 5.3 discusses Autosense from the
      device perspective, but stops short of saying that Autosense is mandatory
      for all iSCSI targets.  Are there any device issues complicating the
      situation here?  It appears that compliance with FC has been fairly
      in place.
    
    o Section 3.3.4 on page 17.  Suggest adding an additional iSCSI Status
      value in the SCSI Response PDU -  "2  Non-existent iSCSI session".
      This shall be returned if a SCSI command is attempted without an
      established iSCSI login session.  Returning a Logout PDU is another
      option (see the enhancement proposal below).
    
    o Section 3.3.7 on page 17.  It appears to be in need of more definition
      about Response data.  I propose the following response data values -
          - RTT-related: the data sent did not match the burst size, or the
                   offset allowed in the RTT message.
    
          - SCSI Command format: invalid command format.
    
    o Section 3.3.7 on page 17.  It appears that there's a violation of
      SAM-2 transport and application protocol layering here.  The discussion
      allows certain iSCSI (transport) errors to be indicated when the
      Command Status (SCSI application protocol) is CHECK CONDITION.  It
      would seem that all such transport errors should be able to be flagged
      with a non-zero iSCSI Status alone.  The initiators would in that case
      be expected to look at the iSCSI Status first, and proceed further only
      if it is zero.
    
    o Sections 3.4 and 3.5 on pages 19 and 20.  Suggest adding one sentence
      each to NOP-OUT and NOP-IN sections to explicitly state the direction
      of flow of the PDU.
    
    o Section 3.7.1 on page 23.  For the Target Reset task management
      function, the target is not expected to provide a response - and
      this is concerning to me.
           - how would the initiator confirm the successful completion of
             the target reset?

           - not having a response effectively makes the target reset
             an operation iSCSI hardware cannot assist.  Having a response
             would make it no different from the rest of the iSCSI transactions
             and the hardware can gracefully deal with it.

           - SAM-2 (section 6.6, page 63) specifies what a target should
             do "Before returning a FUNCTION COMPLETE response".  This seems
             to me as an implicit requirement that a response be returned
             on a Target Reset task management function.

           - I am also unclear as to why the sessions are allowed to be
             terminated.  In the FC world, as far as I can recall, the
             process login sessions remain intact after a target reset.
             If it is required that the sessions and the associated TCP
             connections be cleared in iSCSI, it is helpful to mandate (as
             opposed to leaving it up to the implementation) that an Async
             event shall be reported to all the initiators currently logged in.
    
    o Section 3.8 on page 25.  Suggest additional SCSI Task Management Response
      indications in addition to the two defined.
           2   Function Invalid (the function is invalid as per current rev)
           3   Function Unsupported (valid, but implementation doesn't support)
    
      Also suggest adding iSCSI Status and Response data fields to the PDU.
           - iSCSI Status can take three values: success, non-existent
             LUN, and non-existent iSCSI session.
           - Response data can take one value: Invalid message format.
    
    o Section 3.10.2 on page 29.  The Transfer Tag is the Initiator Task
      Tag in SCSI Data PDU from target to initiator.  Why then are both
      shown in the payload diagram?  If this is done to retain some resemblance
      between the two types of SCSI Data PDUs, I would suggest that ideally
      there be one type of SCSI Data PDU for both READ and WRITE - with
      certain fields in the PDU to be ignored in each case.  This makes it
      easier on hardware and software implementations.
    
    o Section 3.11 on page 31.  Suggest adding statement "The capabilities
      exchanged and operations performed are valid for the entire login
      session including all TCP connections for that session."  This settles
      questions such as whether authentication should be performed on every
      new connection or only on the leading connection.
    
    o Since Text key pairs are used as part of Login process and outside,
      I advocate that they be named thus - "Text key pairs".  This terminology
      can then be used consistently throughout.  The current usage has "Text
      Command format" (in section 3.13), and "Login/Text keys" elsewhere
      (section 10).
    
    o I suggest that Login dialogue should also be able to identify the
      alternate names under which the same target (task manager and the
      device servers, albeit possibly with different target "id"s) is
      available.  This would mean including more Text key pairs in Appendix B.
      A possible format is -
                OtherNames: <Descriptor type,Descriptor value>
    
                where the Descriptor type is as defined in section 3.17.1
                and the Descriptor value is based on the given type.
    
    o Section 3.14 on page 39.  The first paragraph allows a target to
      respond to a Login Command with a Login Response and an unsolicited
      Text Response PDU.  I suggest that there be a "Login reply" bit in
      the Text Response PDU to indicate to the initiator that the PDU is
      not in response to a Text Command, but as a reply to a login proposal.
    
    o Section 3.14.1 on page 39.  States that the "InitStatRN is significant
      only if TSID is 0".  I am somewhat confused by this.  It appears that
      it should be non-zero.  I am assuming that the TSID is non-zero in the
      first Login Response on a connection for a given session (leading
      connection), and is zero in all subsequent Login Responses on other
      connections of the same session.  If this were true, only the leading
      connection's login dialogue should specify InitStatRN.
      [ I see now that Mike from IBM already pointed this out, but am keeping
        this to confirm the expected target behavior that I described. ]
    
    o Section 3.18 on page 46.  Suggest adding comments to explicitly state
      that target should reject (with a non-zero Response) the Map Command
      when a map (or unmap) of a particular TAN (or SRA) fails out of a set
      of descriptors.  Essentially, either all descriptors succeed, or all
      end up in a failure.  [ I see that Bill Main also had a comment on the
      same issue. ]
    
    o Section 4.1 on page 49.  Second paragraph from the bottom. "if they
      are not acknowledged yet or a new CmdRN if they where acknowledged;"
      should be "if they are not acknowledged yet or a new CmdRN if they
      were acknowledged".
    
    o Section 9.2 on page 59.  The Write operation example depicts multiple
      SCSI Data PDUs being shipped to the target in response to one RTT PDU.
      This effectively puts a requirement on the target implementations to
      keep the target transfer tag valid until the expected data size is
      received.  It would be helpful to explicitly state this in section 3.9.
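
The comment on Section 3.3.7 above, that initiators should look at the
iSCSI Status first and proceed to the SCSI Command Status only if it is
zero, could be sketched roughly as follows.  This is only an illustration
in Python: the class, field, and function names are hypothetical and not
taken from the draft.  The only assumptions are that a zero iSCSI Status
means "no transport error" and that the SCSI status codes are the usual
SAM values (GOOD = 0x00, CHECK CONDITION = 0x02).

```python
from dataclasses import dataclass

ISCSI_STATUS_GOOD = 0            # assumption: zero means no transport error
SCSI_GOOD = 0x00                 # SAM status code for GOOD
SCSI_CHECK_CONDITION = 0x02      # SAM status code for CHECK CONDITION


@dataclass
class ScsiResponse:
    """Hypothetical, simplified view of a SCSI Response PDU."""
    iscsi_status: int
    command_status: int


def classify_response(rsp: ScsiResponse) -> str:
    """Check the transport layer first, then the SCSI application layer."""
    if rsp.iscsi_status != ISCSI_STATUS_GOOD:
        # Transport error: the Command Status is not examined at all.
        return "transport-error"
    if rsp.command_status == SCSI_CHECK_CONDITION:
        return "check-condition"
    if rsp.command_status == SCSI_GOOD:
        return "good"
    return "other-scsi-status"
```

With this ordering a transport error never has to be inferred from a
CHECK CONDITION, which keeps the two layers separate.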
    
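The all-or-nothing behavior requested for the Map Command (Section 3.18
comment above) might look something like the sketch below.  It treats TAN
and SRA as opaque values, and the failure conditions (mapping a TAN that
is already mapped, unmapping one that is not) are assumptions chosen only
to make the example concrete; the draft may define failure differently.
All names here are hypothetical.

```python
def apply_map_command(table, descriptors):
    """Apply every descriptor or none of them.

    table:       dict of TAN -> SRA (current mappings)
    descriptors: list of (op, tan, sra) with op "map" or "unmap"
    Returns (response, table); response is 0 on success, non-zero on failure.
    """
    staged = dict(table)  # work on a copy so a failure leaves no side effects
    for op, tan, sra in descriptors:
        if op == "map":
            if tan in staged:          # assumed failure: already mapped
                return 1, table
            staged[tan] = sra
        elif op == "unmap":
            if tan not in staged:      # assumed failure: nothing to unmap
                return 1, table
            del staged[tan]
        else:
            return 1, table            # unknown descriptor type
    return 0, staged                   # commit only when all succeeded
```

Because the work is done on a copy, a failure on the last descriptor
leaves the original table exactly as it was.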
    
    New proposals for enhancements
    ------------------------------
    
    o I propose a new opcode to do Third party Logout - to provide a service
      with (almost) the same name in Fibre Channel (Third Party Process Logout,
      TPRLO).  This is issued by an initiator, and it requests the target to
      log out all third party initiators that are iSCSI logged in with the
      given target.  This should be effective on all the initiators having
      iSCSI access to the same set of task manager and device servers.  This
      feature allows one host (initiator) to ensure that there are no other
      hosts talking to a target device, in failover configurations.  In order
      to ensure that this is not maliciously used by rogue hosts, an iSCSI
      target may selectively allow initiators to do TPRLO, with a
      TPRLO-password to be specified in the TPRLO command PDU, this password
      having been communicated in the login dialogue.
    
    o I propose adding a set of two new iSCSI opcodes for "Logout" and "Logout
      Response".
                - Logout and Logout Response enable the dual roles of initiator
                  and target to be independently played by SCSI devices.
                - these can also be used as an error recovery mechanism by a
                  target, forcing the initiator to re-login.
                - these also enable multiple sessions to be operational across
                  a pair of SCSI devices (if and when we want it).
    
      This Logout is _across_ all channels associated with the iSCSI Session,
      and a graceful connection termination using TCP FINs for all the
      individual TCP connections (other than the one Logout is delivered on)
      is recommended before this Logout command.
    
    o Irrespective of the exact process to handle an error, I propose that
      an upper bound be specified on the time that a SCSI device (initiator
      or target) should wait before freeing up resources allocated for a SCSI
      task (and thus assume the implicit termination of the SCSI task).  This
      timeout shall only be used in the case of a continued failure to
      re-establish the Login session for the said period.  I know that this is
      a passionate topic, but even a ridiculously high upper bound (say 4
      hours for a task), varying by device class (disk, tape ..), is better
      than no upper bound.  This is the only architected way for certain hosts
      (say, rebooted on average once a year) to recover task resources like
      Tags.  Note that this is not an attempt to impose new timers on an iSCSI
      implementation; it only requires that the resources be set aside for
      this much time.  One could envision an implementation which would
      reallocate resources only as needed with no timers, so the resources
      can be set aside far longer than the iSCSI spec requires.
    
    o I propose that a new iSCSI PDU "SCSI Conf" (analogous to FCP_CONF of FC)
      be defined as a payload sent from an initiator to a target.  This informs
      the target that the initiator received the SCSI Response message on that
      Initiator Task Tag.  A target should wait to execute the subsequent
      commands on an error until this SCSI Confirmation PDU is received.  This
      handshake "allows subsequent queued stateful operations to be performed"
      (taken out of FCP-2 spec).  In the context of tapes/asynch mirroring,
      this preserves ordering/coherency since the target device stalls for the
      SCSI Conf from the initiator.  To avoid unnecessary overhead, a target
      would request the SCSI Conf message in the SCSI Response message only on
      a SCSI task ending in an error.  SCSI Conf should also play by the rules
      of the connection allegiance.
    
    o I suggest the following new Login/Text keys in section 10 -
    
             - Following are for capabilities.
                 InitiatorCapability: <yes|no>
                 TargetCapability: <yes|no>
    
             - Following for protocol revision.
                 iSCSIRevision: <X.Y decimal>
    
               The responder to a Login command may choose to propose an
               equal or a lower rev than proposed in the Login command payload.
               If the counter-proposal in the response is not acceptable,
               the sender of Login command should immediately log out.  If the
               responder can only support higher revs, login is rejected.
    
             - Following, subsequent to the SCSI Conf enhancement request.
                 SCSIConfSupport: <yes | no>
                 --> SCSIConfSupport: <yes | no>
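
The "no timers" reading of the resource upper bound proposed above (set
task resources aside for at least the bound, reallocate only when needed)
could be sketched as below.  This is a rough illustration, not a proposed
implementation: the class and method names are hypothetical, the 4 hour
figure is just the example value from the proposal, and time is passed in
explicitly so that no timer is involved.

```python
class OrphanedTagPool:
    """Tracks task tags whose Login session has been lost."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.lost_since = {}  # tag -> time at which the session was lost

    def mark_orphaned(self, tag, now):
        # Keep the earliest loss time if the tag is reported again.
        self.lost_since.setdefault(tag, now)

    def session_restored(self):
        # Re-login succeeded within the bound: nothing may be reclaimed.
        self.lost_since.clear()

    def reclaim(self, now):
        """Called only when resources are needed; returns tags freed."""
        expired = sorted(
            tag for tag, since in self.lost_since.items()
            if now - since >= self.retention
        )
        for tag in expired:
            del self.lost_since[tag]
        return expired


pool = OrphanedTagPool(retention_seconds=4 * 3600)
pool.mark_orphaned(tag=7, now=0)
```

Since reclaim() runs only on demand, a tag may in practice stay reserved
far longer than the bound, which the proposal explicitly allows.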
    
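The iSCSIRevision rule described above (the responder may counter with an
equal or lower rev, and rejects the login if it can only support higher
revs) might be sketched as follows.  The function names are hypothetical,
and the sketch assumes that "X.Y" is compared as a pair of integers
(major, minor), which the proposal does not spell out.

```python
def parse_rev(text):
    """Turn "X.Y" into an (X, Y) tuple of integers for comparison."""
    major, minor = text.split(".")
    return int(major), int(minor)


def respond_to_revision(proposed, supported):
    """Responder side: highest supported rev not above the proposal.

    Returns the counter-proposal string, or None to reject the login
    when only higher revs are supported.
    """
    limit = parse_rev(proposed)
    candidates = [rev for rev in supported if parse_rev(rev) <= limit]
    if not candidates:
        return None
    return max(candidates, key=parse_rev)


def initiator_accepts(counter, acceptable):
    """Initiator side: if this is False, log out immediately."""
    return counter in acceptable
```

For example, a responder supporting 1.0, 1.1 and 2.0 that receives a
proposal of 1.2 would answer 1.1, and the initiator then decides whether
1.1 is acceptable or logs out.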
    
    
    

