
    Comments on draft-satran-iscsi-01.txt



    All:
    
    Please find enclosed my comments on the latest draft of iSCSI
    (draft-satran-iscsi-01.txt).  The first section has comments 
    and the second section has additional enhancements I am requesting.
    
    Thanks.
    --
    Mallikarjun 
    cbm@rose.hp.com
    Networked Storage Architecture
    HP Storage Organization, Roseville.
    
    
    
    
    
    Comments on draft-satran-iscsi-01.txt
    -------------------------------------
    
    o Section 2.1, page 4. Paraphrases SAM-2 by defining a task as "a linked 
      set of SCSI commands".  Suggest rewording as "a SCSI command, or possibly 
      a linked set of SCSI commands".  Wherever applicable, the iSCSI draft should 
      also cite the corresponding SAM-2 clause numbers.
    
    o Section 2.2.2 on page 5. First sentence states "iSCSI supports ordered 
      command delivery within a session.".  While I realize that ordered completion 
      is not required by the iSCSI spec, I suggest rewording as "iSCSI supports 
      ordered task initiation and completion within a session".
    
    o I have a general question about all iSCSI PDUs from the initiator to the
      target carrying the CmdRN.  Section 3.10.5 seems to leave it as an 
      implementation choice, but why is it necessary?  Once we stipulated 
      task allegiance to a TCP connection and given that the SCSI Data PDU 
      has the `Buffer Offset' field in it, why would the CmdRN be necessary
      for iSCSI Data PDUs from the initiator to the target?  
    
    o Section 2.2.2 on page 5.  Towards the end of the second para, "The target
      will reject any command outside this range or...." should be reworded 
      to make it obvious as "The target will ignore any command outside this 
      range or...".   We do not want network bandwidth to be consumed for 
      something that cannot be valid.
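
    As an illustration only (not taken from the draft), the window check on
    the target could look like the sketch below.  The 32-bit wraparound and
    the names in_window and on_command are assumptions made for the example.

```python
MOD = 1 << 32  # assumption: CmdRN is a 32-bit counter that wraps around


def in_window(cmd_rn: int, exp_cmd_rn: int, max_cmd_rn: int) -> bool:
    """True if cmd_rn lies in [exp_cmd_rn, max_cmd_rn], allowing wraparound."""
    span = (max_cmd_rn - exp_cmd_rn) % MOD
    return (cmd_rn - exp_cmd_rn) % MOD <= span


def on_command(cmd_rn: int, exp_cmd_rn: int, max_cmd_rn: int) -> str:
    """Return "accept", or "ignore" for a command outside the window.

    "ignore" means the command is dropped silently: no reject PDU is sent,
    so no bandwidth is spent on something that cannot be valid.
    """
    return "accept" if in_window(cmd_rn, exp_cmd_rn, max_cmd_rn) else "ignore"
```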
    
    o Section 2.2.2 on page 5.  I suggest that the spec should spell out all 
      the conditions under which the CmdRN is reset back to the initial value -
      target reset (currently forces session termination, but even if it doesn't
      in future), LUN reset, and re-establishment of session.  
    
    o Section 2.2.2 on page 6, second para. There's a sentence - "iSCSI initiators 
      are required to implement the numbering scheme if they support more than 
      one connection."  Suggest adding "per session" at the end of the sentence,
      as per current status of the spec.  If and when session recovery mechanisms 
      are defined, this statement may have to be modified to perhaps even mandate 
      numbering with one TCP connection, and across sessions.
    
    o Section 2.2.2 on page 6, third para -  "iSCSI targets are not required to 
      use the numbering scheme for ordered delivery".  I take it that the iSCSI
      protocol layer which provides the service delivery port abstraction to 
      the device server is not required to deliver commands in the CmdRN order.
      It appears though as if the target iSCSI layer shall always keep track of 
      the ExpCmdRN.  This should be stated explicitly.  Also, it appears
      that the StatRN may not be valid coming from the targets which do not
      re-order received commands by CmdRN.  It would help to state this as well.
    
    o This capability of a target to enforce the numbering scheme in full is 
      something the initiators would be better off knowing.  I suggest adding 
      a Login/Text key for this.
    		  EnforceOrdering: <yes | no>
    		  --> EnforceOrdering: <yes | no>
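
    A minimal sketch of how an initiator might read such a key out of a
    login response follows.  The "Key: value" line format is assumed from
    the notation used in this mail, and both function names are hypothetical.

```python
def parse_text_keys(payload: str) -> dict:
    """Parse 'Key: value' lines into a dict (line format is an assumption)."""
    keys = {}
    for line in payload.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)
        keys[key.strip()] = value.strip()
    return keys


def target_enforces_ordering(response_keys: dict) -> bool:
    """Treat a missing key as 'no': the initiator cannot rely on ordering."""
    return response_keys.get("EnforceOrdering", "no").lower() == "yes"
```

    For example, target_enforces_ordering(parse_text_keys("EnforceOrdering: yes"))
    evaluates to True, while an empty response evaluates to False.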
    
    o Section 2.2.4 on page 7, first para.  Contains a discussion about 
      connection allegiance, which is limited to commands.  I suggest tasks be
      used instead.  I would also propose that the abort task management 
      request should also have the same connection allegiance as the original 
      command.
    
    o Section 3.2.1 on page 14.  Autosense is made optional through a bit
      setting in the SCSI Command PDU.  Why can't it be made mandatory?  
      That would make life a lot easier for the FC-iSCSI bridges,
      since FC mandates it.  Also, section 5.3 discusses Autosense from the
      device perspective, but stops short of saying that Autosense is mandatory
      for all iSCSI targets.  Are there any device issues complicating the
      situation here?  Compliance with the FC requirement appears to be fairly
      widespread.
    
    o Section 3.3.4 on page 17.  Suggest adding an additional iSCSI Status 
      value in the SCSI Response PDU -  "2  Non-existent iSCSI session".  
      This shall be returned if a SCSI command were attempted without an 
      established iSCSI login session.  Returning a Logout PDU is another
      option (see the enhancement proposal below).
    
    o Section 3.3.7 on page 17.  It appears to be in need of more definition 
      about Response data.  I propose the following response data values -
           - RTT-related: the data sent did not match the burst size, or the
             offset allowed in the RTT message.

           - SCSI Command format: invalid command format.
    
    o Section 3.3.7 on page 17.  It appears that there's a violation of 
      SAM-2 transport and application protocol layering here.  The discussion
      allows certain iSCSI (transport) errors to be indicated when the 
      Command Status (SCSI application protocol) is CHECK CONDITION.  It 
      would seem that all such transport errors should be able to be flagged 
      with a non-zero iSCSI Status alone.  The initiators would in that case be 
      expected to look at the iSCSI status first, and proceed further only
      if it is zero.
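
    The layering argued for above can be sketched as follows.  This is an
    illustration, not the draft's behavior: the assumption is that a zero
    iSCSI Status means no transport error, and classify_response is a
    hypothetical name.  The SCSI status codes (0x00 GOOD, 0x02 CHECK
    CONDITION) are the standard ones.

```python
ISCSI_STATUS_GOOD = 0        # assumption: zero means no transport-level error
SCSI_GOOD = 0x00             # SCSI status GOOD
SCSI_CHECK_CONDITION = 0x02  # SCSI status CHECK CONDITION


def classify_response(iscsi_status: int, scsi_status: int) -> str:
    """Look at the transport status first; only then at the SCSI status."""
    if iscsi_status != ISCSI_STATUS_GOOD:
        # Transport error: the SCSI status is not examined at all.
        return "transport-error"
    if scsi_status == SCSI_GOOD:
        return "good"
    if scsi_status == SCSI_CHECK_CONDITION:
        return "check-condition"
    return "other-scsi-status"
```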
    
    o Sections 3.4 and 3.5 on pages 19 and 20.  Suggest adding one sentence
      each to NOP-OUT and NOP-IN sections to explicitly state the direction 
      of flow of the PDU.
    
    o Section 3.7.1 on page 23.  For the Target Reset task management 
      function, the target is not expected to provide a response - and
      this is concerning to me.  
           - how would the initiator confirm the successful completion of
             the target reset?

           - not having a response effectively makes the target reset
             an operation iSCSI hardware cannot assist with.  Having a response
             would make it no different from the rest of the iSCSI transactions
             and the hardware can gracefully deal with it.

           - SAM-2 (section 6.6, page 63) specifies what a target should
             do "Before returning a FUNCTION COMPLETE response".  This seems
             to me an implicit requirement that a response be returned
             on a Target Reset task management function.

           - I am also unclear as to why the sessions are allowed to be
             terminated.  In the FC world, as far as I can recall, the
             process login sessions remain intact after a target reset.
             If it is required that the sessions and the associated TCP
             connections be cleared in iSCSI, it would be helpful to mandate
             (as opposed to leaving it up to the implementation) that an Async
             event shall be reported to all the initiators currently logged in.
    
    o Section 3.8 on page 25.  Suggest additional SCSI Task Management Response 
      indications in addition to the two defined.
           2   Function Invalid (the function is invalid as per current rev)
           3   Function Unsupported (valid, but implementation doesn't support)

      Also suggest adding iSCSI Status and Response data fields to the PDU.
           - iSCSI Status can take three values: success, non-existent
             LUN, and non-existent iSCSI session.
           - Response data can take one value: Invalid message format.
    
    o Section 3.10.2 on page 29.  The Transfer Tag is the Initiator Task 
      Tag in SCSI Data PDU from target to initiator.  Why then are both
      shown in the payload diagram?  If this is done to retain some resemblance
      between the two types of SCSI Data PDUs, I would suggest that ideally
      there be one type of SCSI Data PDU for both READ and WRITE - with 
      certain fields in the PDU to be ignored in each case.  This makes it
      easier on hardware and software implementations.
    
    o Section 3.11 on page 31.  Suggest adding statement "The capabilities
      exchanged and operations performed are valid for the entire login 
      session including all TCP connections for that session."  This makes 
      it clear, for example, whether authentication should be performed 
      on every new connection or only on the leading connection. 
    
    o Since Text key pairs are used as part of Login process and outside,
      I advocate that they be named thus - "Text key pairs".  This terminology
      can consistently be used across.  The current usage has "Text Command
      format" (in section 3.13), and "Login/Text keys" elsewhere (section 10).
    
    o I suggest that Login dialogue should also be able to identify the 
      alternate names the same target (task manager and the device servers,
      albeit possibly with different target "id"s) is available from.  
      This would mean including more Text key pairs in Appendix B.  
      A possible format is - 
    	       OtherNames: <Descriptor type,Descriptor value>
    
    	       where the Descriptor type is as defined in section 3.17.1
    	       and the Descriptor value is based on the given type.
    
    o Section 3.14 on page 39.  The first paragraph allows a target to
      respond to a Login Command with a Login Response and an unsolicited
      Text Response PDU.  I suggest that there be a "Login reply" bit in 
      the Text Response PDU to indicate to the initiator that the PDU is
      not in response to a Text Command, but as a reply to a login proposal.
    
    o Section 3.14.1 on page 39.  States that the "InitStatRN is significant
      only if TSID is 0".  I am somewhat confused by this.  It appears that
      it should be non-zero.  I am assuming that the TSID is non-zero in the
      first Login Response on a connection for a given session (leading connection), 
      and is zero in all subsequent Login Responses on other connections of 
      the same session.  If this were true, only the leading connection's login
      dialogue should specify InitStatRN.  
      [ I see now that Mike from IBM already pointed this out, but am keeping 
        this to confirm the expected target behavior that I described. ]
    
    o Section 3.18 on page 46.  Suggest adding comments to explicitly state
      that target should reject (with a non-zero Response) the Map Command
      when a map (or unmap) of a particular TAN (or SRA) fails out of a set
      of descriptors.  Essentially, either all descriptors succeed, or all
      end up in a failure.  [ I see that Bill Main also had a comment on the
      same issue. ]
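
    The all-or-nothing behavior suggested above can be sketched as below.
    The mapping table, the (name, address) descriptor shape, the is_valid
    callback and the response values 0 and 1 are all assumptions made for
    the example; the draft defines none of them in this form.

```python
def apply_map_command(mapping: dict, descriptors, is_valid) -> int:
    """Apply every descriptor or none; return 0 on success, non-zero on failure."""
    staged = dict(mapping)  # work on a copy so a failure leaves no trace
    for name, address in descriptors:
        if not is_valid(name, address):
            return 1  # reject the whole Map Command; mapping is untouched
        staged[name] = address
    # Every descriptor succeeded: commit the staged table in one step.
    mapping.clear()
    mapping.update(staged)
    return 0
```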
    
    o Section 4.1 on page 49.  Second paragraph from the bottom. "if they
      are not acknowledged yet or a new CmdRN if they where acknowledged;" 
      should be "if they are not acknowledged yet or a new CmdRN if they 
      were acknowledged".
    
    o Section 9.2 on page 59.  The Write operation example depicts multiple
      SCSI Data PDUs being shipped to the target in response to one RTT PDU.
      This effectively puts a requirement on the target implementations to 
      keep the target transfer tag valid until the expected data size is 
      received.  It would be helpful to explicitly state this in section 3.9.
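
    To make the implied requirement concrete, here is a sketch of the
    bookkeeping a target would need.  The class and method names are
    hypothetical; the only point is that the tag stays valid across several
    Data PDUs until the expected byte count has arrived.

```python
class TransferTagTable:
    """Tracks how many bytes are still expected for each target transfer tag."""

    def __init__(self):
        self._remaining = {}  # tag -> bytes still expected

    def open(self, tag: int, expected_length: int) -> None:
        """Called when the target issues an RTT for expected_length bytes."""
        self._remaining[tag] = expected_length

    def is_valid(self, tag: int) -> bool:
        return tag in self._remaining

    def on_data(self, tag: int, length: int) -> bool:
        """Account for one Data PDU; return True once the transfer is complete."""
        if tag not in self._remaining:
            raise KeyError("unknown or already retired transfer tag")
        left = self._remaining[tag] - length
        if left < 0:
            raise ValueError("more data received than the RTT allowed")
        if left == 0:
            del self._remaining[tag]  # only now may the tag be retired
            return True
        self._remaining[tag] = left
        return False
```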
    
    
    New proposals for enhancements
    ------------------------------
    
    o I propose a new opcode to do Third party Logout - to provide a service 
      with (almost) the same name in Fibre Channel (Third Party Process Logout,
      TPRLO).  This is issued by an initiator, and it requests the target to 
      logout with all third party initiators who are iSCSI logged in with the 
      given target.  This should be effective on all the initiators having iSCSI
      access to the same set of task manager and device servers. This feature 
      allows one host (initiator) to ensure that there are no other hosts 
      talking to a target device, in failover configurations.  In order to 
      ensure that this is not maliciously used by rogue hosts, an iSCSI target
      may selectively allow initiators to do TPRLO by requiring a TPRLO-password
      in the TPRLO command PDU, with this password communicated in the login
      dialogue.
    
    o I propose adding a set of two new iSCSI opcodes for "Logout" and "Logout 
      Response". 
                - Logout and Logout Response enable the dual roles of initiator
                  and target to be independently played by SCSI devices.
                - these can also be used as an error recovery mechanism by a 
                  target, forcing the initiator to re-login.
                - these also enable multiple sessions to be operational across 
                  a pair of SCSI devices (if and when we want it).
    
      This Logout is _across_ all channels associated with the iSCSI Session, 
      and a graceful connection termination using TCP FINs for all the individual 
      TCP connections (other than the one Logout is delivered on) is recommended 
      before this Logout command.
    
    o Irrespective of the exact process to handle an error, I propose that
      an upper bound be specified on the time that a SCSI device (initiator
      or target) should wait before freeing up resources allocated for a SCSI
      task (and thus assume the implicit termination of the SCSI task).  This
      timeout shall only be used in the case of a continued failure to re-establish
      the Login session for the said period.  I know that this is a passionate
      topic, but even a ridiculously high upper bound (say 4 hours for task),
      varying on device class (disk, tape ..) is better than no upper bound.
      This is the only architected way for certain hosts (say rebooted on an 
      average, once a year) to recover task resources like Tags.  Note that
      this is not an attempt to impose new timers on an iSCSI implementation,
      this only requires that the resources must be set aside for this much time -
      one could envision an implementation which would reallocate resources
      only as needed with no timers, so the resources can be set aside far
      longer than the iSCSI spec requires.
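
    The timer-free implementation mentioned above could be sketched as
    follows.  The class is hypothetical and the 4-hour figure is just the
    example value from this mail; the caller supplies the current time, so
    nothing here runs on a timer.

```python
class TaskResources:
    """Holds task resources for at least hold_seconds after a session is lost."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self._lost_at = {}  # task tag -> time at which its session was lost

    def session_lost(self, tag: int, now: float) -> None:
        self._lost_at[tag] = now

    def session_restored(self, tag: int) -> None:
        # Re-login succeeded in time: the task is no longer a reclaim candidate.
        self._lost_at.pop(tag, None)

    def reclaimable(self, now: float) -> list:
        """Tags whose hold period has expired; consulted only when resources run short."""
        return sorted(
            tag for tag, lost in self._lost_at.items()
            if now - lost >= self.hold_seconds
        )
```

    Because reclaimable() is only consulted on demand, resources may in
    practice be kept far longer than the bound, which the proposal allows.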
    
    o I propose that a new iSCSI PDU "SCSI Conf" (analogous to FCP_CONF of FC) 
      be defined as a payload sent from an initiator to a target.  This informs 
      the target that the initiator received the SCSI Response message on that
      Initiator Task Tag.  A target should wait to execute the subsequent commands 
      on an error until this SCSI Confirmation PDU is received.  This handshake
      "allows subsequent queued stateful operations to be performed" (taken out 
      of FCP-2 spec).  In the context of tapes/asynch mirroring, this preserves 
      ordering/coherency since the target device stalls for the SCSI Conf from 
      the initiator.  To avoid unnecessary overhead, a target would request the 
      SCSI Conf message in the SCSI Response message only on a SCSI task ending
      in an error.  SCSI Conf should also play by the rules of the connection
      allegiance.
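
    A toy model of the proposed handshake is sketched below.  It is an
    assumption-laden illustration: commands are reduced to a tag plus a
    "fails" flag, and StallingTarget is a hypothetical name.  It shows only
    the stall after an error and the resume on SCSI Conf.

```python
from collections import deque


class StallingTarget:
    """Executes queued commands in order, stalling after an error until SCSI Conf."""

    def __init__(self):
        self.queue = deque()       # (tag, fails) pairs not yet executed
        self.awaiting_conf = None  # Initiator Task Tag awaiting SCSI Conf
        self.executed = []         # tags in the order they were executed

    def submit(self, tag: int, fails: bool = False) -> None:
        self.queue.append((tag, fails))
        self._drain()

    def on_scsi_conf(self, tag: int) -> None:
        # Only a Conf for the task that ended in error releases the stall.
        if tag == self.awaiting_conf:
            self.awaiting_conf = None
            self._drain()

    def _drain(self) -> None:
        while self.queue and self.awaiting_conf is None:
            tag, fails = self.queue.popleft()
            self.executed.append(tag)
            if fails:
                # The response for this task would request a SCSI Conf.
                self.awaiting_conf = tag
```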
    
    o I suggest the following new Login/Text keys in section 10 -
    
             - Following are for capabilities.
                 InitiatorCapability: <yes|no>
                 TargetCapability: <yes|no>
    
             - Following for protocol revision.
                 iSCSIRevision: <X.Y decimal> 
    
               The responder to a Login command may choose to propose an 
               equal or a lower rev than proposed in the Login command payload.  
               If the counter-proposal in the response is not acceptable, 
               the sender of Login command should immediately log out.  If the
               responder can only support higher revs, login is rejected.
    
             - Following, subsequent to the SCSI Conf enhancement request.
                 SCSIConfSupport: <yes | no>
                 --> SCSIConfSupport: <yes | no>
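
    The revision rule described under iSCSIRevision can be sketched as
    below.  The function names and the list of supported revisions are
    hypothetical; the "X.Y" parsing treats X and Y as decimal integers, which
    is one reading of "X.Y decimal".

```python
from typing import Optional, Sequence, Tuple


def parse_rev(text: str) -> Tuple[int, int]:
    """Turn 'X.Y' into (X, Y) so that 1.10 compares higher than 1.9."""
    major, minor = text.split(".")
    return int(major), int(minor)


def counter_proposal(proposed: str, supported: Sequence[str]) -> Optional[str]:
    """Responder's choice: the highest supported rev not above the proposed one.

    Returns None when only higher revs are supported, i.e. login is rejected.
    """
    limit = parse_rev(proposed)
    candidates = [rev for rev in supported if parse_rev(rev) <= limit]
    if not candidates:
        return None
    return max(candidates, key=parse_rev)
```

    The sender of the Login command would then compare the returned value
    with what it can accept and log out if the counter-proposal is too low.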
    



Last updated: Tue Sep 04 01:08:05 2001