
    Re: iSCSI - Change Proposal X bit



    Santosh,
    
    There is nothing in a command that arrives late on a link (as in the 
    example in which it was sent redundantly)  to distinguish it from a new 
    (valid) command.
    
    This wraparound problem exists in all protocols - even in TCP, and we use 
    the CmdSN per session in the same fashion TCP uses sequence numbers per 
    connection - and it is solved in different ways (TCP uses time-stamps).
    
    The NOP is meant to solve that wrap-around problem.
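
The wrap-around issue can be illustrated with a minimal Python sketch. It assumes CmdSN is a 32-bit counter compared with serial-number arithmetic modulo 2**32; the helper names `sn_lte` and `in_window` and the window values are illustrative assumptions, not taken from the draft.

```python
MOD = 2 ** 32    # assumption: CmdSN is a 32-bit counter
HALF = 2 ** 31


def sn_lte(a, b):
    """True if a <= b in serial-number arithmetic modulo 2**32."""
    return ((b - a) % MOD) < HALF


def in_window(cmd_sn, exp_cmd_sn, max_cmd_sn):
    """True if cmd_sn lies inside the window [ExpCmdSN, MaxCmdSN]."""
    return sn_lte(exp_cmd_sn, cmd_sn) and sn_lte(cmd_sn, max_cmd_sn)


# A redundant copy of command 4 is stuck on a slow connection.
stale = 4

# Window at the time the copy was sent: the copy would be accepted.
accepted_at_send = in_window(stale, 3, 10)

# The original was received and the window moved on: the copy is discarded.
rejected_after_advance = not in_window(stale, 9, 16)

# After full wraps of the counter the window covers the same numbers again,
# and nothing in the late copy distinguishes it from a new command.
exp_cmd_sn = (3 + 2 * MOD) % MOD
max_cmd_sn = (10 + 2 * MOD) % MOD
accepted_after_wrap = in_window(stale, exp_cmd_sn, max_cmd_sn)
```

In this sketch the late copy is rejected only while the window sits elsewhere in the number space; once the counter has wrapped, the same CmdSN passes the window check again.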
    
    I am sure that when rereading the example you will see the issue.
    
    Julo
    
    
    
    
    Santosh Rao <santoshr@cup.hp.com>
    Sent by: santoshr@cup.hp.com
    24-10-01 18:29
    Please respond to Santosh Rao
    
     
            To:     Julian Satran/Haifa/IBM@IBMIL
            cc:     ips@ece.cmu.edu
            Subject:        Re: iSCSI - Change Proposal X bit
    
     
    
    Julian,
    
    Some comments on the below quoted scenarios :
    
    >    session has 3 connections
    >    on connection 1 I->T c1,c2,c3,C6
    >    on connection 2 I->T c4,c5,c7,c8
    >    Target receives 1,2,4,5,7,8 (miss 3 and 6) and acks 1 & 2
    >    Initiator closes 1 and resends c3, c4, c5,c7,c8  on connection 2 and 6
    >    on connection 3
    >    target receives all and starts executing and acks 8 on connection 3 but
    >    connection 2 stalls after c3 for a  LONG TIME
    >    then (after 2 full sequence wraps) connection 2 comes alive and
    >    delivers c4,c5 etc (that are now valid)
    
    When the target acks CmdSN 8 on connection 3, it has, in effect, sent
    CmdSN acks for CmdSNs 3, 4, 5, 6, 7 and 8. This implies that the commands
    with CmdSN 3, 4, 5, 7 and 8 were received by the target on connection 2
    and that their processing had commenced.
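
The cumulative nature of the CmdSN ack can be sketched as follows. This is a simplified Python illustration that ignores wrap-around and treats ExpCmdSN as the lowest CmdSN not yet received; the function name `exp_cmd_sn_after` is a hypothetical helper, not something defined in the draft.

```python
def exp_cmd_sn_after(received, start=1):
    """Return ExpCmdSN: the lowest CmdSN, counting from start, not yet received."""
    seen = set(received)
    exp = start
    while exp in seen:
        exp += 1
    return exp


# Target has 1, 2, 4, 5, 7, 8 (3 and 6 are missing): only 1 and 2 are acked.
first_pass = exp_cmd_sn_after([1, 2, 4, 5, 7, 8])

# Once 3 and 6 arrive, a single ack covers everything up to and including 8.
after_resend = exp_cmd_sn_after([1, 2, 4, 5, 7, 8, 3, 6])
```

Under these assumptions `first_pass` is 3 and `after_resend` is 9, which is the sense in which acking CmdSN 8 also acknowledges 3 through 7.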
    
    Hence, the following does not make sense :
    
    >    connection 2 stalls after c3 for a  LONG TIME
    >    then (after 2 full sequence wraps) connection 2 comes alive and
    >    delivers c4,c5 etc (that are now valid)
    
    c4, c5, etc were already delivered to the target and are not being
    re-delivered. There is no problem in this case. (??).
    
    Take the next scenario :
    > 2 connections:
    > 
    > connection1  I->T c3,c4,c5
    >       status of 3 contains ack up to 6 and it and all other statuses are
    > lost
    > connection2  resend c3, c4 & c5 (no logout) and those are executed!
    
    Since the initiator got CmdSN acks up to 6, the initiator should not be
    re-issuing these I/Os (??).
    
    I still don't see justification to require that initiators send an
    immediate NOP-Out in the manner being advocated.
    
    On a more fundamental note, I see some issues with the initiator being
    allowed to re-issue the commands on a different connection without
    having first logged out the previous connection successfully. I see
    nothing in the draft that suggests such behaviour, while at the same
    time, it is not forbidden.
    
    By resorting to command retries on a different connection in an attempt
    to plug the hole, without first logging out the previous connection, the
    initiator risks failure of that I/O due to a ULP timeout.
    
    Here's a scenario showing why such recovery should not be allowed:
    - Initiator sends CmdSN 3 on connection 1.
    - No CmdSN updates for a while and initiator re-sends CmdSN 3 on
    connection 2.
    - At the same time, target has sent CmdSN acks for CmdSN 3 on
    connection 1.
    
    - Initiator has transferred the command allegiance on its side from
    connection 1 to connection 2 and is attempting the command on connection
    2. However, the command does not go through, since the (ExpCmdSN,
    MaxCmdSN) window has advanced and the target discards the command.
    
    - Target sends in data and/or R2T and/or status for CmdSN 3 on
    connection 1. Since the initiator is not expecting any traffic for that
    I/O on connection 1, it discards any PDUs received on connection 1
    for which no I/O state exists.
    
    In the above scenario, the initiator will never get a CmdSN ack on
    connection 2 and will never be able to plug the hole despite repeated
    retries, finally causing a ULP timeout, followed by session recovery.
    
    Given the above scenario, I suggest that the initiator must only
    re-issue commands on the same connection, and can re-issue them on
    another connection only following a successful logout.
    
    Comments ?
    
    Thanks,
    Santosh
    
    
    Julian Satran wrote:
    > 
    > Santosh,
    > 
    > The scenarios I am talking about are all derivatives of an initiator trying
    > to plug-in holes and switching connections.
    > As the initiator does not know the "extent" of a hole it can send-out
    > commands that he did not have to.
    > I have sent the attached note to Mallikarjun  a while ago.  I think that
    > there might be many of this kind.  I am also aware that X bit by itself
    > might have some bad scenarios but the new proposal fixes them all.
    > 
    > Julo
    > 
    > _____________________________
    > 
    > Mallikarjun,
    > 
    > Take the following sequence scenario:
    > 
    >    session has 3 connections
    >    on connection 1 I->T c1,c2,c3,C6
    >    on connection 2 I->T c4,c5,c7,c8
    >    Target receives 1,2,4,5,7,8 (miss 3 and 6) and acks 1 & 2
    >    Initiator closes 1 and resends c3, c4, c5,c7,c8  on connection 2 and 6
    >    on connection 3
    >    target receives all and starts executing and acks 8 on connection 3 but
    >    connection 2 stalls after c3 for a  LONG TIME
    >    then (after 2 full sequence wraps) connection 2 comes alive and
    >    delivers c4,c5 etc (that are now valid)
    > 
    > That is not a very likely scenario, I admit, but it is possible.
    > With X bit I could not find any such scenario since an X either follows a
    > good one on the same connection or can be safely discarded.
    > I suspect that there are some more scenarios that involve immediate
    > commands or commands that carry their own ack in the status and are acked
    > like:
    > 
    > 2 connections:
    > 
    > connection1  I->T c3,c4,c5
    >       status of 3 contains ack up to 6 and it and all other statuses are
    > lost
    > connection2  resend c3, c4 & c5 (no logout) and those are executed!
    > 
    > I think we can avoid those by requiring a NOP exchange before reissuing a
    > command on a new connection or reissue the command with a task management
    > (that has an implied ordering) but why do it if X is an obvious and safe
    > solution.
    > 
    > Julo
    > 
    > Regards,
    > Julo
    > 
    > 
    > "Mallikarjun C." <cbm@rose.hp.com>
    > 08-10-01 21:45
    > Please respond to cbm
    >
    >         To:     Julian Satran/Haifa/IBM@IBMIL
    >         cc:
    >         Subject:        Re: iscsi : X bit in SCSI Command PDU.
    > 
    > 
    > 
    > Julian,
    > 
    > We currently have the following specified in section 2.2.2.1 -
    > 
    > "The target MUST NOT transmit a MaxCmdSN that is more than
    > 2**31 - 1 above the last ExpCmdSN."
    > 
    > It appears to me that the above is sufficient to ward off the
    > accidents of the sort you describe.  Do you think otherwise?
    > --
    > Mallikarjun
    > 
    > Mallikarjun Chadalapaka
    > Networked Storage Architecture
    > Network Storage Solutions Organization
    > MS 5668 Hewlett-Packard, Roseville.
    > cbm@rose.hp.com
    > 
    > Julian Satran wrote:
    > >
    > > Mallikarjun,
    > >
    > > There is at least one theoretical scenario in which an "old" command
    > > may appear in a "new window" and be reinstantiated.
    > > At 10 Gb/s and several connections that does not take months. With X the
    > > probability is far lower (not 0).   I have no other strong arguments
    > > but I am still  thinking.  Matt Wakeley, who insisted on it (against
    > > me), had some other argument that I am trying to find (I do not
    > > remember it).
    > >
    > > Julo
    > >
    > > "Mallikarjun C." <cbm@rose.hp.com>
    > > 08-10-01 20:39
    > > Please respond to cbm
    > >
    > >         To:        Julian Satran/Haifa/IBM@IBMIL
    > >         cc:
    > >         Subject:        Re: iscsi : X bit in SCSI Command PDU.
    > >
    > >
    > >
    > > Julian,
    > >
    > > Now that you put me on the spot, :-), my response -
    > >
    > > Santosh argued with me privately that X-bit no longer serves a
    > > useful purpose after the advent of task management commands to
    > > reassign.  My response was that it never was a requirement per se,
    > > but always a "courtesy" extended by the initiator to help the
    > > target.  I also suggested that X-bit may be considered for its
    > > usefulness in debugging.
    > >
    > > He still had some (very reasonable) comments for simplification
    > > - the most appealing of which (to me) was the opportunity to do
    > > away with the X-bit checking for *every* command PDU that the target
    > > has to endure now.
    > >
    > > If I missed a legitimate use of X-bit, please comment. Do you
    > > think it is a protocol requirement per se?  I couldn't justify
    > > to myself so far (except the Login).
    > >
    > > Regards.
    > > --
    > > Mallikarjun
    > >
    > > Mallikarjun Chadalapaka
    > > Networked Storage Architecture
    > > Network Storage Solutions Organization
    > > MS 5668 Hewlett-Packard, Roseville.
    > > cbm@rose.hp.com
    > >
    > >
    > >
    > > Julian Satran wrote:
    > > >
    > > > Santosh,
    > > >
    > > > I am not sure you went through all scenarios. A conversation with your
    > > > colleague - Mallikarjun - and getting through the state table may go a
    > > > long way to clarify the need for X.
    > > >
    > > > And I am sure that by now you have found several yourself.
    > > >
    > > > Julo
    > > >
    > > > Santosh Rao <santoshr@cup.hp.com>
    > > > Sent by: owner-ips@ece.cmu.edu
    > > > 06-10-01 01:56
    > > > Please respond to Santosh Rao
    > > >
    > > >         To:        IPS Reflector <ips@ece.cmu.edu>
    > > >         cc:
    > > >         Subject:        iscsi : X bit in SCSI Command PDU.
    > > >
    > > >
    > > >
    > > > All,
    > > >
    > > > With the elimination of command relay from iscsi [in the interests of
    > > > simplification ?], I believe that the X bit in the SCSI Command PDU can
    > > > also be removed. As it exists today, the X bit is only being used for
    > > > command restart, which is an attempt by the initiator to plug a
    > > > potential hole in the CmdSN sequence at the target. It does this on
    > > > failing to get an ExpCmdSN ack for a previously sent command within some
    > > > timeout period.
    > > >
    > > > Given the above usage of command restart, no X bit is required to be set
    > > > in the SCSI Command PDU when command re-start is done.
    > > >
    > > > Either :
    > > > (a) the target had dropped the command earlier due to a digest error, in
    > > > which case, the command restart plugs the CmdSN hole in the target.
    > > >
    > > > [OR]
    > > >
    > > > (b) the target had received the command and was working on it, when the
    > > > initiator timed out too soon and attempted a command restart to plug
    > > > [what it thought was] a possible hole in the CmdSN sequence.
    > > >
    > > > In case (a), no X bit was required, since the target knows nothing of
    > > > the original command. In case (b), no X bit is required again, since the
    > > > (ExpCmdSN, MaxCmdSN) window would have advanced and the target can
    > > > silently discard the received retry and continue working on the original
    > > > command received.
    > > >
    > > > Removal of the X bit in the SCSI Command PDU has the following benefits :
    > > >
    > > > a) The CmdSN rules at the target are simplified. No need to look at X
    > > > bit, only validate received CmdSN with (ExpCmdSN, MaxCmdSN) window.
    > > >
    > > > b) The reject reason code "command already in progress" can be removed.
    > > > There's no need for this reject reason code anymore, since X bit itself
    > > > is not required, and the targets can silently discard commands outside
    > > > the command window and continue to work on the original instance of the
    > > > command already being processed at the target.
    > > >
    > > > c) Less work for the target and less resources consumed since it no
    > > > longer needs to generate a Reject PDU of type "command in progress". It
    > > > can just silently discard any command PDU outside the (ExpCmdSN,
    > > > MaxCmdSN) window.
    > > >
    > > > d) Less code for the target, since it does not need :
    > > > - any Reject code paths when it receives X bit command PDUs that are
    > > > already in progress.
    > > > - No special casing of CmdSN checking rules.
    > > > - No overheads of verifying a received command based on its initiator
    > > > task tag, to check if the task is currently active, prior to sending a
    > > > Reject response with "command in progress".
    > > >
    > > > Comments ?
    > > >
    > > > Thanks,
    > > > Santosh
    > > >
    > > > --
    > > > ##################################
    > > > Santosh Rao
    > > > Software Design Engineer,
    > > > HP-UX iSCSI Driver Team,
    > > > Hewlett Packard, Cupertino.
    > > > email : santoshr@cup.hp.com
    > > > Phone : 408-447-3751
    > > > ##################################
    > 
    > 
    > Santosh Rao <santoshr@cup.hp.com>
    > Sent by: owner-ips@ece.cmu.edu
    > 23-10-01 22:50
    > Please respond to Santosh Rao
    >
    >         To:     IPS Reflector <ips@ece.cmu.edu>
    >         cc:
    >         Subject:     Re: iSCSI - Change Proposal X bit
    > 
    > 
    > 
    > Julian Satran wrote:
    > >
    > > However in order to drop "old" commands that might be in the pipe on a
    > > sluggish connection - removing the X bit will require the initiator to
    > > issue an immediate NOP requiring a NOP response on every open connection
    > > whenever CmdSN wraps around (becomes equal to InitCmdSN).
    > 
    > Julian,
    > 
    > Can you please explain further the corner case you are describing above
    > ? Are you suggesting that special action should be taken every time
    > CmdSN wraps around, in case there were holes in the CmdSN sequence at
    > the wrap time ? Why is that ?
    > 
    > Here's my understanding of how this plays out :
    > 
    > Rule 1)
    > The CmdSN management rules at the target should be handling the CmdSN wrap
    > case and the initiator cannot issue more than 2**31 - 1 commands beyond
    > the last ExpCmdSN update it has received from the target, since the
    > target MUST NOT transmit a MaxCmdSN that is more than 2**31 - 1 above
    > the last ExpCmdSN. (per Section 2.2.2.1)
    > 
    > Rule 2)
    > Any holes that occur in the CmdSN sequence are attempted to be plugged
    > by the initiator by re-issuing the original command. If the CmdSN never
    > got acknowledged and the I/O's ULP timeout expired, the initiator MUST
    > perform session recovery. (per Section 8.6)
    > 
    > Thus, going by the above 2 rules, if the CmdSN sequence wraps up to
    > ExpCmdSN, the initiator will not be able to issue further commands,
    > since the target will keep the CmdSN window closed. The window can only
    > re-open when the CmdSN holes are plugged allowing ExpCmdSN and thereby,
    > MaxCmdSN to advance.  (rule 1 above).
    > 
    > Under the above circumstances, the initiator will possibly try to plug
    > the CmdSN hole by re-issuing the original command. It may do this 1 or
    > more times before its ULP timeout expires. Either the holes get plugged
    > and the window re-opens, or ULP timeout occurs without the corresponding
    > CmdSN for that I/O having been acknowledged, resulting in session
    > logout. (rule 2 above).
    > 
    > What is required over and beyond the above ? Why does removal of X-bit
    > require an immediate NOP to be issued every time CmdSN wraps and a hole
    > exists in the CmdSN sequence (??).
    > 
    > Regards,
    > Santosh
    > 
    > --
    > ##################################
    > Santosh Rao
    > Software Design Engineer,
    > HP-UX iSCSI Driver Team,
    > Hewlett Packard, Cupertino.
    > email : santoshr@cup.hp.com
    > Phone : 408-447-3751
    > ##################################
    
    -- 
    ##################################
    Santosh Rao
    Software Design Engineer,
    HP-UX iSCSI Driver Team,
    Hewlett Packard, Cupertino.
    email : santoshr@cup.hp.com
    Phone : 408-447-3751
    ##################################
    
    
    
    



Last updated: Thu Oct 25 03:17:45 2001
7380 messages in chronological order