
    RE: Target Reset handling



    
    
    James,
    
    The intent of the reset is radical, and it assumes that something is bad
    at the target. The initiator may (or should?) be advised by an AE when
    this happens.
    The TCP sessions are closed to eliminate any zombies. How gracefully is
    up to the implementer, but he should at least make sure that the AE gets
    to the initiators if he decides to signal it (although this might not be
    very useful).
    
    And what you describe in your second point might happen, but then some
    resources are overcommitted, and that is an administrative issue.
    
    I have the feeling that I am not understanding exactly the scenario you
    are concerned with. Are there initiators waiting to be connected before
    the reset that can't get connected? Are the host sockets overcommitted?
    In this case you might want to have the iSCSI drivers hold some
    pre-allocated ranges of sockets.
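
    As a rough illustration of the pre-allocation idea, the sketch below
    keeps a fixed slice of a host-wide socket budget for the iSCSI driver.
    The class name SocketBudget, its parameters, and the notion of a single
    host-wide counter are assumptions made for illustration; nothing here
    comes from the iSCSI draft, and releasing slots is left out.

```python
class SocketBudget:
    """Hypothetical host-wide socket budget with a slice reserved for iSCSI."""

    def __init__(self, total: int, reserved: int):
        if not 0 <= reserved <= total:
            raise ValueError("reserved must be between 0 and total")
        self.shared_free = total - reserved  # slots any consumer may take
        self.reserved_free = reserved        # slots only the iSCSI driver may take

    def acquire(self, for_iscsi: bool) -> bool:
        """Take one slot; return False if the caller's share is exhausted."""
        if for_iscsi and self.reserved_free > 0:
            self.reserved_free -= 1
            return True
        if self.shared_free > 0:
            self.shared_free -= 1
            return True
        return False


budget = SocketBudget(total=4, reserved=2)
# Other traffic can drain the shared part but not the reserved part.
other = [budget.acquire(for_iscsi=False) for _ in range(3)]
# The iSCSI driver can still reconnect after a reset using its reserved slots.
iscsi = [budget.acquire(for_iscsi=True) for _ in range(3)]
```

    With such a reservation, other consumers on the host cannot take the
    slots the driver needs to rebuild its sessions after a reset.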
    
    Regards,
    Julo
    
    "James Smart" <jssmart@nhinternet.com> on 31/08/2000 16:40:19
    
    Please respond to "James Smart" <jssmart@nhinternet.com>
    
    To:   Julian Satran/Haifa/IBM@IBMIL
    cc:   "Stephanie Smart" <jssmart@nhinternet.com>
    Subject:  RE: Target Reset handling
    
    
    
    
    
    Julian,
    
    I don't understand your response....
    
    I guess I'd like to understand the history of why the TCP sessions
    should be closed. I'm going to assume (forgive my lack of TCP
    knowledge) that the TCP sessions will be gracefully shut down, which
    will involve a handshake between the initiator and target.
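
    To make the distinction concrete, here is a minimal sketch of what the
    initiator side observes when the target side closes a TCP connection in
    an orderly way (FIN) versus abortively (RST). The helper names
    loopback_pair and what_initiator_sees are invented for this
    illustration, the sockets are plain loopback sockets rather than iSCSI
    connections, and the SO_LINGER layout shown is the POSIX one. In both
    cases the peer learns of the close promptly; the slow case is a target
    that disappears without sending either segment, which is not shown.

```python
import socket
import struct


def loopback_pair():
    """Return (initiator_side, target_side) connected TCP sockets on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    initiator = socket.create_connection(listener.getsockname(), timeout=5)
    target, _ = listener.accept()
    listener.close()
    return initiator, target


def what_initiator_sees(abortive: bool) -> str:
    """Close from the target side and report what the initiator's recv() observes."""
    initiator, target = loopback_pair()
    try:
        if abortive:
            # SO_LINGER enabled with a zero timeout: close() drops unsent data
            # and sends RST. "ii" matches the POSIX struct linger; Windows
            # uses two 16-bit fields ("hh") instead.
            target.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                              struct.pack("ii", 1, 0))
            target.close()
        else:
            # Orderly release: send FIN but keep the receive side open.
            target.shutdown(socket.SHUT_WR)
        try:
            data = initiator.recv(4096)
        except ConnectionResetError:
            return "reset"
        return "eof" if data == b"" else "data"
    finally:
        initiator.close()
        target.close()
```

    Calling what_initiator_sees(abortive=False) reports an end-of-file on
    the initiator, while abortive=True surfaces as a connection reset
    error; either could serve as the trigger for aborting outstanding i/o.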
    
    Anyway - my reason for asking this is twofold:
    
      a) A significant time delay may occur when tearing down and
         rebuilding the TCP sessions. All the resetting of the sessions
         may be unnecessary. Is the group simply advocating the need
         to reset these sessions as part of resetting the device? And
         must this actually affect all sessions (or only those to this
         initiator)?
    
      b) What happens if the initiator is not able to reinstantiate the
         TCP sessions? I expect that the number of sessions supported
         per target will be limited. If there are enough initiators
         contending for the device - or if the resources backing the
         sessions get moved to other sessions (from other initiators) -
         the initiator may be denied access and all heck may break
         loose in the applications on the initiator.
    
    -- James
    
    PS: I see your reply was not via the ips reflector - if you deem it
    appropriate, you may want to forward this back out to the reflector.
    
    
    ==============
    
     I thought that iSCSI now has an adequate mechanism that includes
     connection close.
    
     Julo
    
    >
    > In reading the iscsi-01 draft, I was bothered by several things in the
    > handling of Target Reset.
    >
    > a) The lack of at least a basic ACCept on the Target Reset. If the target
    > can send an async event, why not at least notify reception of the
    > function?
    >
    > Given connections with lots of outstanding traffic, I'd see this as a
    > more graceful reset procedure. It allows any outstanding i/o that may be
    > completing while the TR is in transit (or queued for processing on the
    > target) to do so, possibly lightening the load of i/o that has to
    > complete in error. This would potentially quicken the recovery time post
    > reset. I would expect this to be more important as the "network" gets
    > larger and longer.
    >
    > Note: FCP does support this behavior.
    >
    > b) Why not require async events to all initiators?
    >
    > The biggest headache with Target Reset is how long it takes for the other
    > initiators to recognize the device has been reset. The 1st new i/o will
    > get a Unit Attention CA, but this status is typically seen only by the
    > SCSI class driver (e.g. disk/tape/etc). Unless instructed by the class
    > driver, the port level driver (e.g. scsi/fc/iscsi hba) will have to
    > timeout the i/o's (if timing was requested) to recover their context. If
    > the class driver does try to tell the port driver, it typically will do
    > so in a crude fashion - issuing abort requests on the i/o's it knows
    > about.
    >
    > Perhaps, if the TCP connections are gracefully shut down between the
    > initiator and target, the initiator will be able to abort the i/o on the
    > connections quickly (in this case, it looks like a pseudo async event).
    > However, if there is no handshaking on the connections, my limited
    > experience with TCP says it takes a long time for the connection to error
    > out and reset. And during this process, we'll be sending i/o abort
    > requests down the terminated-on-one-end connection. All this would make
    > the recovery time on these other initiators very large.
    >
    > Note: this point assumes that if async events are required - they are
    > ack'd.
    >
    > c) Is there something inherent that requires the TCP connections to be
    > terminated?
    >
    > The TCP connections look very similar to (but not the same as) FCP
    > Process logins between the initiator and target. In FCP, the reset did
    > not necessarily disrupt the port or process logins. It only had to affect
    > the FCP/SCSI task manager. (Note: a device was free to really reset, thus
    > indeed tearing down the logins - with the FC port machine handling it as
    > an error.)
    >
    > What is the background that required the TCP sessions to be broken?
    >
    > Obviously, if they are not broken, it affects the answers to (a) and (b)
    > above.
    >
    > d) Given the history of long error recovery times in multi-initiator
    > environments in both parallel scsi and fibre channel on BDR's/Target
    > Reset's, any speed up in this area would be advantageous.
    >
    > -- James
    >
    
    
    
    
    


Last updated: Tue Sep 04 01:07:36 2001