
    RE: iSCSI: Keep alive



    John,
    
    I do not agree with your assessment of the need to establish a connection
    failure timeout.  To achieve a detection interval within the tens of
    seconds, a mechanism must be in place.  Terms such as "contemplating
    dropping," "suspicious," and "beyond some timeout value" do not provide a
    deterministic means of assessing the status of the connection.  I agree
    there is no point in pinging every 10 seconds under the following conditions:
    	1) There is already communication traffic confirming connection.
    	2) The connection is idle and no status is pending.
    
    Under these conditions, a SCSI ping every ten seconds becomes a rare event.
    It does, however, allow prompt detection of a failure within a time interval
    suitable for preventing overlapping retry mechanisms.  As a SCSI ping will
    be a rare event used only during periods of very low utilization, requiring
    the ping to be serviced by the task process improves the reliability of
    failure detection.
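
The two conditions above reduce to a small decision rule. The following is only a sketch of that rule in Python; the names `ConnState`, `should_ping`, and `PING_INTERVAL` are illustrative assumptions and do not come from any iSCSI draft.

```python
from dataclasses import dataclass

PING_INTERVAL = 10.0  # seconds; assumed detection interval, not a draft value


@dataclass
class ConnState:
    """Hypothetical per-connection bookkeeping."""
    last_rx_time: float   # time (seconds) traffic was last received
    pending_status: int   # number of commands still awaiting status


def should_ping(conn: ConnState, now: float) -> bool:
    """Return True only when neither skip condition holds."""
    # Condition 1: recent traffic already confirms the connection.
    if now - conn.last_rx_time < PING_INTERVAL:
        return False
    # Condition 2: the connection is idle and no status is pending.
    if conn.pending_status == 0:
        return False
    # Quiet for a full interval while status is still owed: ping.
    return True
```

Called once per interval, a check like this sends a ping only when the connection has been silent for the whole interval while a response is still expected, which is why the ping stays rare.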
    
    Doug
    
    > Team,
    > Let me see if I can now boil down the thoughts that have occurred on the
    > keep-alive thread and some of my input:
    >
    > It has been stated that there may be value in confirming that a link has
    > gone down and re-establishing it, without the SCSI application being aware.
    > Assuming that this is valid:
    >
    > 1. If for whatever reason an iSCSI session thinks something has gone wrong
    > and is contemplating dropping the connection, it should ping before
    > dropping a session that may be down, as a possible confirmation that the
    > link is down.  (This is not a 100% guarantee that a live connection will
    > always be detected, but if the ping is responded to, that guarantees that
    > the connection is still up.)
    > 2. There is no point in pinging every 10 seconds or so.  If nothing is
    > outstanding or missing, then why ping?  Perhaps the implementation can have
    > timeout values, etc., that it can use to determine if a suspicious
    > connection is still up.  This seems to be an implementation issue.  I think
    > that an implementer note in the draft would be sufficient.  Something like
    > "The implementation may consider the discovery of dropped connections by
    > use of a Ping, at the point the implementation is suspicious of failure
    > (should not be done regularly).  A suspicion may be raised by an
    > outstanding expected response, beyond some Time-out value."
    > 3. We do not need to find out if no one is home at the SCSI task layer,
    > that is the job of the SCSI Task layer.  We just need to ensure that the
    > iSCSI transport is OK.
    > 4. It has been pointed out that the ping can be returned by the HW,
    > without respect to things going on higher in the various layers, but that
    > with SW implementations, it may be blocked behind other unprocessed stuff.
    > This also depends on the implementation of the SW and the buffer
    > handling.
    > So I would suggest that the very most a ping can help with, in the SW
    > implementation, is avoiding a premature line drop -- by the pinging side --
    > that would otherwise have occurred without the ping.
    > 5. The ping can be useful in sorting out the difference between a long SCSI
    > operation and a connection hang.  That is, if it is a long SCSI response
    > time, the ping will return, and the connection dropping can be avoided.
    > 6. The only time the adequacy of the above approach is an issue is when
    > stuff has backed up in the iSCSI buffers and SW TCP/IP buffers --
    > undelivered to SCSI -- for such a long time that it is noticed on the
    > other end.  We have other flow control things to control this problem.
    > Therefore, a ping at (and only at) the time of suspicion, to avoid an
    > inappropriate connection drop, is a valid approach.
    >
    >
    > .
    > .
    > .
    > John L. Hufferd
    > Senior Technical Staff Member (STSM)
    > IBM/SSG San Jose Ca
    > (408) 256-0403, Tie: 276-0403
    > Internet address: hufferd@us.ibm.com
    >
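
The ping-on-suspicion approach in the quoted message (points 1, 2, and 5) can be sketched as a three-way decision. This is a rough illustration only; `Verdict`, `assess`, and the parameter names are hypothetical. Treating an unanswered ping as grounds to drop is an assumption of the sketch, since the quoted text notes that a missing reply is not a 100% guarantee that the link is down.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    WAIT = "wait"   # no suspicion yet
    PING = "ping"   # suspicious: send a ping before deciding
    KEEP = "keep"   # ping answered: a long SCSI operation, not a hang
    DROP = "drop"   # ping unanswered: treat the connection as down


def assess(waited: float, response_timeout: float,
           ping_reply: Optional[bool]) -> Verdict:
    """ping_reply: None = no ping sent yet, True = answered, False = timed out."""
    # Suspicion arises only once an expected response is overdue.
    if waited <= response_timeout:
        return Verdict.WAIT
    if ping_reply is None:
        return Verdict.PING
    # An answered ping proves the connection is up; an unanswered one
    # is taken (by assumption) as sufficient reason to drop.
    return Verdict.KEEP if ping_reply else Verdict.DROP
```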
    
    


Last updated: Tue Sep 04 01:06:31 2001