
    RE: Keep-alive traffic (was iSCSI: more on StatRN)



    Cheng,
    
    It is not as simple as that.  StatRN does not provide a deterministic
    timeout for a connection failure indication.  As such, it should not be
    relied upon to determine a failed connection.  Even with keepalive, the TCP
    timeout will still be too long for useful connection recovery should status
    be pending.  It would be desirable to prevent a reset and overlapping
    recoveries.  As such, for a connection failure detection to be useful, it
    must be relatively quick.  The SCSI layer should know little about the
    underlying transport, so the transport must be proactive in responding to
    transport failure.  Repetitive probes every 10 seconds could satisfy
    detection requirements while status is pending.  This would also give the
    target early notice of a client failure, since communication on a
    connection that is not idle would then be deterministic.
    
    Doug
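
    A minimal sketch of the probe-based detection described above, to show
    why a fixed probe interval gives a bounded detection time.  The class
    name ProbeMonitor, the max_missed threshold of 2, and the caller-supplied
    clock are illustrative assumptions; only the 10-second interval comes from
    the message, and nothing here is taken from the iSCSI draft.

```python
from dataclasses import dataclass


@dataclass
class ProbeMonitor:
    """Declares a connection failed after a bounded silence (hypothetical sketch)."""

    interval: float = 10.0   # seconds between probes (figure from the message)
    max_missed: int = 2      # assumed threshold, not from the message
    last_reply: float = 0.0  # time the last probe reply or other traffic was seen

    def on_reply(self, now: float) -> None:
        # Any traffic from the peer counts as proof of life.
        self.last_reply = now

    def is_failed(self, now: float) -> bool:
        # Failed once the peer has been silent for max_missed probe intervals.
        return now - self.last_reply >= self.interval * self.max_missed

    def detection_bound(self) -> float:
        # Upper bound on the silence tolerated before failure is declared.
        return self.interval * self.max_missed
```

    With these assumed values the transport reports a failure after at most
    20 seconds of silence, independent of any TCP retransmission timeout.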
    
    
    
    > StatRN and keep-alive are intended for detecting and recovering a lost
    > connection or iSCSI command.  My opinion is they are mandatory only if
    > the Internet is a very unreliable connection.  In traditional SCSI
    > adapters, a target device never initiates a recovery but must detect
    > duplicated commands.
    > An initiator device always tries its best to detect an error early and
    > reissue the command without resorting to a big hammer.  Application
    > software and device driver timeouts are imperative because a SCSI device
    > can die without warning.  StatRN and keep-alive are needed when the
    > frequency of losing a connection or command is so high that recovery by
    > timeout is considered undesirable and inefficient.  If the error frequency
    > is low, then keep the design simple and stupid by letting the device driver
    > timeout do its job.  When a target device is shared by many initiators,
    > killing it with a big hammer involves everyone sharing the target.  Those
    > who debated forever on hard and soft SCSI resets understand the need for
    > and consequences of a big hammer.  For an iSCSI adapter, if the frequency
    > of connection and command loss is high, StatRN and keep-alive are useful
    > in helping detect the loss early.
    >
    > To see why a target never initiates a recovery but must detect duplicated
    > commands, and why an initiator must detect an error early without resorting
    > to a big hammer, make a table of all possible errors and their recovery
    > actions; the conclusion is then obvious.
    >
    > Y.P. Cheng, CTO, ConnectCom Solutions Corp.
    >
    >
    
    



Last updated: Tue Sep 04 01:06:34 2001