
    RE: questions about FCIP connection failure detection


    • To: "Fraser, Don" <Don.Fraser@compaq.com>
    • Subject: RE: questions about FCIP connection failure detection
    • From: "Chong Peng" <ChongPeng@MaXXan.com>
    • Date: Wed, 1 May 2002 09:52:49 -0700
    • Cc: <ips@ece.cmu.edu>

    Don:
    
    My concern is that having "several other link outage detection techniques" might cause 
    interoperability problems. Are those link outage detection techniques interoperable? 
    If they are not, and the FCIP spec does not specify which one should be used, we will have
    interoperability problems.
    
    Chong Peng
    
    -----Original Message-----
    From: Fraser, Don [mailto:Don.Fraser@COMPAQ.com]
    Sent: Tuesday, April 30, 2002 11:17 AM
    To: Chong Peng; ips@ece.cmu.edu
    Subject: RE: questions about FCIP connection failure detection
    
    
    Hi;
    
    I don't think you missed anything in your understanding of the FCIP keep-alive timer.  And yes it is true that for it to work, both sides must use the same messages.  
    
    Please remember that there are several other link outage detection techniques intended to detect outages much faster than 2 hours.  All of these are listed in the table at the very end of the FCIP draft.  Upon detecting an outage of the TCP connection, the FCIP entity is to report it to the FC entity, which in turn informs the FC fabric of the FCIP link outage.  In addition, at the Fibre Channel fabric level, one should expect the basic Fibre Channel hello protocol to periodically test the status of the same TCP connection, but between FCIP switching elements rather than at the TCP stack level.
    
    Don
    
    -----Original Message-----
    From: Chong Peng [mailto:ChongPeng@MaXXan.com]
    Sent: Friday, April 26, 2002 5:25 PM
    To: Fraser, Don; ips@ece.cmu.edu
    Subject: RE: questions about FCIP connection failure detection
    
    
    Don:
    
    Thanks for the explanation. But I do have another question.
    
    Here is my understanding:
    
    A TCP connection can fail in two different situations.
    
    (1) The TCP connection fails while data is flowing across it.
    (2) The TCP connection fails while no data is flowing across it. For example, 
        one end of the TCP connection crashes or reboots while no data is being 
        exchanged across the connection.
    
    Failure (1) is relatively easy to detect. For example, after TCP 
    retransmits a few times, it will send a RST. So, eventually,
    both ends of the TCP connection will notice the failure.
    
    Failure (2) is relatively hard to handle. When one end is rebooted, there 
    is a possibility that the other end never notices the failure. This is especially 
    true when the end that is rebooted is the TCP client, because usually, when
    the TCP clients do not send service requests to the TCP servers, the TCP 
    servers do not send anything to the TCP clients. That is why the TCP keep-alive 
    timer, although not defined in RFC 793, comes into play in some 
    TCP implementations. I believe the purpose of the TCP keep-alive timer is to 
    guarantee that both ends of the TCP connection eventually detect failure (2), 
    even though it is only after a long time (max two hours).
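
    As a rough illustration of the keep-alive timer described above, here is a
    minimal sketch of enabling it on a socket. The function names are made up
    for this example, and the default values (7200 s idle, 75 s between probes,
    9 probes) are an assumption based on common Linux defaults, not anything
    taken from the FCIP draft. The TCP_KEEP* options are platform-specific, so
    the sketch skips any that the local Python build does not define.

```python
import socket

def enable_keepalive(sock, idle_s=7200, interval_s=75, probes=9):
    """Turn on TCP keep-alive and, where the platform exposes them, tune the timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The three TCP_KEEP* options are platform-specific (present on Linux);
    # skip any that this Python build does not define.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)

def worst_case_detection_s(idle_s, interval_s, probes):
    """Approximate time to notice a dead peer on an idle connection:
    the idle period before probing starts, plus every probe going unanswered."""
    return idle_s + interval_s * probes
```

    With the assumed defaults, an idle connection to a dead peer is noticed
    only after a little more than two hours, which is why a faster mechanism
    is wanted.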
    
    Now look at TCP failures in the context of FC over TCP/IP. The first paragraph 
    of Section 9.4 in the FCIP spec basically says that, in order to detect failure (2) in
    FC over TCP/IP, a means other than the TCP keep-alive timer is needed because two hours 
    is too long. The spec then suggests that "In order to facilitate faster 
    detection of loss of connectivity, FC Entities SHOULD implement some form of 
    Fibre Channel connection failure detection (see FC-BB-2 [4])". Here, my 
    understanding is that the spec suggests some sort of "keep-alive like" scheme can be 
    implemented in the FC entity. The question is: how can we keep interoperability 
    among FCIP devices from different vendors if we let vendors implement their own 
    "keep-alive like" scheme in the FC entity? My understanding is that any 
    "keep-alive like" scheme involves message exchanges between the two ends. In other 
    words, for any "keep-alive like" scheme to work properly, both ends of the 
    connection have to speak the same language.
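
    To make the "keep-alive like" idea concrete, here is a minimal sketch of
    the timing logic one end might run. It is purely hypothetical: the class
    name, the thresholds, and the three states are assumptions for
    illustration and are not defined by the FCIP draft or FC-BB-2. It also
    shows why both ends must agree: the "probe" state only helps if the peer
    recognizes the probe message and answers it.

```python
import time

class KeepAliveMonitor:
    """Hypothetical idle-link monitor; not taken from FCIP or FC-BB-2."""

    def __init__(self, probe_after_s=1.0, dead_after_s=3.0, clock=time.monotonic):
        self.probe_after_s = probe_after_s  # idle time before sending a probe
        self.dead_after_s = dead_after_s    # idle time before declaring the link down
        self._clock = clock
        self._last_rx = clock()

    def on_receive(self):
        """Call for every frame received, whether data or a keep-alive reply."""
        self._last_rx = self._clock()

    def poll(self):
        """Return 'ok', 'probe' (send a keep-alive now), or 'dead'."""
        idle = self._clock() - self._last_rx
        if idle >= self.dead_after_s:
            return "dead"
        if idle >= self.probe_after_s:
            return "probe"
        return "ok"
```

    The clock is injectable so the logic can be exercised without waiting in
    real time; a real implementation would call poll() from a periodic timer.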
    
    Am I misunderstanding this or missing something here?
    
    chong peng
    
    -----Original Message-----
    From: Fraser, Don [mailto:Don.Fraser@compaq.com]
    Sent: Friday, April 26, 2002 7:29 AM
    To: Chong Peng; ips@ece.cmu.edu
    Subject: RE: questions about FCIP connection failure detection
    
    
    Hi:
    
    > In idle mode, a TCP Connection "keep alive" option of TCP is
       normally used to keep a connection alive. However, this timeout is
       fairly large and may prevent early detection of loss of
       connectivity. In order to facilitate faster detection of loss of
       connectivity, FC Entities SHOULD implement some form of Fibre
       Channel connection failure detection (see FC-BB-2 [4]).
    
    This is not required to be implemented in order to interoperate with other FCIP gateway devices, and it is not in error.  A vendor may choose to implement its own keep-alive to be used whenever no traffic is received for the keep-alive time interval.
    
    > When an FCIP Entity discovers that TCP connectivity has been lost,
       the FCIP Entity SHALL notify the FC Entity of the failure including
       information about the reason for the failure.
    
    On the other hand, the FCIP entity is closer to the TCP stack than the FC entity and is therefore able to detect and report the loss of TCP connectivity.  The method of reporting this loss to the FC entity is left up to the implementer.  In a revision of FC-BB-2 made at the last T11 meeting in Vancouver, it was approved to add the following as a new clause in section 16.3:
    
    16.3.x  FCIP Error Reporting
    
    The FC entity will receive notifications from the FCIP entity due to a number of errors detected by the FCIP entity. As a result, the E_Port implementation of the FC entity must report those errors to the local FC switch element via the local VE_port (see Fig 23).  Similarly the B_Port implementation must report the error to the local VB_access port (see figure 26). In addition the FC entity may pass these error reports to the local PMM for inclusion in a local event log.
    
    In both cases, the FC entity shall convert the error message received from the FCIP entity into a Registered Link Incident Report (FC-FS RLIR).  It is the RLIR that is forwarded from the FC entity to either the VE_Port (figure 23) or VB_Access (figure 26).  On receipt of the message from the FC Entity, the VE_Port or VB_Access shall immediately forward the RLIR to the FC Switch Entity.
    
    As a minimum the FC Entity shall accept the following messages from the FCIP entity and shall transfer them as an RLIR to the FC Switching Element by the VE_Port or to the FC Network by the VB_Access:
    	FCIP RFC Section 6.6.2.3: Loss of FC frame synchronization
    	FCIP RFC Section 9.1.2.3: Failure to setup TCP connection
    	FCIP RFC Section 9.1.3: TCP connect request timeout or Duplicate connect request
    	FCIP RFC Section 9.2: Successful completion of FC Entity request to close TCP connection
    	FCIP RFC Section 9.4: Loss of TCP connectivity
    	FCIP RFC Section 10.4.3: Excessive number of dropped datagrams or Any confidentiality violations
    	FCIP RFC Section 10.4.4: SA parameter mis-match
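
    The list above can be sketched as a lookup table that the FC entity uses
    when converting an FCIP notification into a report for the VE_Port or
    VB_Access. The section numbers and descriptions are copied from the list;
    everything else (the LinkIncidentReport type, its fields, and the
    to_report function) is a hypothetical stand-in and does not model the
    actual RLIR payload defined in FC-FS.

```python
from dataclasses import dataclass

# Section numbers and descriptions are copied from the list above.
FCIP_ERRORS = {
    "6.6.2.3": "Loss of FC frame synchronization",
    "9.1.2.3": "Failure to setup TCP connection",
    "9.1.3": "TCP connect request timeout or Duplicate connect request",
    "9.2": "Successful completion of FC Entity request to close TCP connection",
    "9.4": "Loss of TCP connectivity",
    "10.4.3": "Excessive number of dropped datagrams or Any confidentiality violations",
    "10.4.4": "SA parameter mis-match",
}

@dataclass(frozen=True)
class LinkIncidentReport:
    """Hypothetical stand-in for an RLIR; the real FC-FS layout is not modeled."""
    fcip_section: str
    description: str
    port: str  # "VE_Port" or "VB_Access"

def to_report(fcip_section, port):
    """Convert an FCIP notification, keyed by draft section, into a report."""
    if port not in ("VE_Port", "VB_Access"):
        raise ValueError(f"unknown port type: {port}")
    try:
        description = FCIP_ERRORS[fcip_section]
    except KeyError:
        raise ValueError(f"unrecognized FCIP notification: {fcip_section}") from None
    return LinkIncidentReport(fcip_section, description, port)
```

    For example, to_report("9.4", "VE_Port") yields a report describing the
    loss of TCP connectivity, ready to be forwarded to the switching element.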
    
    Don Fraser
    Contributor to FCIP
    
    -----Original Message-----
    From: Chong Peng [mailto:ChongPeng@MaXXan.com] 
    Sent: Thursday, April 25, 2002 2:48 PM
    To: ips@ece.cmu.edu
    Subject: questions about FCIP connection failure detection
    
    
    Hi, all
    
    Section 9.4 (TCP Connection Considerations) of draft-ietf-ips-fcovertcpip-09 
    says:
     
       In idle mode, a TCP Connection "keep alive" option of TCP is
       normally used to keep a connection alive. However, this timeout is
       fairly large and may prevent early detection of loss of
       connectivity. In order to facilitate faster detection of loss of
       connectivity, FC Entities SHOULD implement some form of Fibre
       Channel connection failure detection (see FC-BB-2 [4]).
     
       When an FCIP Entity discovers that TCP connectivity has been lost,
       the FCIP Entity SHALL notify the FC Entity of the failure including
       information about the reason for the failure.
    
    I have a couple of questions regarding this section:
    
    1. The first paragraph states that the FC entity is responsible for discovering the 
       connection failure. But the second paragraph implies that the FCIP entity discovers 
       the connection failure first and then notifies the FC entity. Is there an 
       editorial error?
    2. If we let the application protocol on top of TCP discover the 
       connection failure, what scheme are we going to use? Are we planning to
       define some "FCIP keep alive" frames in the future? I checked FC-BB-2;
       in the section related to discovery (13.2.2.4.2), it says "TBD".
    
    Chong Peng
    
    
    This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized use, review, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by return email and destroy all copies of the original message. 
    Copyright © 2002 MaXXan Systems, Inc. All rights reserved.
    
    
    

