
    Re: TCP (and SCTP) sucks on high speed networks



    
    
    
    
    What you are describing is slow start (which is in fact fast!). That
    only works up to half the window in use before the drop.
    From there congestion avoidance kicks in, and that grows CWND linearly:
    approximately one RTT per unit increase in CWND.
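
    As a rough sketch of what linear growth implies here, the following
    estimates how long congestion avoidance needs to regain half a window
    at one segment per RTT. The 10 Gb/s rate and 100 ms RTT come from
    Matt's example; both MSS values (1460 and 8960 bytes) are assumptions,
    and delayed ACKs are ignored.

```python
def congestion_avoidance_recovery_seconds(link_bps, rtt_s, mss_bytes):
    """Time to grow cwnd from half the bandwidth-delay product back to the
    full product, assuming cwnd grows by one MSS per round trip."""
    bdp_bytes = link_bps * rtt_s / 8            # bytes in flight at full rate
    full_window_segments = bdp_bytes / mss_bytes
    segments_to_regain = full_window_segments / 2
    rtts_needed = segments_to_regain            # one segment regained per RTT
    return rtts_needed * rtt_s


# MSS values are assumptions, not figures from the thread.
for mss in (1460, 8960):
    secs = congestion_avoidance_recovery_seconds(10e9, 0.100, mss)
    print(f"MSS {mss} bytes: about {secs / 60:.1f} minutes")
```

    The result scales inversely with the assumed MSS, so the exact figure
    depends heavily on segment size and ACK policy; with a 1460-byte MSS it
    comes out to a bit over an hour.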
    
    Even with all those caveats, I am still looking for a real limitation,
    since CWND can grow past what the link rate requires (or at least I can't
    recall any rule limiting it, provided that the receiver's advertised
    window is large enough).
    
    However, Matt is right that those heuristics will get us into trouble on
    high-speed links, and cheating on the advertised windows is not a good
    solution. What if both sides cheat?
    
    Julo
    
    Thomas Skibo <skibo@juniper.net> on 01/12/2000 21:01:51
    
    Please respond to Thomas Skibo <skibo@juniper.net>
    
    To:   Matt Wakeley <matt_wakeley@agilent.com>
    cc:   end2end-interest@ISI.EDU, ips@ece.cmu.edu
    Subject:  Re: TCP (and SCTP) sucks on high speed networks
    
    
    
    
    
    
    
    Matt Wakeley wrote:
    >
    > Consider a 10Gbs link to a destination half way around the world.  A packet
    > drop due to link errors (not congestion or infrastructure products) can be
    > expected about every 20 seconds.  However, with a RTT of 100ms (not even
    > across the continent), if a TCP connection is operating at 10Gbs, the packet
    > drop (due to link error) will drop the rate to 5Gbs.  It will take 4 *MINUTES*
    > for TCP to ramp back up to 10Gbps.
    >
    
    
    Four minutes!?  Okay, I'm ready to be shot down but this is
    how I figure it (based upon TCP implementations with which
    I'm familiar):
    
    If a single drop occurs within a round trip, TCP fast retransmit
    will quickly retransmit the missing segment and cut the congestion
    window in half.  So, assuming cwnd goes from exactly the bandwidth
    delay product to half the bandwidth delay product, the congestion
    window is now 62 MB.
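
    The window sizes above follow from the bandwidth-delay product. A
    minimal sketch of that arithmetic, using the 10 Gb/s and 100 ms figures
    from Matt's example (the function name is made up for illustration):

```python
def bdp_bytes(link_bps, rtt_ms):
    """Bandwidth-delay product in bytes: bits per second times seconds, over 8."""
    return link_bps * rtt_ms // 8000    # /1000 for ms -> s, /8 for bits -> bytes


full = bdp_bytes(10_000_000_000, 100)   # cwnd at exactly the bandwidth-delay product
halved = full // 2                      # cwnd after fast retransmit halves it
print(f"full cwnd ~ {full / 1e6:.1f} MB, after halving ~ {halved / 1e6:.1f} MB")
```

    This gives 125 MB for the full window and 62.5 MB after the cut, which
    is the "62 MB" used here.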
    
    To grow cwnd back to 125 MB, it first takes a round-trip time for new
    ACKs to come from the receiver that actually ACK new data (as
    opposed to the duplicate ACKs you'll get for a round-trip time).
    
    Once new ACKs start coming back, you'll increase the congestion
    window by a segment size for each ACK.  Because each ACK acknowledges
    roughly two segments (in the implementations I'm familiar with),
    it'll take about twice as long to grow cwnd by 62 MB as it
    takes to transmit 62 MB.  That's another 100 ms.  200 ms total
    to get back to "full speed".
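
    The 200 ms figure can be reproduced with the sketch below. It encodes
    only the assumptions stated above (cwnd grows by one segment per ACK,
    and each ACK covers two segments); the function name is hypothetical.

```python
def per_ack_growth_estimate_ms(bytes_to_regain, link_bps, rtt_ms, segments_per_ack=2):
    """Recovery time if cwnd grows by one segment for every ACK received."""
    # Time to put bytes_to_regain on the wire at the full link rate.
    transmit_ms = bytes_to_regain * 8 * 1000 / link_bps
    # Each ACK covers segments_per_ack segments but adds only one to cwnd,
    # so growing by N bytes takes segments_per_ack times as long as sending N.
    growth_ms = segments_per_ack * transmit_ms
    # One extra round trip passes before ACKs for new data arrive.
    return rtt_ms + growth_ms


print(per_ack_growth_estimate_ms(62_500_000, 10_000_000_000, 100))
```

    Under these assumptions the result is one round trip plus twice the
    50 ms needed to send 62.5 MB at 10 Gb/s. It holds only while cwnd
    really does grow by a full segment per ACK.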
    
    Another thing, I think the congestion window is likely to grow
    beyond the bandwidth delay product if you're only getting a
    single drop every 20 seconds (and assuming you've set your send
    buffers to 250 MB).  So, you may never even notice that it got
    cut in half every 20 seconds.
    
    --Skibo  (skibo@juniper.net)
    
    
    
    

