
    Re: Multiple TCP connections



    [This mailing list is acting up so forgive me if this is a repeat]
    
    Randy provides a good summary of the "design team's" rationale for
    why they chose multiple connections per session.
    
    My contention is that the argument is inverted.  It runs: you need
    multiple paths through the fabric to get performance, and current
    link layer technology (802.3ad) only provides concurrency based on
    TCP layer headers, thus multiple connections per session are needed.
    Further, given multiple connections per session, you can't have a
    connection per LUN, as simple arithmetic shows there will not be
    enough connections to go around.
    
    If you work the argument backwards, I see a different result.
    Assume instead that we have one connection per LUN and look at the
    number of concurrent connections possible.  TCP, with its 16 bit
    port number, limits us to 64K ports, and therefore to 64K active
    connections per IP interface.  Given the high bandwidth of existing
    drives, especially with the amount of cache appearing in
    controllers, having just 10% of those possible connections active
    will saturate any link layer technology for the next few decades.
    Therefore, to reach that many LUNs and thus connections you will
    need multiple IP interfaces anyway.  Even with the existing draft
    proposal, no sane implementor would throttle 10K+ LUNs behind a
    single IP interface.
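    For concreteness, a quick back-of-the-envelope sketch of that claim
    (the 20 MB/s per-connection figure is an assumption for
    illustration, not a number from the draft):
    
        # Port-count and bandwidth arithmetic (illustrative only)
        total_ports = 2 ** 16            # 64K TCP ports per IP interface
        active = total_ports // 10       # assume only 10% are active
        per_conn_MBps = 20               # assumed per-LUN throughput, MB/s
        aggregate_Gbps = active * per_conn_MBps * 8 / 1000
        print(total_ports, active, aggregate_Gbps)
        # 65536 connections possible, 6553 active, ~1050 Gb/s aggregate:
        # orders of magnitude beyond a single link, so multiple IP
        # interfaces are needed long before the port space runs out.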
    
    I would propose that the requirement be 64K active sessions.  Given
    that requirement, having a session per LUN makes sense.
    
    The next issue is performance.  I agree that to get maximal use of a
    fabric you need to exploit concurrency; the question is where the
    correct place to put that concurrency is.  If we follow the argument
    that we should have a session per LUN, then the standard semantics
    of LUNs (in-order request/response with minimal concurrency) lead to
    a per-connection performance requirement that is on par with today's
    link level technology and protocol implementations, and that will
    likely grow at the same rates.  The throughput that is really a
    concern is the aggregate bandwidth from an initiator to multiple
    LUNs.  With a session per LUN, each TCP connection can be placed on
    a different link layer channel (802.3ad) using TCP layer header
    information, so performance will scale with link layer improvements
    using the existing link layer aggregation mechanism.  Ultimately the
    initiator-to-LUN bandwidth will be a host memory to storage
    controller cache memory copy; interconnect technology currently
    being designed, such as Infiniband, is aimed at exactly that kind of
    copy, so as storage devices move towards a memory to memory model
    the interconnects will exist.
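    As a sketch of how that distribution works (the hash below is purely
    illustrative; a real 802.3ad implementation chooses its own hash
    over the L2/L3/L4 headers, and the addresses and ports here are made
    up):
    
        # Per-LUN TCP connections hash to different links in an aggregate
        def pick_link(src_ip, dst_ip, src_port, dst_port, num_links):
            return hash((src_ip, dst_ip, src_port, dst_port)) % num_links
    
        links = {}
        for lun in range(8):
            src_port = 50000 + lun       # one TCP connection per LUN
            link = pick_link("10.0.0.1", "10.0.0.2", src_port, 3260, 4)
            links.setdefault(link, []).append(lun)
        print(links)                     # LUN traffic spread over 4 links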
    
    The issue is no longer how to exploit multi-link concurrency for a
    single TCP stream, but what the contents of a single TCP stream are,
    given that we know existing link technology will allow multiple TCP
    streams to be carried concurrently.  I assert that a LUN is the
    natural unit of concurrency, and that the performance demand of a
    single LUN can be met by existing TCP implementations and should
    scale over time.
    
    The numerical argument given claims that a single storage controller
    may have 160K concurrent connections.  That is likely to be an
    extreme case with a poorly balanced set of hardware, but I will
    grant it for the sake of argument.  The argument is that maintaining
    that much TCP state will be too expensive.  The proposed cost is
    ~10MB of memory, which today adds about $1 to the cost of a box
    containing 10,000 disk drives (~$1M assuming $100 drives).  Not a
    compelling argument.  Furthermore, if you multiplex multiple LUNs
    per connection you still need sufficient state to mux-demux
    requests, which will be on the same order of magnitude as the TCP
    state.  So ultimately the "cost" argument is a wash.
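    Working that arithmetic out explicitly, using only the figures
    quoted above (160K connections, ~10MB of state, ~$1 of memory, $100
    drives):
    
        # Cost-of-state arithmetic from the figures in this thread
        connections   = 160_000
        state_bytes   = 10 * 2**20            # ~10MB of TCP state
        per_conn      = state_bytes / connections   # ~65 bytes/connection
        state_dollars = 1                     # ~$1 of memory today
        box_dollars   = 10_000 * 100          # 10,000 drives at $100 each
        print(per_conn, state_dollars / box_dollars)
        # ~65 bytes per connection; the state memory is about one
        # millionth of the cost of the box it sits in.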
    
    > Conclusion: one (or two) TCP connections per LU is both too many (resulting
    > in too much memory devoted to state records) and too few (insufficient
    > bandwidth for high-speed IO to controller cache).  Decoupling the number of
    > TCP connections from the number of LUs is the necessary result.
    
    I don't buy the conclusion: the amount of memory devoted to state
    records is relatively small, and it is essentially constant
    regardless of whether the mux-demux is done at the TCP layer or the
    session layer.  Also, the driver for interconnect technology is
    memory to memory copying, so advances in storage technology are not
    likely to outgrow the link layer.
    
    
    	-David
    	
    
    

