
    RE: Status summary on multiple connections



    
    ------------- Begin Forwarded Message -------------
    
    From: Robert Snively <rsnively@Brocade.COM>
    To: "'David Robinson'" <David.Robinson@EBay.Sun.COM>, Robert Snively 
    <rsnively@Brocade.COM>
    Subject: RE: Status summary on multiple connections
    Date: Thu, 28 Sep 2000 09:36:50 -0700
    
    Dave,
    
    You ask some interesting questions with non-short answers.  If you would
    like, and if you think it useful, you may post this out to the ips reflector.
    
    >  > The single connection alternative allows a simplistic ordering
    >  > structure, a simple recovery mechanism, and does not require
    >  > state sharing among multiple NICs.  It allows bandwidth aggregation
    >  > across any set of boundaries that is required.  Because command
    >  > queuing is the rule among high performance SCSI environments,
    >  > latency appears only as an increment in host buffer requirements
    >  > except during writes that perform a commit function.  
    >  > Those traditionally
    >  > have been taken out of the performance path by using local
    >  > non-volatile RAM to perform the commit functions, using slower
    >  > high latency writes with less strict ordering requirements relative
    >  > to reads to actually perform the write to media.
    >  
    >  Can you clarify something for me?  In my previous questions on
    >  flow control it was strongly indicated that the target must drain
    >  the stream in order to allow commands to flow when the command queue
    >  filled up.  You seem to indicate here that command queuing and the
    >  flow control needed to handle overflow is being done at the
    >  SCSI layer and not the transport.  Is it correct that it really
    >  is a command level function and not a transport function? Without
    >  considering the TCP window management, SCSI will cause 
    >  command flow to
    >  stop when the queue fills up?  If this is true then most of
    >  the arguments for at least two connections are not relevant.  You
    >  may still need to discard commands if the target over-advertises its
    >  total queue space, but that seems to be more of an implementation
    >  bug.
    
    SCSI manages two independent sets of resources.  One is the resources
    required to receive and process command states.  The other is the
    resources required to buffer and process data to be transferred
    as a result of processing the commands.
    
    At the initiator, all resources for the execution of a command,
    including both the command state resources and the explicitly specified
    buffer area, are defined at the time the command is delivered to
    the SCSI stack.  Those resources are locked down until the SCSI
    command is finished, at which time the command state resources (by this
    time a response packet) and control of the buffer are passed back
    to the application client (user, driver, operating system, file system,
    application program or whatever).
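
    As a rough illustration of that lifecycle, here is a minimal Python
    sketch (the class and method names are hypothetical, not any particular
    SCSI stack's API):

        import dataclasses

        @dataclasses.dataclass
        class CommandContext:
            """Initiator-side state pinned for the life of one command."""
            tag: int              # queue tag, part of the ITLQ nexus
            cdb: bytes            # the command descriptor block
            buffer: bytearray     # data buffer, locked until completion

        class InitiatorStack:
            def __init__(self):
                self.outstanding = {}    # tag -> CommandContext

            def submit(self, tag, cdb, buffer):
                # Command state and buffer are committed here and stay
                # reserved until the SCSI response comes back.
                self.outstanding[tag] = CommandContext(tag, cdb, buffer)

            def complete(self, tag, response):
                # On completion, the response and control of the buffer
                # are handed back to the application client.
                ctx = self.outstanding.pop(tag)
                return response, ctx.buffer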
    
    The beauty of SCSI is that all the transfer management is done by
    the target (which knows exactly what is going on and exactly what is
    needed), not by the initiator.
    
    The target also has two sets of resources.  The command is received
    into a command buffer (perhaps implemented as a large single buffer or
    perhaps implemented as a large number of smaller buffers at each logical
    unit).  I explained the rules on command queueing before, but
    basically all commands are posted into the same buffer, from whatever
    initiator they were received, with some kind of time/order stamp.
    A well-behaved device is always capable of receiving at least one
    simple or ordered queued command and one head of queue command for
    each logical unit/initiator nexus that is supported by the device.
    Present devices support from 16 to 64 initiators per logical unit.
    Typically at least one additional slot is available for task management
    functions.
    The remaining locations for commands in the queue are dynamically
    portioned out to whatever commands come in, regardless of initiator
    or logical unit.  When there is no more dynamic space left and all the
    pre-allocated locations for a particular ITL nexus are also full, 
    the next command gets a queue full indication returned.  Because of
    the dynamic assignment area, this will typically be rare in a properly
    configured system.  The initiator then resends the command and all
    subsequent commands after at least one command comes back completed,
    indicating that at least one (and probably a whole stack more) slots
    are again available.  Note that there is a possibility that commands
    that are in flight and have ordering constraints may be accepted out of
    order, a question that has caused lots of agonizing but is apparently
    reasonably well managed by most file systems today through the selective
    use of ordered commands only at the blocking boundaries of a particular
    logical stream of commands.
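
    A toy model of that slot accounting might look something like the
    following Python sketch (the reserved/dynamic split and the names are
    illustrative assumptions, not a description of any specific device):

        class TargetQueue:
            """Toy model of target command-slot accounting."""

            def __init__(self, dynamic_slots, reserved_per_nexus=1):
                self.dynamic_free = dynamic_slots     # shared pool
                self.reserved_per_nexus = reserved_per_nexus
                self.reserved_in_use = {}             # ITL nexus -> slots in use

            def enqueue(self, itl_nexus):
                used = self.reserved_in_use.get(itl_nexus, 0)
                if used < self.reserved_per_nexus:
                    # Guaranteed slot for this initiator/logical-unit nexus.
                    self.reserved_in_use[itl_nexus] = used + 1
                    return "ACCEPTED", "reserved"
                if self.dynamic_free > 0:
                    # Shared pool, handed out regardless of initiator or LU.
                    self.dynamic_free -= 1
                    return "ACCEPTED", "dynamic"
                # Per-nexus slots and shared pool both exhausted.
                return "QUEUE FULL", None

            def complete(self, itl_nexus, pool):
                # A completion frees a slot; the initiator that saw QUEUE FULL
                # can now resend that command and any subsequent ones.
                if pool == "dynamic":
                    self.dynamic_free += 1
                else:
                    self.reserved_in_use[itl_nexus] -= 1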
    
    The target then begins sorting commands for optimum execution order,
    to exploit pre-buffered data and to coalesce streaming operations to
    the device, and begins to execute the commands in ITS desired order,
    modified by the ordered queueing restrictions, if any.  If data is
    required from the initiator, buffers
    are set aside for the data in the target and the data (already locked
    down in the initiator buffers) is requested from the initiator.
    If data is to be sent to the initiator, it is assembled in the target
    buffers and shipped off to the specified buffers in the initiator.
    In large storage subsystems, this is typically going on for multiple
    initiators and in both directions at the same time.
    The initiator buffers are identified by the command context in the
    initiator.  The command context is selected by the 
    Initiator/Target/Logical unit/Queue Tag (ITLQ) nexus carried from
    the target with the data or the data request.
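
    In other words, the ITLQ nexus acts as the lookup key for the initiator
    buffer context.  A minimal sketch of that addressing (the names are
    illustrative only):

        from typing import Dict, Tuple

        # ITLQ nexus: (initiator, target, logical unit, queue tag)
        Nexus = Tuple[str, str, int, int]

        class InitiatorBufferTable:
            def __init__(self):
                self.buffers: Dict[Nexus, bytearray] = {}

            def register(self, nexus, buffer):
                # The buffer was locked down when the command was issued.
                self.buffers[nexus] = buffer

            def on_data_in(self, nexus, offset, data):
                # Data arriving from the target lands directly in the buffer
                # selected by the ITLQ nexus carried with the data.
                buf = self.buffers[nexus]
                buf[offset:offset + len(data)] = data

            def on_data_request(self, nexus, offset, length):
                # A data-out request from the target reads from the same
                # pre-locked buffer, again selected by the nexus.
                return bytes(self.buffers[nexus][offset:offset + length])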
    
    As a result, SCSI, independent of transport (IEEE 1394, Parallel SCSI,
    FC, and I hope iSCSI) has complete flow control at the initiator and
    at the target with respect to all command and data transfers.
    
    SCSI, being a storage protocol and having bursty traffic characteristics,
    is traditionally configured such that over-subscription is of
    short duration and has little effect on average latencies except for
    very short periods.  Of course, underconfigured SCSI transports
    will increase latency, but they will still be well-behaved in terms
    of throughput and they will not block.  Depending on the particular 
    transport implementation, they may be more or less well-behaved in 
    terms of IT nexus fairness.  As an example, older parallel SCSI
    implementations may exhibit higher throughput on high priority IT 
    nexi than on low priority IT nexi.
    
    However, if additional flow controls or congestion management
    exist in the transport layer, they can interact in some pathological
    ways with the basic SCSI function.  I believe it is possible that
    such structures could create head of queue blocking or throttling
    behaviors in the transport switches if those mechanisms are not
    implemented properly.  Note that this is 100% outside the scope of
    the SCSI behaviors.
    
    I believe that the conclusions in your note are well founded.
    
     
    
    ------------- End Forwarded Message -------------
    
    
    

