
    Re: iSCSI: Flow Control



    At 07:18 PM 10/14/00 -0700, Matt Wakeley wrote:
    >Somesh,
    >
    >I still don't understand what you are trying to solve.
    >
    >With the iSCSI session wide command credit method, there is a portion of the
    >iSCSI layer that sits right below the SCSI layer.  It receives the commands
    >from the SCSI layer and passes the results of each I/O from each NIC back to
    >the SCSI layer. The MaxCmdRn indicates how many commands the target (as a
    >whole) can "buffer". The iSCSI layer will "scatter" the commands to the NICs
    >until it has used up the MaxCmdRn buffers. Each NIC, once iSCSI has posted a
    
    The added advantage is that the policy for this "scatter" can be determined 
    outside of iSCSI and can adapt to changing conditions or to the attributes 
    of the individual NICs / paths.  iSCSI then performs the actual scatter 
    while needing to understand only the scatter algorithm itself.
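
    To make the mechanics concrete, here is a minimal sketch, in C, of how a 
    session-wide credit check might sit above a pluggable scatter policy.  All 
    structure and function names (iscsi_session, pick_conn, post_scsi_command) 
    are invented for illustration and are not taken from any implementation.

    /* Hypothetical sketch: session-wide command credit above a pluggable
     * per-connection scatter policy.  Names and sizes are invented. */
    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_CONNS 4

    struct iscsi_conn {
        int  id;
        bool tcp_window_open;    /* can this connection accept data right now? */
    };

    struct iscsi_session {
        unsigned int cmd_rn;     /* next command reference number to assign   */
        unsigned int max_cmd_rn; /* highest CmdRn the target can still buffer */
        unsigned int next_conn;  /* round-robin cursor for the scatter policy */
        struct iscsi_conn conn[MAX_CONNS];
    };

    /* Scatter policy: pick any connection able to make forward progress.  The
     * policy sits outside the credit check, so it can be replaced by one that
     * weighs per-path attributes without touching the iSCSI core. */
    static struct iscsi_conn *pick_conn(struct iscsi_session *s)
    {
        for (int i = 0; i < MAX_CONNS; i++) {
            struct iscsi_conn *c = &s->conn[(s->next_conn + i) % MAX_CONNS];
            if (c->tcp_window_open) {
                s->next_conn = (c->id + 1) % MAX_CONNS;
                return c;
            }
        }
        return NULL;
    }

    /* Session-wide credit: a command is posted to a NIC only while the next
     * CmdRn does not pass MaxCmdRn, i.e. the target still has buffer space. */
    static bool post_scsi_command(struct iscsi_session *s, const char *cdb_desc)
    {
        if (s->cmd_rn > s->max_cmd_rn) {
            printf("no credit: CmdRn %u > MaxCmdRn %u, holding \"%s\"\n",
                   s->cmd_rn, s->max_cmd_rn, cdb_desc);
            return false;
        }
        struct iscsi_conn *c = pick_conn(s);
        if (!c) {
            printf("no open TCP window on any connection, holding \"%s\"\n",
                   cdb_desc);
            return false;
        }
        printf("CmdRn %u -> conn %d: %s\n", s->cmd_rn, c->id, cdb_desc);
        s->cmd_rn++;   /* credit consumed; MaxCmdRn advances when the target
                        * acknowledges the command and frees its buffer */
        return true;
    }

    int main(void)
    {
        struct iscsi_session s = {
            .cmd_rn = 1, .max_cmd_rn = 3,
            .conn = { {0, true}, {1, true}, {2, false}, {3, true} },
        };
        post_scsi_command(&s, "READ(10)");
        post_scsi_command(&s, "WRITE(10)");
        post_scsi_command(&s, "READ(16)");
        post_scsi_command(&s, "INQUIRY");   /* exceeds the credit window */
        return 0;
    }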
    
    >Each Target NIC will have a pool of buffers to receive asynchronous (non DATA)
    >iSCSI messages.  As each (small) command message is received, it is placed
    >into one of these buffers, processed by common iSCSI and the CDB is passed to
    >the SCSI layer which stores it into its command buffer. The message buffer is
    >then given back to the NIC for further messages.
    
    The buffer management for this is implementation-specific, but what is 
    described here is one viable approach.
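
    For illustration only, one such alternative could look like the free-list 
    sketch below; the names and fixed sizes are assumptions, not anything 
    prescribed by iSCSI.

    /* Per-NIC pool of fixed-size receive buffers: a buffer is loaned to the
     * NIC, filled with an incoming iSCSI message, handed to common iSCSI
     * processing, and then returned.  Purely illustrative names. */
    #include <stdio.h>

    #define POOL_SIZE 8
    #define MSG_BYTES 512

    struct msg_buf {
        struct msg_buf *next;
        unsigned char data[MSG_BYTES];
    };

    struct msg_pool {
        struct msg_buf *free_list;
        struct msg_buf bufs[POOL_SIZE];
    };

    static void pool_init(struct msg_pool *p)
    {
        p->free_list = NULL;
        for (int i = 0; i < POOL_SIZE; i++) {
            p->bufs[i].next = p->free_list;
            p->free_list = &p->bufs[i];
        }
    }

    /* NIC takes a buffer to receive the next (small) iSCSI message into. */
    static struct msg_buf *pool_get(struct msg_pool *p)
    {
        struct msg_buf *b = p->free_list;
        if (b)
            p->free_list = b->next;
        return b;
    }

    /* After common iSCSI has processed the message and passed the CDB up to
     * the SCSI layer, the buffer goes straight back to the NIC's pool. */
    static void pool_put(struct msg_pool *p, struct msg_buf *b)
    {
        b->next = p->free_list;
        p->free_list = b;
    }

    int main(void)
    {
        struct msg_pool pool;
        pool_init(&pool);

        struct msg_buf *b = pool_get(&pool);   /* message arrives on the NIC */
        printf("received message into buffer %p\n", (void *)b);
        /* ... parse the PDU, hand the CDB to the SCSI layer ... */
        pool_put(&pool, b);                    /* buffer recycled immediately */
        return 0;
    }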
    
    >In your description, the initiator still "scatters" the commands to the NICs,
    >then the NICs have the burden of trying to figure out if they can send the
    >command or not.  Furthermore, if some NICs have open TCP windows, but don't
    >have command credit, the command can't be sent.
    >
    >In the iSCSI session wide credit model, the initiator will not post commands
    >to any NIC if it doesn't have credit.  Any commands posted to a NIC will be
    >sent as long as its TCP window is open.
    
    And it can dynamically adjust its scatter algorithm to bypass a connection 
    that is unable to make forward progress, without adding any complexity.  It 
    can also use this bypass as a tracking mechanism for potential problems 
    within the connection itself, e.g. being bypassed N times indicates the 
    connection may be hung; probe to determine whether that is true and 
    initiate recovery as required.
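
    A rough sketch of that bypass-count idea, with an arbitrary threshold and 
    invented names (the probe itself, e.g. a NOP-Out ping, is assumed):

    /* Each time the scatter policy skips a connection that cannot make
     * forward progress, bump a counter; past a threshold, flag the
     * connection for probing and possible recovery. */
    #include <stdbool.h>
    #include <stdio.h>

    #define BYPASS_LIMIT 5   /* the "N times" from above; value is arbitrary */

    struct conn_health {
        int  id;
        int  bypass_count;
        bool suspected_hung;
    };

    /* Called by the scatter policy whenever it has to skip this connection. */
    static void note_bypass(struct conn_health *c)
    {
        if (++c->bypass_count >= BYPASS_LIMIT && !c->suspected_hung) {
            c->suspected_hung = true;
            printf("conn %d bypassed %d times: probe it and initiate "
                   "recovery if it does not respond\n", c->id, c->bypass_count);
        }
    }

    /* Called when the connection makes forward progress again. */
    static void note_progress(struct conn_health *c)
    {
        c->bypass_count = 0;
        c->suspected_hung = false;
    }

    int main(void)
    {
        struct conn_health c = { .id = 2 };
        for (int i = 0; i < 6; i++)
            note_bypass(&c);
        note_progress(&c);   /* a later send succeeded; clear the suspicion */
        return 0;
    }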
    
    >Having a session wide MaxCmdRn allows the initiator to stop sending SCSI
    >commands, while still enabling non command messages to be sent.  They are
    >received by each NIC and passed to iSCSI for processing, but since they are
    >not passed up to SCSI, nothing is overflowed.
    
    Correct.
    
    > > 2. Have the NICs grab them from a pool through an atomic bus
    > > transaction. That has got to be tougher to implement than it
    > > looks, and the bus performance issues due to the need to maintain
    > > ordering etc?
    >
    >As indicated above, each NIC passes the iSCSI messages to a central iSCSI
    >message processor that sends the appropriate SCSI messages to SCSI.
    
    The only "red" flag is the potential scalability issue since this is done 
    through a "central" entity.  For a large SMP, central translates into poor 
    scalability.  One really would prefer to have this distributed among a set 
    of processors who operate in parallel with minimal critical regions to 
    contend.   Problems like this get worse when the ratio of processors to 
    NICs gets too large.  As we move towards 10 GbE, this ratio is likely to be 
    fairly large perhaps as high as 4:1 which with a sufficiently high IOP rate 
    can create contention and inefficiency within the endnode.
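
    As an illustration of the distributed alternative (not a claim about any 
    existing stack), incoming messages could be steered to per-processor 
    workers by connection, preserving per-connection ordering without a 
    central serializing entity:

    /* Steer each iSCSI message to one of several per-processor workers,
     * hashed by connection id; only the small per-worker queue is shared. */
    #include <stdio.h>

    #define NUM_WORKERS 4    /* e.g. one worker per processor in the SMP */

    struct iscsi_msg {
        int conn_id;         /* connection the message arrived on */
        const char *desc;
    };

    /* Same connection always lands on the same worker, so ordering within a
     * connection is preserved while connections are processed in parallel. */
    static int pick_worker(const struct iscsi_msg *m)
    {
        return m->conn_id % NUM_WORKERS;
    }

    int main(void)
    {
        struct iscsi_msg msgs[] = {
            { 0, "SCSI command" },
            { 1, "SCSI response" },
            { 5, "NOP-In" },
            { 0, "SCSI command" },
        };

        for (size_t i = 0; i < sizeof(msgs) / sizeof(msgs[0]); i++)
            printf("conn %d -> worker %d: %s\n",
                   msgs[i].conn_id, pick_worker(&msgs[i]), msgs[i].desc);
        return 0;
    }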
    
    Mike
    
    

