
    Re: Status summary on multiple connections



    I think the layering is being blurred.  Remember, whether there is RDMA or
    not, TCP is the layer doing the ACKing, *NOT* iSCSI.  And with or without
    RDMA, TCP cannot ACK TCP segments that arrive out of order until the
    missing ones arrive (it can SACK them, but it can't ACK them - and SACK is
    only informative).
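
To make the ACK/SACK distinction concrete, here is a toy sketch in Python. It is not taken from any TCP stack: the `Receiver` class and `on_segment` method are made-up names, and sequence numbers are assumed to count whole segments rather than bytes. It only shows that the cumulative ACK stays put while a gap exists, even though later segments are held and could be reported via SACK.

```python
# Toy model of a TCP receiver's cumulative ACK. Illustration only.
# Assumption: sequence numbers count whole segments, not bytes.

class Receiver:
    def __init__(self):
        self.next_expected = 0      # cumulative ACK point
        self.out_of_order = set()   # segments held past a gap (what SACK would report)

    def on_segment(self, seq):
        """Accept one segment; return (ack, sack_list)."""
        if seq == self.next_expected:
            self.next_expected += 1
            # A filled gap may release segments that were waiting behind it.
            while self.next_expected in self.out_of_order:
                self.out_of_order.remove(self.next_expected)
                self.next_expected += 1
        elif seq > self.next_expected:
            self.out_of_order.add(seq)
        # seq < next_expected is a duplicate; nothing changes.
        return self.next_expected, sorted(self.out_of_order)


rx = Receiver()
# Segment 1 is delayed; segments 2 and 3 arrive ahead of it.
trace = [rx.on_segment(s) for s in (0, 2, 3, 1)]
```

In this run the ACK value stays at 1 while segments 2 and 3 sit in the out-of-order set, and jumps to 4 only once segment 1 arrives.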
    
    iSCSI cannot deliver commands out of order to the SCSI layer.  So even if the
    command is RDMA'd into a buffer, it cannot be delivered to SCSI until the
    previous commands are received.
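
A rough sketch of that in-order rule, under stated assumptions: each command is tagged with a hypothetical integer `cmd_sn` (the thread has not settled on any iSCSI-level sequence space), and `InOrderDelivery` is an invented name. A command may already be sitting in its buffer, but it is released to SCSI only when every earlier number has been seen.

```python
# Toy in-order delivery queue. Illustration only.
# Assumption: each command carries a hypothetical integer `cmd_sn`.

class InOrderDelivery:
    def __init__(self):
        self.next_sn = 0     # next command number SCSI may see
        self.pending = {}    # cmd_sn -> command already placed in a buffer

    def on_command(self, cmd_sn, command):
        """Buffer a command; return the list now deliverable to SCSI."""
        if cmd_sn >= self.next_sn:
            self.pending[cmd_sn] = command
        deliverable = []
        while self.next_sn in self.pending:
            deliverable.append(self.pending.pop(self.next_sn))
            self.next_sn += 1
        return deliverable


q = InOrderDelivery()
# Commands 1 and 2 land in their buffers before command 0 shows up.
delivered = [q.on_command(sn, f"cmd{sn}") for sn in (1, 2, 0)]
```

Nothing is handed over for the first two arrivals; all three commands are released together, in order, when command 0 arrives.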
    
    -Matt
    
    julian_satran@il.ibm.com wrote:
    
    > David,
    >
    > The picture I had in mind is somewhat simpler.
    >
    > I assume that on any given transport connection, with or without RDMA,
    > iSCSI will get from TCP and hand to SCSI things that are in order (just to
    > keep layering iSCSI over TCP simple and clean). No retransmissions are
    > involved in iSCSI here.
    >
    > In the case of several connections, iSCSI might get things out of order
    > (no ack for them) and may decide whether or not to hand them to execution.
    > iSCSI will not require retransmission, as we assume that the out-of-order
    > arrival is a temporary artifact of the underlying network. Recovery, if
    > needed here, is done only when a connection is blown away, and then all
    > outstanding commands that were shipped on that connection are restarted
    > (resent).
    >
    > And again, we conceived the window only to limit the number of commands
    > iSCSI has to keep to get things in order, but it can as well serve as a
    > flow-control mechanism for the target, as the initiator is unconcerned
    > with what is being done with the commands within the window (e.g., an R2T
    > from a command within the window, i.e., not acked yet, is a legal event).
    >
    > Julo
    >
    > David Robinson <David.Robinson@EBay.Sun.COM> on 30/09/2000 01:35:11
    >
    > Please respond to David Robinson <David.Robinson@EBay.Sun.COM>
    >
    > To:   ips@ece.cmu.edu
    > cc:    (bcc: Julian Satran/Haifa/IBM)
    > Subject:  Re: Status summary on multiple connections
    >
    > There are two types of out of order processing we need to
    > consider:
    >      1) Commands that have been received and ACK'd by the transport
    >      2) Commands that have been received and not ACK'd
    >
    > In the first case this is purely a SCSI issue of whether the commands
    > are ordered and how the target processes them. As far as the
    > transport is concerned the packets are "delivered".  In the second
    > case the transport is still required to track the possible
    > retransmissions and process the transport ACK for all of
    > the data when the missing segment arrives.  It will overly
    > complicate the iSCSI layer to successfully track which parts
    > of the sequence space it has processed, defend against
    > retransmissions, and handle packets that have more than one
    > command, and retransmissions that may be only part of a command.
    >
    > This is all doable in an implementation (a bit messy at times)
    > but to support this as an explicit feature we need to start
    > adding in an iSCSI layer sequence space.
    >
    > I propose that we leave the processing of transport
    > level un-ACK'd commands and data as an implementation detail
    > and not make it an explicit iSCSI feature. Dropped or reordered
    > TCP segments are rare enough in high performance environments
    > that this is really a non-issue.
    >
    >      -David
    >
    > julian_satran@il.ibm.com wrote:
    > >
    > > David,
    > >
    > > I think that RDMA out-of-order processing is important only in order not
    > > to have data piling up in adapters. It does not mean that data gets
    > > really "committed".
    > > The commands can be executed in or out of order - this is a pure SCSI
    > > story.
    > > But if they have to be executed in order, the ordering provided by SCSI
    > > will enable it.
    > > And as you and others have pointed out, the same mechanism is good for
    > > both ordering and flow control.
    > > As far as I understand it, windowing - although more expensive - is a
    > > better technique than credits over a wide variety of latencies.
    > >
    > > And BTW we even considered credits for a "prefetching mechanism" for
    > > chained commands but were told that those are very much "out of fashion"
    > > (i.e., not worth speeding up on elephants).
    > >
    > > Julo
    > >
    > > David Robinson <David.Robinson@EBay.Sun.COM> on 29/09/2000 21:25:52
    > >
    > > Please respond to David Robinson <David.Robinson@EBay.Sun.COM>
    > >
    > > To:   ips@ece.cmu.edu
    > > cc:    (bcc: Julian Satran/Haifa/IBM)
    > > Subject:  RE: Status summary on multiple connections
    > >
    > > > I am left with the following impression as to what was indicated here:
    > > > - In general, command ordering is not relevant
    > > > - If the initiator filesystem detects an ordering dependency, it will
    > > >   wait until outstanding commands are complete before issuing the
    > > >   dependent command.
    > > >
    > > > This may be a reasonable means of operation for the disk world. It is
    > > > woefully inadequate for the tape world, as follows:
    > > > - In general, command ordering is crucial - out of order command
    > > >   processing will lead to data corruption.
    > > > - This would require the initiator backup application to block on
    > > >   completion of every single write command of a backup operation
    > > >   before issuing the next command.
    > > >
    > > > If this blocking were performed, both the throughput and capacity of a
    > > > tape device/media would be negatively impacted by an order of magnitude
    > > > or more. This would occur even assuming an instantaneous transport.
    > >
    > > I am hearing different stories on the issue of ordering.  One side
    > > is pushing hard for techniques that will allow out of order
    > > execution using various RDMA techniques. This clearly states that for
    > > a certain class of devices (e.g. tapes) ordering is crucial.
    > > I thought this problem was already solved at the SCSI layer
    > > through the use of ordered commands which in general are not used
    > > for disks but always used for tapes?  Since FC will reorder this
    > > has to be a solved problem. Would not an initiator talking to
    > > a tape target simply set the ordering flag?
    > >
    > > Lastly, for a TCP based connection ordering can easily be made a
    > > non-issue, simply don't try to process segments out of order.  I
    > > will defer to a transport expert, but I believe processing
    > > TCP segments by an application out of order might cause problems.
    > > In particular since the out of order segment is not ACKed until
    > > after the missing segments arrive, they can be retransmitted
    > > multiple times.  SACK helps this but does not guarantee that
    > > segments will not be retransmitted. So to process out of order
    > > segments the application must maintain a list of which segments
    > > have been processed as well, yuck!
    > >
    > >      -David
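
David's last point in the quoted thread above, that an application processing segments ahead of the ACK has to remember what it already processed, can be sketched as follows. This is a toy illustration, not anyone's proposed design: `EagerProcessor` is an invented name, and segments are assumed to be identified by a whole-segment sequence number rather than a byte range.

```python
# Illustration only: processing segments as they arrive, ahead of the ACK.
# Assumption: segments are identified by a whole-segment sequence number.

class EagerProcessor:
    def __init__(self):
        self.processed = set()   # the "list of processed segments"
        self.log = []            # order in which payloads were handled

    def on_segment(self, seq, payload):
        """Handle a segment once; return False for a retransmitted copy."""
        if seq in self.processed:
            return False
        self.processed.add(seq)
        self.log.append(payload)
        return True


p = EagerProcessor()
# Segment 1 is lost at first; segment 2 arrives early and is retransmitted
# because the sender has not yet seen an ACK covering it.
results = [p.on_segment(s, f"data{s}") for s in (0, 2, 2, 1, 2)]
```

Even this simplified version needs per-segment bookkeeping, and it ignores the harder cases raised earlier in the thread: segments carrying more than one command, and retransmissions that cover only part of a command.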
    
    
    
    



Last updated: Tue Sep 04 01:06:55 2001
6315 messages in chronological order