
    RE: Framing Discussion



    At 02:26 PM 12/20/2000 -0800, Douglas Otis wrote:
    >Michael,
    >
    > > At 10:55 AM 12/20/2000 -0800, Douglas Otis wrote:
    > > >JP,
    > > >
    > > >The VM page flipping will require blocks to be greater than or equal to
    > > >pages in size and that these blocks are aligned at page boundaries.  Neither
    > > >of these assumptions are true. The goal is clear, the NIC will examine the
    > >
    > > It is more accurate to state that blocks are not always aligned since
    > > clearly one can make them aligned in a given implementation and quite often
    > > they are.
    >
    >Aligned to what?  With iSCSI there is a PDU header that must be stripped.  I
    >assume you expect this header to vanish?
    
    Headers often "vanish" via header / data split - been there, done that many 
    years ago.  Many others do it today for a variety of workloads operating over 
    an IP-based protocol suite including TCP (e.g. NFS).
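
    A rough sketch of what header / data split means in practice.  The 8-byte
    header used here (4-byte payload length plus 4-byte command tag) is a
    hypothetical toy layout chosen for illustration, not the iSCSI PDU format,
    and split_pdus is an illustrative name only.  Headers are parsed into one
    list and payloads are handed back as separate slices, so the payload never
    travels with its header:

```python
import struct

# Hypothetical toy header: 4-byte payload length + 4-byte command tag,
# big-endian. This is NOT the real iSCSI PDU layout.
HDR = struct.Struct(">II")


def split_pdus(stream: bytes):
    """Split a byte stream into parsed headers and separate payload slices.

    Returns (headers, payloads, consumed), where consumed is the number of
    bytes covered by complete PDUs.
    """
    headers, payloads = [], []
    view = memoryview(stream)
    pos = 0
    while pos + HDR.size <= len(view):
        length, tag = HDR.unpack_from(view, pos)
        start = pos + HDR.size
        if start + length > len(view):
            break  # incomplete PDU: wait for more bytes
        headers.append((tag, length))
        # Slicing a memoryview references the bytes without copying them.
        payloads.append(view[start:start + length])
        pos = start + length
    return headers, payloads, pos
```

    An adapter doing this in hardware would steer the two parts to different
    host buffers by DMA; the slices above only stand in for that step.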
    
    
    > > >content and then direct data payload to the system through the NIC
    > > >interface.  The desire is to keep the buffers on the NIC small and thus
    > > >allow out of sequence processing of the TCP stream.  This must be seen as a
    > > >modification to the normal TCP implementation.  Such operation should
    > > >include a complete description of the API to allow consideration of NIC
    > > >design, inter-operability and security requirements.
    > >
    > > Need to separate out completion and ACK generation semantics from how and
    > > the order that data is DMA'ed into the target buffers.
    >
    >Yes, there is a problem in that this data is sent from the target in random
    >order with respect to command sequence.  The SCSI command tag must be used
    >to associate data payload with the correct buffer.  This is not part of the
    >current TCP implementation.
    
    Correct.  It is not part of a TCP implementation; it is part of iSCSI and 
    therefore independent.  Also, there are TCP implementations that do support 
    per-connection buffer management (recall Solaris doing this a number of 
    years ago with fastbuf support as an example), so that is not new to the 
    implementation space.  It is clearly outside the TCP protocol definition 
    and workgroup area of concern.
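
    To make the tag-to-buffer association concrete, here is a minimal sketch
    of per-connection buffer management.  The Connection class and its
    post_buffer / place methods are hypothetical names for illustration, not
    any real stack's API.  The initiator posts a buffer per command tag, and
    each arriving data payload is written straight into the buffer its tag
    names, whatever order the target responds in:

```python
class Connection:
    """Per-connection table of pre-posted buffers, keyed by command tag.

    Illustrative only: a real adapter would hold buffer descriptors
    (addresses and lengths) rather than Python bytearrays.
    """

    def __init__(self):
        self.buffers = {}

    def post_buffer(self, tag, size):
        """Register the destination buffer for a command before data arrives."""
        self.buffers[tag] = bytearray(size)
        return self.buffers[tag]

    def place(self, tag, offset, payload):
        """Write a payload directly into the buffer posted for its tag."""
        buf = self.buffers[tag]  # KeyError if no buffer was posted for this tag
        if offset + len(payload) > len(buf):
            raise ValueError("payload overruns the posted buffer")
        buf[offset:offset + len(payload)] = payload
```

    Nothing here touches TCP itself; the lookup is driven entirely by fields
    carried inside the byte stream.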
    
    
    >You are describing something to do with out of sequence segment delivery
    >which is different from the target responding out of sequence to the
    >commands.  You start with an invalid assumption.  If you are discussing Zero
    >Copy TCP, then this operation does not include the extraction of the data
    >from the encapsulation, nor does it suffer from out of sequence TCP
    >segments.  You miss the point and why I include the term Content Directed
    >Placement.
    
    A zero-copy TCP is a content directed placement of the data.  Content 
    directed placement implies header / data split, including the 
    iSCSI protocol headers.  I don't see what the problem is.
    
    
    > > Now, when valid out-of-order packets arrive TCP should initiate its error
    > > recovery algorithms as appropriate.  Implementations are also required to
    > > ensure that the buffer completion event is not initiated until all packets
    > > have arrived.  From the application perspective it is completely opaque as
    > > to the order the buffer is filled in and is only focused on the completion
    > > event to know when it may examine the buffer's contents.
    >
    >You are confused as to the effort required to implement a means to direct
    >the encapsulated data to the buffer allocated within the SCSI request.  We
    >are not discussing Zero Copy TCP.  We are discussing Content Directed
    >Placement.  If it were not for this desire, there would be no reason to
    >discover the PDU with missing segments.
    
    Still not an issue.  The PDU with the missing segment is recovered by TCP 
    and the iSCSI session does not complete any out-of-order PDUs until this 
    occurs. The buffers are therefore opaque to the application and the adapter 
    is kept thin and fast.
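
    A minimal sketch of that separation between placement order and
    completion, assuming a hypothetical ReceiveBuffer that is told the offset
    of each arriving segment.  Segments are placed as they arrive, duplicates
    from retransmission are tolerated, and the completion callback fires once,
    only when every byte is present:

```python
class ReceiveBuffer:
    """Buffer filled in any order; completion is signalled only when full.

    Illustrative only: per-byte tracking is used for clarity, where a real
    implementation would track ranges or sequence numbers.
    """

    def __init__(self, size, on_complete):
        self.data = bytearray(size)
        self.filled = bytearray(size)  # 1 for each byte already placed
        self.remaining = size
        self.on_complete = on_complete

    def segment_arrived(self, offset, payload):
        end = offset + len(payload)
        if end > len(self.data):
            raise ValueError("segment overruns the buffer")
        self.data[offset:end] = payload
        # Count only bytes not seen before, so retransmits are harmless.
        newly_filled = len(payload) - sum(self.filled[offset:end])
        self.filled[offset:end] = b"\x01" * len(payload)
        was_incomplete = self.remaining > 0
        self.remaining -= newly_filled
        if was_incomplete and self.remaining == 0:
            self.on_complete(bytes(self.data))
```

    The application only ever sees the callback, so the fill order stays
    opaque to it.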
    
    
    > > I see nothing here that requires any API modifications to TCP-based
    > > applications nor any issue with interoperability or security since all of
    > > this activity is performed within the local receiving endnode.  I see no
    > > problem with a host-based implementation communicating with a TOE-based
    > > implementation since there are no wire protocol modifications w.r.t. TCP -
    > > the iSCSI headers are within TCP's data payload byte stream and are
    > > therefore independent of the TCP implementation itself.  What am I missing
    > > since you have brought up these issues on multiple occasions?
    >
    >If there was just a simple desire to do Zero Copy TCP, then sequence numbers
    >within TCP provides information needed to allow out of sequence processing
    >of the simple TCP stream.  This is NOT the desire in this case.  They wish
    >to extract data within the TCP stream as David Black described as a
    >(look-ahead) packet filter to deliver the SCSI data to the correct buffer
    >without the need to copy this data out of the TCP stream following a TCP
    >Zero Copy.  Even if you provided a true TCP Zero Copy, for SCSI there would
    >be one more additional copy required.  It would be the extraction of the
    >encapsulated data from stream and placement into the allocated SCSI buffers.
    >It makes no sense to be able to process SCSI PDUs out of sequence if this
    >were not the case.  This requires the exchange of buffer descriptors with
    >the network adapter that is not part of TCP today.
    
    It is part of TCP in a variety of implementations.  All of this is 
    implementation specific and the techniques have been used for years.  The 
    fact that some implementations are not performance oriented is a separate 
    issue.  They can work by scanning and doing the copies if they desire, or 
    they can get smart in the way they interact with a semi-smart adapter (not 
    even a fully off-loaded TCP/IP implementation) and operate as many on this 
    reflector have advocated.  Again, I do not see any need to perform the 
    additional copies as contended, since placement can be performed with some 
    intelligence in the buffer management and DMA algorithms.  This is fairly 
    straightforward: it was done on other adapters supporting TCP back in 1992 
    on HP-UX, and there have been numerous research and other implementations 
    that support similar algorithms.  Why is iSCSI so difficult if others can 
    do this for protocols such as NFS?
    
    Mike
    
    


Last updated: Tue Sep 04 01:06:02 2001