
    RE: Framing Discussion



    Michael,
    
    > At 02:26 PM 12/20/2000 -0800, Douglas Otis wrote:
    > >Michael,
    > >
    > > > At 10:55 AM 12/20/2000 -0800, Douglas Otis wrote:
    > > > >JP,
    > > > >
    > > > >The VM page flipping will require blocks to be greater than or
    > > > >equal to pages in size and that these blocks are aligned at page
    > > > >boundaries.  Neither of these assumptions is true.  The goal is
    > > > >clear: the NIC will examine the
    > > >
    > > > It is more accurate to state that blocks are not always aligned since
    > > > clearly one can make them aligned in a given implementation and
    > > > quite often
    > > > they are.
    > >
    > >Aligned to what?  With iSCSI there is a PDU header that must be
    > >stripped.  I assume you expect this header to vanish?
    >
    > Headers often "vanish" via header / data split - been there, done that
    > many years ago.  Many others do it today for a variety of workloads
    > operating over an IP-based protocol suite including TCP (e.g. NFS).
    
    With respect to the removal of IP/TCP headers being part of the TCP
    specifications, I agree.  If the iSCSI PDU header is then included in this
    normal stripping process, the handling of that header is being done by the
    application.  You are suggesting an application-specific means be used to
    allow this type of alignment.  With that said, you still suffer the
    problem of the block size being less than the page size.  Unless the SCSI
    application is forced to allocate in pages and you have a means to force
    the alignment of these blocks as they are delivered by the network, this
    MMU technique is not available.
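
The two conditions for page flipping argued above (the payload starts on a page boundary, and the block is at least a page long) can be written down as a quick check.  This is only a minimal sketch: the 4096-byte page size, the 48-byte header length and the name `can_page_flip` are assumptions made for illustration, not anything proposed in this thread.

```python
PAGE_SIZE = 4096   # assumed page size; real systems vary
HEADER_LEN = 48    # hypothetical PDU header length, for illustration only


def can_page_flip(payload_offset, length, page_size=PAGE_SIZE):
    """Return True if a received block could be remapped instead of copied.

    payload_offset is the byte offset of the block within the receive
    buffer, where offset 0 is taken to lie on a page boundary.
    """
    starts_on_page = payload_offset % page_size == 0
    at_least_a_page = length >= page_size
    return starts_on_page and at_least_a_page


if __name__ == "__main__":
    print(can_page_flip(0, PAGE_SIZE))           # aligned, page-sized block
    print(can_page_flip(HEADER_LEN, PAGE_SIZE))  # header still in front of the data
    print(can_page_flip(0, 512))                 # block smaller than a page
```

Only the first call passes both conditions; a header left in front of the data or a 512-byte block each rule out the MMU technique.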
    
    > > > >content and then direct data payload to the system through the NIC
    > > > >interface.  The desire is to keep the buffers on the NIC small and
    > > > >thus allow out of sequence processing of the TCP stream.  This must
    > > > >be seen as a modification to the normal TCP implementation.  Such
    > > > >operation should include a complete description of the API to allow
    > > > >consideration of NIC design, inter-operability and security
    > > > >requirements.
    > > >
    > > > Need to separate out completion and ACK generation semantics from
    > > > how and the order that data is DMA'ed into the target buffers.
    > >
    > >Yes, there is a problem in that this data is sent from the target in
    > >random order with respect to command sequence.  The SCSI command tag
    > >must be used to associate data payload with the correct buffer.  This
    > >is not part of the current TCP implementation.
    >
    > Correct.  It is not part of a TCP implementation; it is part of iSCSI and
    > therefore independent.  Also, there are TCP implementations that do
    > support per connection buffer management (recall Solaris doing this a
    > number of years ago with fastbuf support as an example) so that is not
    > new to the implementation space and is clearly outside the TCP protocol
    > definition and workgroup area of concern.
    
    You are suggesting that you use a standard TCP, find the PDU, process the
    iSCSI headers, place the data according to the tag within the iSCSI header,
    and because this process is located on the network adapter, it does not
    impact TCP.  I think I understand the logic.  As you expect the PDU to be
    "often" segment aligned, framing is not an issue.
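
The placement step summarized above (find the PDU, read the tag from the iSCSI header, put the data in the buffer posted for that tag) might look roughly like the following.  It is a sketch under assumptions: the `PlacementTable` class, its method names and the idea of a per-PDU buffer offset are hypothetical and are not taken from any iSCSI draft.

```python
class PlacementTable:
    """Maps a SCSI command tag to the buffer posted for that command."""

    def __init__(self):
        self._buffers = {}

    def post(self, tag, size):
        """Register a destination buffer for an outstanding command."""
        self._buffers[tag] = bytearray(size)

    def place(self, tag, offset, data):
        """Copy a PDU's data into the buffer selected by its tag."""
        buf = self._buffers[tag]  # KeyError if nothing was posted for this tag
        if offset < 0 or offset + len(data) > len(buf):
            raise ValueError("data does not fit in the posted buffer")
        buf[offset:offset + len(data)] = data

    def contents(self, tag):
        return bytes(self._buffers[tag])


if __name__ == "__main__":
    table = PlacementTable()
    table.post(tag=7, size=8)
    table.post(tag=9, size=4)
    # Data arrives in an order unrelated to the order the commands were issued.
    table.place(tag=9, offset=0, data=b"WXYZ")
    table.place(tag=7, offset=4, data=b"EFGH")
    table.place(tag=7, offset=0, data=b"ABCD")
    print(table.contents(7), table.contents(9))
```

Note that the lookup key is content taken from the payload, not anything TCP itself carries, which is the distinction being drawn in this exchange.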
    
    > >You are describing something to do with out of sequence segment delivery
    > >which is different from the target responding out of sequence to the
    > >commands.  You start with an invalid assumption.  If you are discussing
    > >Zero Copy TCP, then this operation does not include the extraction of
    > >the data from the encapsulation, nor does it suffer from out of sequence
    > >TCP segments.  You miss the point and why I include the term Content
    > >Directed Placement.
    >
    > A zero-copy TCP is a content directed placement of the data.  Content
    > directed placement implies header / data split including the iSCSI
    > protocol headers.  Don't see what the problem is.
    
    Not at all.  The content of the payload does not impact the placement with
    Zero Copy TCP.  Here you expand the definition of this interface to now
    include iSCSI.  Rather than using TCP as the boundary between the adapter
    and the system, iSCSI is now such a boundary.
    
    > > > Now, when valid out-of-order packets arrive TCP should initiate its
    > > > error recovery algorithms as appropriate.  Implementations are also
    > > > required to ensure that the buffer completion event is not initiated
    > > > until all packets have arrived.  From the application perspective it
    > > > is completely opaque as to the order the buffer is filled in and is
    > > > only focused on the completion event to know when it may examine the
    > > > buffer's contents.
    > >
    > >You are confused as to the effort required to implement a means to direct
    > >the encapsulated data to the buffer allocated within the SCSI request.
    > >We are not discussing Zero Copy TCP.  We are discussing Content Directed
    > >Placement.  If it were not for this desire, there would be no reason to
    > >discover the PDU with missing segments.
    >
    > Still not an issue.  The PDU with the missing segment is recovered by TCP
    > and the iSCSI session does not complete any out-of-order PDUs until this
    > occurs.  The buffers are therefore opaque to the application and the
    > adapter is kept thin and fast.
    
    I did not suggest it cannot be done, only that to discuss this issue
    intelligently, we need to describe accurately how such an adapter interface
    will operate.  The interface is not the same as Zero Copy TCP.  It is to be
    iSCSI and not TCP.  Moving the iSCSI application to the adapter seems to
    dodge the bullet with respect to TCP conflicts.  I have David Black's
    assurance on this.
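
The completion rule described in the quoted text (data may land in any order, but the buffer is not reported complete until every byte has arrived) is one piece of how such an adapter interface would have to operate.  A minimal sketch follows, assuming a hypothetical `ReceiveBuffer` type that tracks arrival per byte; a real adapter would more likely track ranges.

```python
class ReceiveBuffer:
    """Destination buffer that reports completion only when fully filled."""

    def __init__(self, size):
        self.data = bytearray(size)
        self._filled = bytearray(size)  # 1 marks a byte that has arrived

    def write(self, offset, chunk):
        """Store a chunk at its final position, in whatever order it arrives."""
        end = offset + len(chunk)
        if offset < 0 or end > len(self.data):
            raise ValueError("chunk falls outside the buffer")
        self.data[offset:end] = chunk
        self._filled[offset:end] = b"\x01" * len(chunk)

    def complete(self):
        """True once no byte of the buffer is still missing."""
        return all(self._filled)


if __name__ == "__main__":
    buf = ReceiveBuffer(8)
    buf.write(4, b"EFGH")   # later segment arrives first
    print(buf.complete())   # still missing the first half
    buf.write(0, b"ABCD")   # missing segment is recovered
    print(buf.complete(), bytes(buf.data))
```

The application only ever looks at `complete()`, so the fill order stays opaque to it, as the quoted text argues.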
    
    > > > I see nothing here that requires any API modifications to TCP-based
    > > > applications nor any issue with interoperability or security since all
    > > > of this activity is performed within the local receiving endnode.  I
    > > > see no problem with a host-based implementation communicating with a
    > > > TOE-based implementation since there are no wire protocol
    > > > modifications w.r.t. TCP - the iSCSI headers are within TCP's data
    > > > payload byte stream and are therefore independent of the TCP
    > > > implementation itself.  What am I missing since you have brought up
    > > > these issues on multiple occasions?
    > >
    > >If there were just a simple desire to do Zero Copy TCP, then sequence
    > >numbers within TCP provide the information needed to allow out of
    > >sequence processing of the simple TCP stream.  This is NOT the desire in
    > >this case.  They wish to extract data within the TCP stream as David
    > >Black described as a (look-ahead) packet filter to deliver the SCSI data
    > >to the correct buffer without the need to copy this data out of the TCP
    > >stream following a TCP Zero Copy.  Even if you provided a true TCP Zero
    > >Copy, for SCSI there would be one additional copy required.  It would be
    > >the extraction of the encapsulated data from the stream and placement
    > >into the allocated SCSI buffers.  It makes no sense to be able to
    > >process SCSI PDUs out of sequence if this were not the case.  This
    > >requires the exchange of buffer descriptors with the network adapter
    > >that is not part of TCP today.
    >
    > It is part of TCP in a variety of implementations.  All of this is
    > implementation specific and the techniques have been used for years.  The
    > fact that some implementations are not performance oriented is a separate
    > issue.  They can work by scanning and doing the copies if they desire or
    > they can get smart in the way they interact with a semi-smart (not even
    > fully off-loaded TCP/IP implementation) adapter and operate as many on
    > this reflector have advocated.  Again, I do not see any need to perform
    > additional copies as contended as this can be performed with some
    > intelligence in the buffer management and DMA algorithms.  Fairly
    > straight-forward and was done on other adapters supporting TCP back in
    > 1992 on HP-UX and there have been numerous research and other
    > implementations that support similar algorithms.  Why is iSCSI so
    > difficult if others can do this for protocols such as NFS?
    >
    > Mike
    
    Fine, but you really should not call this part of TCP.  It is not.  You
    should call it an iSCSI adapter interface.  If you wish such an adapter to
    be application specific, so will its interface be.  Clearly SCSI dictates a
    different interface from that of TCP.  If you advocate the placement of the
    application on the adapter, then you are no longer discussing TCP.  You are
    discussing the back-end of a full or partial processing of iSCSI.  TCP is
    hidden within this application running on the adapter.  I had advocated
    using a pointer to an application routine to serve this function of placing
    the application on the adapter; should the value be less than 1024, it
    would indicate a pre-defined IANA designator for such an embedded
    application.  Perhaps the value of 1 could indicate a normal IP stack as the
    application as you bind the application to the port.  If this is the desire
    of the work group, as it seems to be, then declaring details of the
    application adapter interface is the next step.  What group makes adapters?
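
The binding value suggested above could be interpreted along these lines.  This is a loose sketch of the idea only: the function name, the returned labels and the treatment of negative values are assumptions, and no such designator registry is claimed to exist.

```python
DESIGNATOR_LIMIT = 1024   # values below this are treated as pre-defined designators
NORMAL_IP_STACK = 1       # tentative value for "no embedded application"


def classify_binding(value):
    """Interpret the value supplied when binding an application to a port."""
    if value < 0:
        raise ValueError("binding value must be non-negative")
    if value == NORMAL_IP_STACK:
        return "normal IP stack"
    if value < DESIGNATOR_LIMIT:
        return "pre-defined embedded application"
    return "pointer to application routine"


if __name__ == "__main__":
    for v in (1, 5, 0x7F001000):
        print(v, "->", classify_binding(v))
```

Anything at or above 1024 is treated as an address of an application routine, which is what would make the adapter interface application specific.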
    
    Doug
    
    
    



Last updated: Tue Sep 04 01:06:02 2001