
    Re: iSCSI: Flow Control



    Pierre, Julian, YP,
    OK, does anyone know what started all this?  (Not sure I want an answer.)
    
    Here is what I think I heard.
    
    Julian thinks the Draft works as is and that he has enough pacing
    machinery with normal TCP/IP and MaxCmdRN.  YP and Pierre have just gone
    into a lot of detail on how there is no blocking if it is done right, and
    they seem happy.  The way I translate that is: the Draft works well enough
    for YP and Pierre, or they could not have described what they did.  Julian
    should be happy; YP should be happy that some folks agree with his approach
    and that the Draft works as is; and Pierre should also be happy because
    some folks agree with him.
    
    OK, what I think this means is that Storage Controllers with enough
    memory to match the incoming data flow to the processing rate of the
    disks behind them should be "All Singing and All Dancing".  Storage
    Controllers that can receive more data than they have memory to absorb
    at the processing rate of the disks behind them will have a bit of a
    problem and will have to push back from time to time.  They may not
    perform as well at longer distances as they do at shorter distances.
    Anyway, this always seemed obvious to me.
    
    So, what I hear you all saying is that the Draft is fine the way it is (at
    least with regard to pacing, credits, etc.).  Is that right?
    
    So the key guy who is perhaps out of the boat is Matt, who thinks that we
    should have at least two conversations per Session run in an asymmetric
    manner.  I have always thought that his point was valid, at least for
    smaller-memory targets, especially if there were standard NICs on the
    sending side.  So I, for one, would like to hear from Matt.
    
    Let me ask YP and Pierre one other thing.  How do you think your proposed
    design would operate if the sender were using, as Pierre calls it, "regular
    networking" and the target were using your proposed design?  Is it still
    "All Singing and All Dancing"?
    
    
    .
    .
    .
    John L. Hufferd
    Senior Technical Staff Member (STSM)
    IBM/SSG San Jose Ca
    (408) 256-0403, Tie: 276-0403
    Internet address: hufferd@us.ibm.com
    
    
    Pierre Labat <pierre_labat@hp.com>@ece.cmu.edu on 10/10/2000 04:04:37 PM
    
    Sent by:  owner-ips@ece.cmu.edu
    
    
    To:   ips@ece.cmu.edu
    cc:
    Subject:  Re: iSCSI: Flow Control
    
    
    
    julian_satran@il.ibm.com wrote:
    
    > Pierre,
    >
    > You are wrong again. When the target reopens the window (i.e., reads some
    > data from the pipe at its end) you get to put your Read command, but it
    > goes after the rest of the window, and the window can be several
    > megabytes.
    
    Julian,
    
    The TCP window is not a buffer on the receive side.
    On the receive side (the target, in our case), and as long as TCP segments
    arrive in order, there is no opaque FIFO holding a full window's worth of
    commands and data waiting to be processed. You can avoid that.
    What the target does is receive bytes through the TCP connection, do the
    TCP work, and form an iSCSI PDU. The most you have to store is a few TCP
    segments to rebuild the PDU. As soon as the PDU is built, it is processed.
    When the target wants to close the TCP window, it updates the window
    accordingly and CONTINUES to process the incoming PDUs.
    At that point you assume that the incoming PDUs are put in an opaque FIFO,
    but rather than that, the target can process them and put the data at the
    right location in the target cache.
    Then, when the window is opened again and the read PDU comes, it is
    processed immediately.
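
To make the receive-side claim concrete, here is a minimal sketch of a target that holds only the bytes of the PDU currently being rebuilt and hands each PDU to a handler as soon as it is complete. The framing (a 4-byte big-endian length followed by the payload), the class name PduAssembler, and the helper frame() are assumptions made for illustration; this is not the iSCSI header layout from the draft.

```python
import struct


class PduAssembler:
    """Rebuilds length-prefixed PDUs from an in-order byte stream.

    Hypothetical framing: 4-byte big-endian length + payload.
    This is NOT the real iSCSI PDU header.
    """

    HEADER = struct.Struct(">I")

    def __init__(self, handler):
        self._buf = bytearray()   # holds at most one partial PDU
        self._handler = handler   # called once per completed PDU

    def feed(self, segment: bytes) -> None:
        """Accept the payload of one in-order TCP segment."""
        self._buf += segment
        while len(self._buf) >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._buf)
            end = self.HEADER.size + length
            if len(self._buf) < end:
                return            # PDU not complete yet, wait for more bytes
            payload = bytes(self._buf[self.HEADER.size:end])
            del self._buf[:end]
            self._handler(payload)  # process immediately, no PDU queue

    @property
    def buffered(self) -> int:
        """Bytes currently held while waiting for the rest of a PDU."""
        return len(self._buf)


def frame(payload: bytes) -> bytes:
    """Wrap a payload in the hypothetical length-prefixed framing."""
    return PduAssembler.HEADER.pack(len(payload)) + payload
```

Feeding the stream in arbitrary segment sizes never leaves more than one partial PDU in the buffer, which is the "few TCP segments" bound described above.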
    
    In fact, as Y P Cheng described in a previous mail in this thread, the
    model that can be used for iSCSI traffic is different from the common model
    we have for regular TCP/IP networking, although a TCP fully compliant with
    the RFCs can be used for iSCSI.
    In regular TCP/IP networking, the application (on the transmit side) fills
    a FIFO that the adapter empties. In our case, as explained by Y P Cheng,
    you replace the FIFO with an "exchange table", what I called a flat
    array. It allows you to avoid head-of-queue blocking at this level.
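
The difference between the two transmit models can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: one PDU is transmitted per step, the host posts a read while PDU number post_step is on the wire, and the adapter always sends an unsent command before more data. The function names, the slot layout, and the example PDU size and link speed are all hypothetical.

```python
from collections import deque


def fifo_adapter(total_data_pdus: int, post_step: int) -> list:
    """Host fills a FIFO; the adapter can only drain it head first."""
    queue = deque(["data"] * total_data_pdus)  # write backlog already queued
    wire = []
    step = 0
    while queue:
        wire.append(queue.popleft())
        if step == post_step:
            queue.append("READ")   # the read lands behind the whole backlog
        step += 1
    return wire


def table_adapter(total_data_pdus: int, post_step: int, slots: int = 4) -> list:
    """Host posts into a flat array; the adapter scans every slot each step."""
    table = [None] * slots
    table[0] = {"cmd": "WRITE", "cmd_sent": True, "data_left": total_data_pdus}
    wire = []
    step = 0
    while True:
        # An unsent command in any slot goes out before more data.
        entry = next((e for e in table if e and not e["cmd_sent"]), None)
        if entry is not None:
            entry["cmd_sent"] = True
            wire.append(entry["cmd"])
        else:
            entry = next((e for e in table if e and e["data_left"] > 0), None)
            if entry is None:
                break
            entry["data_left"] -= 1
            wire.append("data")
        if step == post_step:
            table[1] = {"cmd": "READ", "cmd_sent": False, "data_left": 0}
        step += 1
    return wire


def pdus_before_read(wire: list, post_step: int) -> int:
    """PDUs sent from the one in flight at posting time up to the read."""
    return wire.index("READ") - post_step


def one_pdu_latency(pdu_bytes: int, link_bits_per_second: float) -> float:
    """Worst-case wait in seconds: (size of PDU) / throughput."""
    return pdu_bytes * 8 / link_bits_per_second
```

With a backlog of 1000 data PDUs and the read posted during PDU 10, the FIFO model puts 990 PDUs ahead of the read, while the flat-array model puts only the one PDU already in flight ahead of it. For an assumed 8192-byte PDU on an assumed 1 Gbit/s link, one_pdu_latency(8192, 1e9) gives the corresponding bound in seconds.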
    
    On the receive side (the target in our case) in regular networking,
    the incoming data are tossed in a FIFO by TCP. The application
    empties this FIFO and can block (in which case the FIFO grows),
    and yes, when the application unblocks, it has a large number
    of PDUs to process.
    But in the model described, the application never blocks. Hence there is no
    big opaque receive FIFO on the target. In our case the application is the
    module that processes the iSCSI PDUs. The application never blocks because
    it is able to pace down the flow coming from the initiator with the TCP
    window and the command flow control (MaxCmdRN).
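
As a rough illustration of command pacing, the sketch below models a target that advertises the highest command number it is willing to accept and an initiator that holds back any command beyond that limit. The window arithmetic, the class and attribute names, and the direct read of target.max_cmd_rn (a real initiator would learn the value from responses sent by the target) are simplifying assumptions and not the draft's exact MaxCmdRN definition.

```python
class Target:
    """Target with a fixed number of command slots (hypothetical model)."""

    def __init__(self, queue_slots: int):
        self.queue_slots = queue_slots
        self.exp_cmd_rn = 1   # next command number the target expects
        self.queued = 0       # commands accepted but not yet completed

    @property
    def max_cmd_rn(self) -> int:
        # Highest command number the initiator may send right now.
        free_slots = self.queue_slots - self.queued
        return self.exp_cmd_rn + free_slots - 1

    def accept(self, cmd_rn: int) -> bool:
        """Accept an in-order command that fits in the window."""
        if cmd_rn != self.exp_cmd_rn or cmd_rn > self.max_cmd_rn:
            return False
        self.exp_cmd_rn += 1
        self.queued += 1
        return True

    def complete_one(self) -> None:
        """Finish one queued command, which reopens the window by one."""
        if self.queued > 0:
            self.queued -= 1


class Initiator:
    """Initiator that never sends past the advertised command window."""

    def __init__(self):
        self.next_cmd_rn = 1

    def try_send(self, target: Target) -> bool:
        if self.next_cmd_rn > target.max_cmd_rn:
            return False      # window closed: keep the command on the host
        if not target.accept(self.next_cmd_rn):
            return False
        self.next_cmd_rn += 1
        return True
```

With two slots, the third command is held at the initiator until the target completes one of the first two, so the target is never handed more commands than it has room to process.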
    
    Regards,
    
    Pierre
    
    
    
    >
    >
    > Pierre Labat <pierre_labat@hp.com> on 10/10/2000 19:50:48
    >
    > Please respond to Pierre Labat <pierre_labat@hp.com>
    >
    > To:   ips@ece.cmu.edu
    > cc:
    > Subject:  Re: iSCSI: Flow Control
    >
    > Julian_Satran@il.ibm.com wrote:
    >
    > > Pierre,
    > >
    > > The only point you are missing is that the TCP window may be closed
    > > when you want to send your Read command
    >
    > Julian,
    >
    > Yes, but as soon as the target reopens the window it receives the read
    > first.
    >
    > > and even if not it will reach the other end after all the data
    > > before it
    > > regardless of how clever your adapter is.
    >
    > The time needed to reach the other end of the wire (for the read, in our
    > case) is the same whether or not data was sent on the wire before it. On
    > the target, as soon as the read is sampled from the wire it can be
    > processed.
    >
    > Regards,
    >
    > Pierre
    >
    > > The FIFO you have in mind is
    > > certainly not
    > > equivalent to the pipe capacity.
    > >
    > > Julo
    > >
    > > Pierre Labat <pierre_labat@hp.com> on 10/10/2000 02:58:41
    > >
    > > Please respond to Pierre Labat <pierre_labat@hp.com>
    > >
    > > To:   ips@ece.cmu.edu
    > > cc:
    > > Subject:  Re: iSCSI: Flow Control
    > >
    > > Julian_Satran@il.ibm.com wrote:
    > >
    > > > Pierre,
    > > >
    > > > It does not matter how or from where you send the data on the wire.
    > > > If you have a long wire and you want to cover the latency you will
    > > > send data as soon as you can and then commands get stuck  behind.
    > >
    > > Julian,
    > >
    > > The command can NOT be stuck because there is "data on the wire".
    > > Let me give you an example.
    > > Let's talk again about the "pull model" adapter on the initiator.
    > > Imagine you have 100 Mbytes of (write) data outstanding
    > > because 1000 large write commands have been posted to
    > > the adapter.
    > > The adapter sends this data as fast as it can. But, very important,
    > > the data are not tossed into any kind of buffer on the adapter.
    > > What the adapter does is pull some kbytes of data from host memory,
    > > encapsulate it, and send it on the wire, again and again, as fast as it
    > > can.
    > >
    > > Now, imagine that a read is posted to the adapter after the 1000
    > > writes.
    > > Here is the point. The interface between the host and the adapter is
    > > not a FIFO but a flat array, and the adapter can work in parallel on
    > > all the commands. Immediately when the host posts the read
    > > (in the flat array), the adapter sees it. As soon as the adapter
    > > completes transmitting the current data PDU, it sends the read command.
    > >
    > > The read command is not stuck behind the 100 Mbytes of data.
    > > The maximum latency for the command is the time to
    > > transmit one iSCSI PDU on the wire.
    > > That is (size of PDU)/throughput.
    > > Then the adapter continues to send the write data of the
    > > 100 Mbytes. And as soon as a new command is posted,
    > > it will send a command PDU immediately after the current
    > > data PDU.
    > >
    > > Commands are not stuck behind data because there is no FIFO
    > > before the wire, and because data "on the wire" doesn't block anything.
    > > The wire is always able to deliver its throughput.
    > >
    > > Regards,
    > >
    > > Pierre
    > >
    > > >
    > > >
    > > > And nobody is suggesting you should park the data on the NIC card if
    > > > you know better.
    > > >
    > > > Julo
    > > >
    > > > Pierre Labat <pierre_labat@hp.com> on 09/10/2000 20:41:14
    > > >
    > > > Please respond to Pierre Labat <pierre_labat@hp.com>
    > > >
    > > > To:   Julian Satran/Haifa/IBM@IBMIL
    > > > cc:
    > > > Subject:  Re: iSCSI: Flow Control
    > > >
    > > > julian_satran@il.ibm.com wrote:
    > > >
    > > > > Pierre,
    > > > >
    > > > > Sorry, I missed a point about a: I thought you were saying that
    > > > > unsolicited data are not allowed. On this we are in agreement.
    > > > >
    > > > > On the rest, I can hardly follow. The model you suggest, while
    > > > > valid in a close scheme like a bus or short serial connection in
    > > > > which the target fetches data, is closely matched by the R2T for
    > > > > data, with no such match for commands.
    > > > > Keeping track of how many commands were shipped for what LU is
    > > > > impractical, as we don't want per-LU state at the initiator (for the
    > > > > same reason we rejected the connection-per-LU model).
    > > > >
    > > > > As for D, the point is that when you have a command to send and
    > > > > the command window is open, you might have to wait a long time
    > > > > because the TCP window is closed and/or you have a lot of data ahead.
    > > >
    > > > I think there is a misunderstanding about the model I was talking
    > > > about.
    > > > It's a pull model as implemented in some FC cards today, and it is
    > > > assumed that TCP/IP is handled on the adapter. It is the "no memory on
    > > > adapter" model Somesh talked about.
    > > >
    > > > When a command comes out of the SCSI layer, it is posted to the adapter.
    > > > At this point it is not posted in a queue but in a flat array of
    > > > commands.
    > > > The data is still in host memory.
    > > > Let's assume the card can handle 1000 commands in parallel; the array
    > > > has 1000 entries.
    > > > The adapter is able to process these commands the way it wants
    > > > as long as it respects the protocol (iSCSI in our case). It could
    > > > process them all in parallel if needed.
    > > > As it is a flat array, no command is blocked by another command
    > > > or by data. The adapter can pick (pull) whatever command or data
    > > > from host memory and send
    > > > it on the wire (again, as long as it respects the protocol).
    > > >
    > > > Regards,
    > > >
    > > > Pierre
    
    
    
    

