Re: iSCSI: Flow Control

To: ips@ece.cmu.edu
Subject: Re: iSCSI: Flow Control
From: "John Hufferd/San Jose/IBM" <hufferd@us.ibm.com>
Date: Tue, 10 Oct 2000 18:23:07 -0700
Pierre, Julian, YP,
OK, does anyone know what started all this?  (Not sure I want an answer.)

Here is what I think I heard.

Julian thinks the Draft works as is and that he has enough pacing with
normal TCP/IP and MaxCmdRN.  YP and Pierre have just gone into lots of
detail on how there is no blocking if done right, and they seem happy.
The way I translate that is: the Draft works well enough for YP and
Pierre, or they could not have described what they did.  Julian should be
happy; YP should be happy that some folks agree with his approach and
that the Draft works as is; and Pierre should also be happy because some
folks agree with him.

OK, what I think this means is that Storage Controllers that have enough
memory to match the flow of incoming data to the processing rate of the
disks behind them should be "All Singing and All Dancing".  Storage
Controllers that can have more data coming in than they have memory to
match to the processing rate of the disks behind them will have a bit of
a problem and will have to push back from time to time.  They may not be
as efficient at longer distances as they are at shorter distances.
Anyway, this always seemed obvious to me.

So, what I hear you all saying is that the Draft is fine the way it is (at
least with regard to pacing, credits, etc.).  Is that right?

So the key guy who is perhaps out of the boat is Matt, who thinks that we
should have at least two conversations per Session run in an asymmetric
manner.  I have always thought that his point was valid, at least for the
smaller-memory targets, especially if there were standard NICs on the
sending side.  So I, for one, would like to hear from Matt.

Let me ask YP and Pierre one other thing.  How do you think your proposed
design would operate if the sender was using, as Pierre calls it, "regular
networking" and the target was using your proposed design?  Is it still
"All Singing and All Dancing"?


.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403
Internet address: hufferd@us.ibm.com


Pierre Labat <pierre_labat@hp.com>@ece.cmu.edu on 10/10/2000 04:04:37 PM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  Re: iSCSI: Flow Control



julian_satran@il.ibm.com wrote:

> Pierre,
>
> You are wrong again. When the target reopens the window - i.e., reads
> some data from the pipe at his end - you get to put your Read command,
> but it goes after the rest of the window, and the window can be several
> megabytes.

Julian,

The TCP window is not a buffer on the receive side.
On the receive side (the target, in our case), as long as TCP segments
arrive in order, there is no opaque FIFO containing a full window's worth
of commands and data waiting to be processed. You can avoid that.
What the target does is receive bytes through the TCP connection, do the
TCP work, and form an iSCSI PDU. The most you have to store is the few TCP
segments needed to rebuild the PDU. As soon as the PDU is built, it is
processed.
When the target wants to close the TCP window, it updates the window
accordingly and CONTINUEs to process the incoming PDUs.
At that point you assume that the incoming PDUs are put in an opaque FIFO,
but instead the target can process them and put the data at the right
location in the target cache.
Then, when the window is opened again and the read PDU comes, it is
processed immediately.
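As a rough sketch of this receive path, here is how a target might rebuild
PDUs from arbitrary TCP segment boundaries and hand each one off as soon as
it is complete. The 4-byte length prefix, the PduReassembler class, and the
frame() helper are assumptions made for illustration only; they are not the
iSCSI header format from the draft.

```python
import struct


class PduReassembler:
    """Rebuilds length-prefixed PDUs from a TCP byte stream.

    Only the bytes of the PDU currently being rebuilt are held; every
    complete PDU is passed to `handler` immediately.
    """

    # Hypothetical framing: 4-byte big-endian payload length.
    HEADER = struct.Struct(">I")

    def __init__(self, handler):
        self._buf = bytearray()
        self._handler = handler

    def feed(self, segment: bytes) -> None:
        """Accept the payload of one in-order TCP segment."""
        self._buf += segment
        while True:
            if len(self._buf) < self.HEADER.size:
                return                      # header not complete yet
            (length,) = self.HEADER.unpack_from(self._buf, 0)
            end = self.HEADER.size + length
            if len(self._buf) < end:
                return                      # PDU body not complete yet
            payload = bytes(self._buf[self.HEADER.size:end])
            del self._buf[:end]
            self._handler(payload)          # process now, no FIFO in between

    @property
    def buffered(self) -> int:
        """Bytes held for the partially rebuilt PDU."""
        return len(self._buf)


def frame(payload: bytes) -> bytes:
    """Wrap a payload in the hypothetical length-prefixed framing."""
    return PduReassembler.HEADER.pack(len(payload)) + payload
```

With this shape, what the receiver stores is bounded by one partial PDU
rather than by the window size.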

In fact, as Y P Cheng described in a previous mail in this thread, the
model that can be used for iSCSI traffic is different from the common model
we have for regular TCP/IP networking, although a TCP fully compliant with
the RFCs can be used for iSCSI.
In regular TCP/IP networking the application (on the transmit side) fills
a FIFO that the adapter empties. In our case, as explained by Y P Cheng,
you replace the FIFO with an "exchange table", what I called a flat
array. It allows you to avoid head-of-queue blocking at this level.
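The difference between the two transmit models can be sketched with a toy
simulation of the order in which PDUs reach the wire. The Exchange record,
the function names, and the "unsent commands first" scan policy are all
assumptions chosen for illustration; a real adapter could schedule
differently as long as it respects the protocol.

```python
from dataclasses import dataclass, replace


@dataclass
class Exchange:
    tag: int
    data_pdus_left: int        # write data still sitting in host memory
    command_sent: bool = False


def fifo_order(exchanges):
    """Regular networking: each command and all its data queue in posting order."""
    wire = []
    for ex in exchanges:
        wire.append(("cmd", ex.tag))
        wire.extend(("data", ex.tag) for _ in range(ex.data_pdus_left))
    return wire


def exchange_table_order(initial, late, late_after):
    """Pull model: before each PDU the adapter rescans the table, commands first.

    The `late` exchanges are posted by the host once `late_after` PDUs
    have been put on the wire.
    """
    table = [replace(ex) for ex in initial]   # private copies of the inputs
    pending = [replace(ex) for ex in late]
    wire = []
    while True:
        if pending and len(wire) >= late_after:
            table.extend(pending)             # host posts into the flat array
            pending = []
        ex = next((e for e in table if not e.command_sent), None)
        if ex is not None:
            ex.command_sent = True
            wire.append(("cmd", ex.tag))
            continue
        ex = next((e for e in table if e.data_pdus_left > 0), None)
        if ex is None:
            if not pending:
                return wire
            table.extend(pending)             # wire idle, post the rest now
            pending = []
            continue
        ex.data_pdus_left -= 1
        wire.append(("data", ex.tag))


writes = [Exchange(tag=i, data_pdus_left=4) for i in range(3)]
read = Exchange(tag=99, data_pdus_left=0)
fifo = fifo_order(writes + [read])
pulled = exchange_table_order(writes, [read], late_after=5)
```

In this toy run the read command is the last PDU out of the FIFO, while
with the exchange table it goes out right after the data PDU that was in
progress when it was posted.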

On the receive side (the target in our case) in regular networking,
the incoming data are tossed into a FIFO by TCP. The application
empties this FIFO and can block (in which case the FIFO grows),
and yes, when the application unblocks, it has a large number
of PDUs to process.
But in the model described the application never blocks. Hence there is no
big opaque receive FIFO on the target. In our case the application is the
module that processes the iSCSI PDUs. The application never blocks because
it is able to pace down the flow coming from the initiator with the TCP
window and the command flow control (MaxCmdRN).
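A minimal sketch of that pacing, assuming that MaxCmdRN is the highest
command reference number the target is currently willing to accept and
that the advertised window simply tracks free cache space. The TargetPacer
class, its fields, and the lack of number wraparound are simplifications
made for illustration, not behaviour taken from the draft.

```python
class TargetPacer:
    """Toy target that limits commands via MaxCmdRN and data via a window."""

    def __init__(self, cmd_slots: int, cache_bytes: int):
        self.cmd_slots = cmd_slots      # commands the target can hold at once
        self.cache_bytes = cache_bytes  # cache space available for write data
        self.exp_cmd_rn = 1             # next command number expected
        self.in_flight = 0              # commands accepted but not completed
        self.cached = 0                 # bytes in cache, not yet on disk

    def max_cmd_rn(self) -> int:
        """Highest command number acceptable right now (assumed semantics)."""
        free = self.cmd_slots - self.in_flight
        return self.exp_cmd_rn + free - 1

    def window(self) -> int:
        """Bytes the target is willing to receive right now."""
        return max(0, self.cache_bytes - self.cached)

    def accept_command(self, cmd_rn: int) -> bool:
        if cmd_rn != self.exp_cmd_rn or cmd_rn > self.max_cmd_rn():
            return False                # initiator must wait for a new MaxCmdRN
        self.exp_cmd_rn += 1
        self.in_flight += 1
        return True

    def accept_data(self, nbytes: int) -> bool:
        if nbytes > self.window():
            return False                # window closed for this much data
        self.cached += nbytes
        return True

    def complete(self, nbytes_flushed: int) -> None:
        """A command finished and its data reached the disks."""
        self.in_flight -= 1
        self.cached -= nbytes_flushed
```

The point of the sketch is that both limits are raised only as the target
finishes work, so the module processing PDUs never has to block.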

Regards,

Pierre



>
>
> Pierre Labat <pierre_labat@hp.com> on 10/10/2000 19:50:48
>
> Please respond to Pierre Labat <pierre_labat@hp.com>
>
> To:   ips@ece.cmu.edu
> cc:
> Subject:  Re: iSCSI: Flow Control
>
> Julian_Satran@il.ibm.com wrote:
>
> > Pierre,
> >
> > The only point you are missing is that the TCP window may be closed
> > when you want to send your Read command
>
> Julian,
>
> Yes, but as soon as the target re-opens the window it receives the read
> first.
>
> > and even if not it will reach the other end after all the data
> > before it
> > regardless of how clever your adapter is.
>
> The time needed to reach the other end of the wire (for the read, in our
> case) is the same whether or not data was sent on the wire before it. On
> the target, as soon as the read is sampled from the wire it can be
> processed.
>
> Regards,
>
> Pierre
>
> > The FIFO you have in mind is
> > certainly not
> > equivalent to the pipe capacity.
> >
> > Julo
> >
> > Pierre Labat <pierre_labat@hp.com> on 10/10/2000 02:58:41
> >
> > Please respond to Pierre Labat <pierre_labat@hp.com>
> >
> > To:   ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI: Flow Control
> >
> > Julian_Satran@il.ibm.com wrote:
> >
> > > Pierre,
> > >
> > > It does not matter how or from where you send the data on the wire.
> > > If you have a long wire and you want to cover the latency you will
> > > send data as soon as you can and then commands get stuck  behind.
> >
> > Julian,
> >
> > The command can NOT  be stuck because there is "data on the wire".
> > Let me give you an example,
> > Let's talk again about the "pull model" adapter on the initiator.
> > Imagine you have 100 Mbytes of (write) data outstanding
> > because 1000 large write commands have been posted to
> > the adapter.
> > The adapter sends this data as fast as it can. But, very importantly,
> > the data are not tossed into any kind of buffer on the adapter.
> > What the adapter does is: pull some kbytes of data from host memory,
> > encapsulate it, send it on the wire. Again and again, as fast as it
> > can.
> >
> > Now, imagine that a read is posted to the adapter after the 1000
> > writes.
> > Here is the point. The interface between the host and the adapter is
> > not a FIFO but a flat array, and the adapter can work in parallel on
> > all the commands. Immediately when the host posts the read
> > (in the flat array), the adapter sees it. The adapter, as soon as it
> > completes transmitting the current data PDU, sends the read command.
> >
> > The read command is not stuck behind the 100 Mbytes of data.
> > The maximum latency for the command is the time to
> > transmit one iSCSI PDU on the wire.
> > That is (size of PDU)/throughput.
> > Then the adapter continues to send the write data of the
> > 100 Mbytes. And as soon as a new command is posted,
> > it will send a command PDU immediately after the current
> > data PDU.
> >
> > Commands are not stuck behind data because there is no FIFO
> > before the wire, and because data "on the wire" doesn't block anything.
> > The wire is always able to deliver its throughput.
> >
> > Regards,
> >
> > Pierre
> >
> > >
> > >
> > > And nobody is suggesting you should park the data on the NIC card if
> > > you know better.
> > >
> > > Julo
> > >
> > > Pierre Labat <pierre_labat@hp.com> on 09/10/2000 20:41:14
> > >
> > > Please respond to Pierre Labat <pierre_labat@hp.com>
> > >
> > > To:   Julian Satran/Haifa/IBM@IBMIL
> > > cc:
> > > Subject:  Re: iSCSI: Flow Control
> > >
> > > julian_satran@il.ibm.com wrote:
> > >
> > > > Pierre,
> > > >
> > > > Sorry I missed a point about a - I thought you were saying that
> > > > unsolicited data are not allowed. On this we are in agreement.
> > > >
> > > > On the rest - I can hardly follow. The model you suggest, while
> > > > valid in a close scheme like a bus or short serial connection in
> > > > which the target fetches data, is closely matched by the R2T for
> > > > data, with no such match for commands.
> > > > Keeping track of how many commands were shipped for what LU is
> > > > impractical, as we don't want per-LU state at the initiator (for
> > > > the same reason we rejected the connection-per-LU model).
> > > >
> > > > As for D - the point is that when you have a command to send and
> > > > the command window is open you might have to wait a long time, as
> > > > the TCP window is closed and/or you have a lot of data ahead.
> > >
> > > I think there is a misunderstanding about the model I was talking
> > > about.
> > > It's a pull model as implemented in some FC cards today, and it is
> > > assumed that TCP/IP is handled on the adapter. It is the "no memory
> > > on adapter" model Somesh talked about.
> > >
> > > When a command comes out of the SCSI layer, it is posted to the
> > > adapter.
> > > At this point it is not posted in a queue but in a flat array of
> > > commands.
> > > The data is still in host memory.
> > > Let's assume the card can handle 1000 commands in parallel, so the
> > > array has 1000 entries.
> > > The adapter is able to process these commands the way it wants
> > > as long as it respects the protocol (iSCSI in our case). It could
> > > be able to process them all in parallel if needed.
> > > As it is a flat array, no command is blocked by another command
> > > or by data. The adapter can pick (pull) whatever command or data
> > > from host memory and send
> > > it on the wire (again, as long as it respects the protocol).
> > >
> > > Regards,
> > >
> > > Pierre