RE: RE: Command Queue Depth (was asymmetric/Symmetric)

To: cmonia@NishanSystems.com, Jim.McGrath@quantum.com, julian_satran@il.ibm.com
Subject: RE: RE: Command Queue Depth (was asymmetric/Symmetric)
From: somesh_gupta@hp.com
Date: Fri, 8 Sep 2000 11:25:21 -0700
Cc: ips@ece.cmu.edu
Sender: owner-ips@ece.cmu.edu

I think you and Charles have important points on this.

First, it is really important to reexamine the issue of communicating
command queue depth given the way TCP/IP works, and also the
fact that memory is much, much cheaper than it used to be and
is probably going to get cheaper still.

If we have a way to indicate the command queue depth (like the
sequence # mechanism in iSCSI), this lets the target provide an
indication of the command queue depth to the initiator (the TCP
window size provided by the target to the initiator will indicate
how many bytes of data and commands can be pumped on this
connection in that direction).
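
As a rough illustration of the initiator side, here is a minimal
sketch. It assumes the target advertises the highest command
sequence # it is currently willing to accept. The names
(CommandWindow, next_seq, max_seq) are hypothetical and are not
taken from the iSCSI draft, and sequence # wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommandWindow:
    """Initiator-side view of the command window advertised by the target."""
    next_seq: int = 1  # sequence # the next new command will carry
    max_seq: int = 0   # highest sequence # the target will accept (assumed field)

    def update(self, advertised_max: int) -> None:
        # Take the latest advertisement as-is; the target may lower it
        # to pinch the window when commands start to back up.
        self.max_seq = advertised_max

    def available(self) -> int:
        # Number of new commands that may be sent right now.
        return max(0, self.max_seq - self.next_seq + 1)

    def try_send(self) -> Optional[int]:
        # Return the sequence # to use, or None if the window is closed.
        if self.available() == 0:
            return None
        seq = self.next_seq
        self.next_seq += 1
        return seq
```

The initiator would call update() whenever a response carries a new
advertisement and hold back commands while try_send() returns None.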

If the device is a small device (perhaps like a drive), it will
have very few connections, maybe one. In this case it has to
make sure that the credit it extends is based on actual
command buffer space (and perhaps processing rate and network
latency?), and the same goes for window size. If, however,
it is a large device with a large number of connections, it will
have to do some over-subscription. And if there starts to be
a backup of commands or data, the command queue depth indication
and the TCP window size can start to be pinched.
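
One way a target might pick the credit it extends is sketched
below. This is only an illustration under assumed parameters:
the function name, the oversubscription factor and the
backlog_fraction input are all hypothetical, and a real policy
would also have to weigh processing rate and network latency.

```python
def command_credits(free_slots: int, connections: int,
                    oversubscription: float = 1.0,
                    backlog_fraction: float = 0.0) -> int:
    """Command credits to extend to each connection (hypothetical policy)."""
    if connections <= 0:
        raise ValueError("need at least one connection")
    # A small device keeps oversubscription at 1.0, so credits are backed
    # by real buffer space; a large device may promise more than it has.
    promised = free_slots * oversubscription
    # Pinch the promise as the backlog grows (0.0 = idle, 1.0 = saturated).
    backlog = min(max(backlog_fraction, 0.0), 1.0)
    pinched = promised * (1.0 - backlog)
    # Split evenly across connections, rounding down.
    return int(pinched // connections)
```

For example, command_credits(8, 1) gives 8 for a single-connection
drive, while command_credits(64, 16, oversubscription=4.0) gives 16
per connection and drops to 8 once backlog_fraction reaches 0.5.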

And of course, the worst case is when things overflow and commands
have to be tossed. In this case, as someone else recommended,
an indication that overflow has occurred should be sent to
the initiator. Then all the commands get thrown away till an
indication is received from the initiator that it is going to
start resending the commands from the point at which the commands
were thrown away. Again the command sequence #s come in handy.
There is little need to optimize this last part through SACKs
etc., in the hope that the flow control provided will keep this
situation from happening too many times.
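
The overflow handling described above could look roughly like
this on the target side. It is a sketch only: the class, the
status strings and the restart() call are hypothetical stand-ins
for whatever overflow indication and initiator acknowledgement
the protocol ends up defining.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TargetQueue:
    """Target command queue that discards everything after an overflow."""
    capacity: int
    queue: List[int] = field(default_factory=list)
    resume_from: Optional[int] = None  # set while in the discarding state

    def receive(self, seq: int) -> str:
        if self.resume_from is not None:
            # Still waiting for the initiator to acknowledge the overflow.
            return "discarded"
        if len(self.queue) >= self.capacity:
            # Remember where the loss started and tell the initiator.
            self.resume_from = seq
            return "overflow"
        self.queue.append(seq)
        return "queued"

    def restart(self, seq: int) -> bool:
        # Initiator announces it will resend starting at sequence # seq.
        if self.resume_from is not None and seq == self.resume_from:
            self.resume_from = None
            return True
        return False

    def complete_one(self) -> Optional[int]:
        # Finish the oldest queued command, freeing one slot.
        return self.queue.pop(0) if self.queue else None
```

With a capacity of 2, commands 1 and 2 are queued, 3 triggers the
overflow indication and 4 is discarded; after restart(3) the
initiator resends 3 and 4 in order.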

Somesh


> -----Original Message-----
> From: Jim.McGrath@quantum.com [mailto:Jim.McGrath@quantum.com]
> Sent: Thursday, September 07, 2000 8:44 PM
> To: cmonia@NishanSystems.com; julian_satran@il.ibm.com
> Cc: ips@ece.cmu.edu
> Subject: FW: RE: Command Queue Depth (was asymmetric/Symmetric)
> 
> 
> 
> The path I'd recommend is to allow people to oversubscribe a target's
> resources, and then to do a graceful recovery when that gets you into
> trouble.
> 
> Note that drive designers do this all of the time - we tend to optimize
> for common cases, and then worry about how to handle outliers using
> other mechanisms.  While a more complicated model, it gets you the best
> overall resource utilization.  We use read on arrival, ECC on the fly,
> retries, auto reallocation, all in attempts to handle the common path
> quickly and the rare path more slowly, where the differences are error
> rates.
> 
> In this case I would allow initiators to send down data immediately -
> when that works (like when ECC on the fly works) you get a benefit.  If
> packets are dropped you can rely on existing mechanisms to recover, or
> you can put in a new, improved, and perhaps more friendly process.  In
> either case I think the result would probably be better than a tight
> credit-based model where a lot of delays would be introduced and a lot
> of (historically unresolved) allocation policy issues arise.
> 
> Jim
> 
> 
> 
> -----Original Message-----
> From: Charles Monia [mailto:cmonia@NishanSystems.com]
> Sent: Thursday, September 07, 2000 1:52 PM
> To: Julian Satran (E-mail)
> Cc: Ips (E-mail)
> Subject: RE: Command Queue Depth (was asymmetric/Symmetric)
> 
> 
> Hi Julo:
> 
> > -----Original Message-----
> > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > Sent: Thursday, September 07, 2000 6:11 AM
> > To: ips@ece.cmu.edu
> > Subject: RE: Command Queue Depth (was asymmetric/Symmetric)
> > 
> > 
> >  
> > Dear colleagues,
> > 
> > Although the windowing mechanism in iSCSI-01 may seem to be there to
> > solve a queueing issue, it is mainly meant to limit the buffering
> > space for commands that await "de-skewing".
> > We assume that execution queue lengths, policy, etc. are beyond the
> > scope of transport.
> > 
> > As for SCSI queue length, I assumed that the busy or queue full status
> > followed by an Asynch Event message indicating readiness is the
> > mechanism provided by SCSI to regulate the command flow.
> > 
> > It is hard to imagine that, given the variable lifetime of SCSI
> > commands and the opaque nature of the resources required to execute
> > them, the transport has to help in this area.
> > 
> 
> While this issue has been discussed at some length in the past, as Jim
> McGrath stated, I believe the debate ought to be reopened (although we
> may end up reaching the same conclusion as before).
> 
> As Ralph Weber pointed out, the SCSI model is to discard the command,
> return status, retrieve the next command in the transport pipeline and
> continue processing.  The initiator is not notified when processing
> resumes. If there are many commands in flight, as there could be in an
> IP environment, and target resources free up in the meantime, the
> result is commands processed out of order.
> 
> Historically, such a lapse in command ordering was not seen as an issue
> for the following reasons:
> 
> a) Strict ordering was not required by the most commonly deployed
> device types (disks and tapes). Due to the nature of disk traffic, a
> simple retry mechanism was deemed sufficient to recover from these
> errors. Since legacy streaming devices, such as tapes, did not support
> command queuing, command ordering considerations were not a factor
> there either.
> 
> b) Transport delays over storage interconnects were small, so not many
> commands were apt to be in flight, i.e., the window for such errors was
> very small.
> 
> c) The resource guarantees needed for a loss-avoidance mechanism in the
> target adversely affected device cost, especially at the high-volume,
> low end of the market.
> 
> Given the above considerations, there was little support within the
> storage community for measures addressing this issue.
> 
> If we now believe that the iSCSI environment changes the rules, I
> believe the interconnect protocol can provide useful assists, such as:
> 
> a) On a command overflow condition, have the iSCSI target flush the
> command pipeline by returning status and discarding all subsequently
> received commands until a host acknowledgement is received.
> 
> b) Implement some sort of credit-based mechanism for
> overflow-avoidance.
> 
> 
> Comments?
> 
> 
> <Stuff deleted>
> 
> > Jim McGrath <Jim.McGrath@quantum.com> on 07/09/2000 06:06:40
> > 
> > Please respond to Jim McGrath <Jim.McGrath@quantum.com>
> > 
> > To:   "'Matt Wakeley'" <matt_wakeley@agilent.com>, ips 
> > <ips@ece.cmu.edu>
> > cc:    (bcc: Julian Satran/Haifa/IBM)
> > Subject:  RE: Command Queue Depth   (was asymmetric/Symmetric)
> > 
> > 
> > 
> > 
> > 
> > The issue of buffer space allocation for multiple initiators has a
> > long and troubled history in SCSI.  We have never been able to come up
> > with a good answer.
> > 
> > Fibre Channel tried to fix this with the notion of "login BB credit" -
> > when you login you get a minimum number of credits you are always
> > guaranteed when you start data transfers.  The problem with this is
> > that storage devices had no realistic ability to discriminate between
> > initiators or to change the login BB credit.  In addition, the
> > expectation is that all possible initiators would get these credits on
> > login.  So storage device vendors have played it safe and kept this
> > number low (at 0 until recently, now around 2).  For iSCSI the number
> > of initial credits you need to "prime the pump" until normal data flow
> > is established is probably large (given the latencies are higher than
> > in Fibre Channel, especially FC-AL), and the number of potential
> > initiators larger than in Fibre Channel, making this a whole lot worse
> > for the storage device.
> > 
> > As soon as we allow the devices to start adjusting these credits, then
> > you have the protocol problem of making sure people know when their
> > credits are adjusted and the policy problem of how, who, and when to
> > adjust the credits.
> > Changing everyone's credit when you add a new initiator can get into a
> > notification nightmare, although it is "fair."  Any policy brings up
> > all sorts of nasty issues regarding fairness vs efficient use of the
> > transmission media.
> > 
> > Jim
> > 
> > Note: the same problem has plagued other attempts to allocate device
> > resources between multiple initiators, like command queue space.  In
> > general, policies with respect to multiple initiators are not really
> > standard in the SCSI world.
> > 
> > 
> > -----Original Message-----
> > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
> > Sent: Wednesday, September 06, 2000 2:56 PM
> > To: ips
> > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
> > 
> > 
> > Joshua Tseng wrote:
> > 
> > > James,
> > >
> > > I agree with others that there may be an issue with the command
> > > windowing mechanism in the existing iSCSI spec.  It is like "TCP in
> > > reverse", in that the target determines the size of the window, and
> > > not the initiator as in TCP.  Rather, I believe that everything that
> > > this windowing mechanism is attempting to achieve can be more easily
> > > obtained by having the target communicate its buffer size to the
> > > initiator at iSCSI login.  It should be the role of the initiator to
> > > determine how many commands to put in flight simultaneously, given
> > > this input on available buffer size from the target.
> > 
> > As more initiators connect to a target, it may need to scale back the
> > amount of this buffering it has allocated to each previously logged in
> > initiator (to prevent rejecting new logins).
> > 
> > >
> > >
> > > As far as multiple initiators, could this not be resolved by the
> > > target refusing additional logins beyond the number of initiators it
> > > can safely support?  Not being a storage expert, this is my best
> > > guess/suggestion at how to do it.
> > 
> > I believe John already answered this...
> > 
> > -Matt
> > 
> > 
> >
> 
> Charles Monia
> Senior Technology Consultant
> Nishan Systems Corporation
> email: cmonia@nishansystems.com
> voice: (408) 519-3986
> fax:   (408) 435-8385
>  
>