
    RE: Command Queue Depth (was asymmetric/Symmetric)



    
    Matt,
    
    I agree that T10 has to take a role in these issues - I was just
    pointing out that they are not new issues: T10 has discussed them before in
    the context of parallel SCSI and Fibre Channel, and has never come up with a
    very good solution.  So I would advise caution in expecting iSCSI (whether
    the work is done here or in T10) to come up with a good solution anytime
    soon (basically this whole area is a "research project").
    
    On the issue of latency, the key here is that different physical transports
    and (more importantly) the configurations they enable/use have a dramatic
    impact on the absolute amount of buffer space, credits, etc. required for
    good performance.  As an example, while you are right that the elasticity
    buffer in FC-AL is a major latency issue, it is not large in absolute terms
    (today I would expect < 1 us for the sum of all the latencies for all the
    elasticity buffers in the type of system you mentioned, a 20 drive loop).
    By contrast, store and forward switches introduce at least tens of us of
    latency per switch in the path - and that is if everything is done at
    hardware speeds.  While that is not uncommon for layer-2 Ethernet switches,
    layer-3 and layer-4 switches are still usually much slower and/or more
    expensive, due to the (historical) complexity of automating that much of
    those protocols and the resulting reliance on very fast (and expensive) CPUs.
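    
    To put rough numbers on that comparison - purely a back-of-the-envelope
    sketch, where the per-port elasticity buffer depth and the link rates are
    assumptions for illustration, not figures from this discussion:
    
        # Back-of-the-envelope latency comparison; the figures below are
        # illustrative assumptions, not measurements.
        FC_BAUD = 1.0625e9      # assumed FC-AL link rate, in baud
        WORD_BITS = 40          # one FC transmission word (4 bytes, 8b/10b coded)
        WORDS_PER_PORT = 1      # assumed ~1 word of elasticity buffering per port
    
        per_port_s = WORDS_PER_PORT * WORD_BITS / FC_BAUD
        loop_total_s = 20 * per_port_s     # the 20 drive loop mentioned above
        print("per port ~%.0f ns, 20-port loop ~%.2f us"
              % (per_port_s * 1e9, loop_total_s * 1e6))
    
        # A store and forward switch has to receive a whole frame before it can
        # forward it, so serialization alone sets a floor on its latency:
        eth_rate_bps = 1e9                 # 1 Gbit/s Ethernet
        frame_bytes = 1500                 # full-size Ethernet frame
        print("store-and-forward serialization ~%.0f us per hop"
              % (frame_bytes * 8 / eth_rate_bps * 1e6))
    
    Under those assumptions the whole loop's worth of elasticity buffers comes
    to well under 1 us, versus roughly 12 us of unavoidable serialization per
    store and forward hop, before any queuing is counted.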
    
    Of greater import is that FC-AL is so limited in scale that it is physically
    impossible to run up very high latencies.  By contrast, running TCP/IP on
    some mixture of physical plants, with potentially global transmission paths,
    can run up latencies in the ms, if not seconds, range.  While this is a
    great strength of the internet, it is a real problem in this context (100 ms
    latencies imply hundreds of MB of buffering for Gbit speeds - or losing a
    lot of performance if the protocol requires a lot of turnaround delays for
    things like getting buffer credits).
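    
    For a sense of scale (a rough sketch - just the 100 ms and Gbit figures
    above plugged into the bandwidth-delay product):
    
        # Rough bandwidth-delay arithmetic for the 100 ms / Gbit case above;
        # illustrative only - real sizing depends on round-trip time and fan-out.
        rate_bps = 1e9          # 1 Gbit/s
        latency_s = 0.100       # 100 ms
    
        in_flight_bytes = rate_bps * latency_s / 8
        print("~%.1f MB in flight per direction, per connection"
              % (in_flight_bytes / 1e6))
        # ~12.5 MB for a single stream; cover a full round trip and multiply by
        # a few tens of initiators or connections and the aggregate buffering
        # heads toward the hundreds of MB mentioned above.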
    
    As usual, confining the application space can make your job a lot easier.
    But if iSCSI is targeting the general, global, high speed, TCP/IP
    environment, then you have a lot of issues.
    
    
    Jim
    
    Personally, I'd rather restrict the application space some more to make the
    job easier.
    
    
    
    -----Original Message-----
    From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
    Sent: Thursday, September 07, 2000 10:25 PM
    To: Jim McGrath; ips
    Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
    
    
    Jim McGrath wrote:
    
    > I agree that BB credit has nothing to do with commands per se, but
    > illustrates the problem we have had with deciding on policies for the
    > distributed allocation of device resources over multiple initiators.  In my
    > postscript I noted that the same problems have arisen on command queue (how
    > many queue slots do you get), with similarly no satisfactory solution.
    
    So it sounds to me like this distribution of (command) resources across
    initiators is a T10 SCSI issue, and should be solved there, not by each
    individual transport (yesterday FC - which didn't solve it, today iSCSI,
    tomorrow IB or whatever).
    
    You've lost me on the following two paragraphs... Ethernet doesn't require
    credits to send frames - it's a ship and pray model.  I don't know what the
    latency through ethernet switches is, but I'd hope it wasn't in the ms range.
    Finally, a "useful" FC-AL will have many devices on it (say 20 drives in a
    JBOD), and the latency of the elastic store really starts adding up.  So I
    still contend that FC-AL is not a low latency medium.
    
    -Matt
    
    > On FC-AL, the latency to get an initial credit is typically measured in us
    > for a couple of reasons.  First, much of that logic has been automated in
    > the interface hardware (indeed, the major source of delay is typically the
    > elasticity buffer, which is not store and forward and so has a very low
    > latency compared to many switches or routers).  Second, the distances are
    > very small (e.g. hundreds of meters), so both transmission time and the
    > opportunity for intervening devices to increase latency is lower than in
    > the general internet world.
    >
    > So generally 2 credits (of 2Kbyte frames) is enough to cover the latency
    > and get you into streaming.  If the latency of the system was measured in
    > ms, then a transport on a Gbit wire would require more like 50 initial
    > credits (or more) to cover the latency.  Unless we are designing for a low
    > latency environment for the exchange of credits (like those where FC-AL is
    > used), then we probably need to allocate so much buffer space that it
    > becomes difficult to promise initial credits to a lot of potential
    > initiators.
    >
    > Jim
    >
    > PS general Fibre Channel (e.g. with switches and the like) is a bit
    > different.
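    
    To put the credit counts quoted above into rough numbers (an illustrative
    sketch only; the round-trip times are assumptions, not measurements):
    
        # How many 2 Kbyte frames of credit it takes to keep a Gbit link
        # streaming while waiting for credit to come back.
        rate_bps = 1e9
        frame_bytes = 2048
    
        for rtt_s in (40e-6, 1e-3):    # assumed ~40 us local loop vs 1 ms path
            credits = rate_bps * rtt_s / 8 / frame_bytes
            print("rtt %6.0f us -> ~%.0f frames of credit"
                  % (rtt_s * 1e6, credits))
        # ~2 credits at us-scale latency, ~60 at 1 ms - the same order as the
        # "2 credits" vs "more like 50 initial credits" figures quoted above.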
    >
    > -----Original Message-----
    > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
    > Sent: Wednesday, September 06, 2000 8:23 PM
    > To: ips
    > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
    >
    > Jim,
    >
    > I agree that FC has tried (unsuccessfully) to address this command queue
    > allocation problem.
    >
    > However, the "login bb credit" mechanism in FC does not address the command
    > queue depth issue at all.  BB credit is used to receive commands and/or
    > data, and the target has no clue in advance what is coming.  BB credit is
    > just there to ensure that there is a lowest layer buffer available to
    > receive the FC frame (as opposed to dropping it on the floor if there is no
    > "mac" buffer, like ethernet does).  It does not mean the command queue has
    > any room for the frame.
    >
    > At one time, there was a big push to have "data" credits and "command"
    > credits to take care of this problem, but it couldn't be made to work and
    > be "backwards compatible".
    >
    > > The issue of buffer space allocation for multiple initiators has a long
    > > and troubled history in SCSI.  We have never been able to come up with a
    > > good answer.
    > >
    > > Fibre Channel tried to fix this with the notion of "login BB credit" -
    > > when you login you get a minimum number of credits you are always
    > > guaranteed when you start data transfers.  The problem with this is that
    > > storage devices had no realistic ability to discriminate between
    > > initiators or to change the login BB credit.  In addition, the
    > > expectation is that all possible initiators would get these credits on
    > > login.  So storage device vendors have played it safe and kept this
    > > number low (at 0 until recently, now around 2).  For iSCSI the number of
    > > initial credits you need to "prime the pump" until normal data flow is
    > > established is probably large (given the latencies are higher than in
    > > Fibre Channel, especially FC-AL), and the
    >
    > How is the latency low on FC-AL?? given that you need to arbitrate and win
    > the loop, then receive bb credit, before you can send anything?
    > -Matt
    >
    > >
    > > number of potential initiators larger than in Fibre Channel, making this
    > > a whole lot worse for the storage device.
    > >
    > > As soon as we allow the devices to start adjusting these credits, then
    > > you have the protocol problem of making sure people know when their
    > > credits are adjusted and the policy problem of how, who, and when to
    > > adjust the credits.  Changing everyone's credit when you add a new
    > > initiator can get into a notification nightmare, although it is "fair."
    > > Any policy brings up all sorts of nasty issues regarding fairness vs
    > > efficient use of the transmission media.
    > >
    > > Jim
    > >
    > > Note: the same problem has plagued other attempts to allocate device
    > > resources between multiple initiators, like command queue space.  In
    > > general, policies with respect to multiple initiators are not really
    > > standard in the SCSI world.
    > >
    > > -----Original Message-----
    > > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
    > > Sent: Wednesday, September 06, 2000 2:56 PM
    > > To: ips
    > > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
    > >
    > > Joshua Tseng wrote:
    > >
    > > > James,
    > > >
    > > > I agree with others that there may be an issue with the
    > > > command windowing mechanism in the existing iSCSI spec.  It is like
    > > > "TCP in reverse", in that the target determines the size of the window,
    > > > and not the initiator as in TCP.  Rather, I believe that everything that
    > > > this windowing mechanism is attempting to achieve can be more easily
    > > > obtained by having the target communicate its buffer size to the
    > > > initiator at iSCSI login.  It should be the role of the initiator to
    > > > determine how many commands to put in flight simultaneously, given this
    > > > input on available buffer size from the target.
    > >
    > > As more initiators connect to a target, it may need to scale back the
    > > amount of this buffering it has allocated to each previously logged in
    > > initiator (to prevent rejecting new logins).
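    
    As a toy sketch of why that rescaling is painful - nothing from any spec,
    and the pool size is an arbitrary assumption - consider a fixed pool of
    queue slots re-divided on every new login:
    
        TOTAL_SLOTS = 256                  # assumed pool size on the target
    
        def quota(n_initiators):
            # evenly split the fixed pool across the current initiators
            return TOTAL_SLOTS // max(n_initiators, 1)
    
        for n in (1, 4, 16, 64):
            print("%3d initiators -> %d queue slots each" % (n, quota(n)))
        # Every new login shrinks the quota already promised to earlier
        # initiators, which is what creates the notification and fairness
        # problems discussed earlier in this thread.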
    > >
    > > >
    > > >
    > > > As far as multiple initiators, could this not be resolved by the target
    > > > refusing additional logins beyond the number of initiators it can
    > > > safely support?  Not being a storage expert, this is my best
    > > > guess/suggestion at how to do it.
    > >
    > > I believe John already answered this...
    > >
    > > -Matt
    

