
    Re: Command Queue Depth (was asymmetric/Symmetric)



    
        Hi Matt,

        Regarding Ethernet switches, I can tell you from my experience architecting a
    wide range of them: for a Gigabit switch, the latency of a 2KByte packet is 16us
    in today's architectures, and it is going below 10us in the next generation.

        Note the following:

    - The number given here is measured from last-bit in to first-bit out, so it is
    different from the "normal" latency, which is first-bit in to first-bit out.

    - Whether the switch acts at Layer 2 or Layer 3 does not matter, since all
    switching decisions are made on the fly.

    - Ethernet switch latency depends on packet length, due to store-and-forward
    behaviour.

    - In our devices, priority is assigned per TCP/UDP connection, so iSCSI
    connections always get higher priority than other traffic (FTP, HTTP) and thus
    experience the lowest latency possible (16us for a 2KByte packet).
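    [Editor's note: the 16us figure is close to the wire time of the packet itself,
    which is why store-and-forward latency scales with packet length. A hypothetical
    back-of-the-envelope check, assuming a 1 Gbit/s line rate (my numbers, not from
    the message):]

    ```python
    # Hypothetical sketch: serialization time of a packet at gigabit speed.
    # For a store-and-forward switch, the whole packet must be received before
    # forwarding, so latency grows with packet length.

    LINE_RATE_BPS = 1_000_000_000  # assumed 1 Gbit/s line rate

    def serialization_us(packet_bytes: int, rate_bps: int = LINE_RATE_BPS) -> float:
        """Time to clock a whole packet on or off the wire, in microseconds."""
        return packet_bytes * 8 / rate_bps * 1e6

    # A 2 KByte packet takes ~16.4 us just to serialize at 1 Gbit/s,
    # the same order as the 16 us switch latency quoted above.
    print(round(serialization_us(2048), 1))  # ~16.4
    ```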
    
        -Nafea
    
    Matt Wakeley wrote:
    
    > Jim McGrath wrote:
    >
    > > I agree that BB credit has nothing to do with commands per se, but
    > > illustrates the problem we have had with deciding on policies for the
    > > distributed allocation of device resources over multiple initiators.  In my
    > > postscript I noted that the same problems have arisen on command queue (how
    > > many queue slots do you get), with similarly no satisfactory solution.
    >
    > So it sounds to me like this distribution of (command) resources across
    > initiators is a T10 SCSI issue, and should be solved there, not by each
    > individual transport (yesterday FC - which didn't solve it, today iSCSI,
    > tomorrow IB or whatever).
    >
    > You've lost me on the following two paragraphs... Ethernet doesn't require
    > credits to send frames - it's a ship and pray model.  I don't know what the
    > latency through ethernet switches is, but I'd hope it wasn't in the ms range.
    > Finally, a "useful" FC-AL will have many devices on it (say 20 drives in a
    > JBOD) and the latency of the elastic store really starts adding up.  So I still
    > contend that FC-AL is not a low latency medium.
    >
    > -Matt
    >
    > > On FC-AL, the latency to get an initial credit is typically measured in us
    > > for a couple of reasons.  First, much of that logic has been automated in
    > > the interface hardware (indeed, the major source of delay is typically the
    > > elasticity buffer, which is not store and forward and so has a very low
    > > latency compared to many switches or routers).  Second, the distances are
    > > very small (e.g. hundreds of meters), so both transmission time and the
    > > opportunity for intervening devices to increase latency is lower than in the
    > > general internet world.
    > >
    > > So generally 2 credits (of 2Kbyte frames) is enough to cover the latency and
    > > get you into streaming.  If the latency of the system was measured in ms,
    > > then a transport on a Gbit wire would require more like 50 initial credits
    > > (or more) to cover the latency.  Unless we are designing for a low latency
    > > environment for the exchange of credits (like those where FC-AL are used),
    > > then we probably need to allocate so much buffer space that it becomes
    > > difficult to promise initial credits to a lot of potential initiators.
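    [Editor's note: Jim's credit arithmetic is a bandwidth-delay product. A minimal
    sketch, with illustrative numbers of my own (a ~20 us FC-AL-like round trip vs.
    a 1 ms internet-scale round trip on a 1 Gbit/s wire):]

    ```python
    import math

    # Sketch of the credit math: to keep a link streaming, the sender needs
    # enough initial credits to cover the data in flight during one round trip,
    # before the first credit is returned.

    def initial_credits(rate_bps: float, rtt_s: float, frame_bytes: int) -> int:
        in_flight_bytes = rate_bps * rtt_s / 8
        return math.ceil(in_flight_bytes / frame_bytes)

    # FC-AL-like case: microsecond latency -> 2 credits of 2 KByte frames suffice.
    print(initial_credits(1e9, 20e-6, 2048))  # 2
    # Millisecond latency on a Gbit wire -> on the order of 50+ credits.
    print(initial_credits(1e9, 1e-3, 2048))   # 62
    ```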
    > >
    > > Jim
    > >
    > > PS general Fibre Channel (e.g. with switches and the like) is a bit
    > > different.
    > >
    > > -----Original Message-----
    > > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
    > > Sent: Wednesday, September 06, 2000 8:23 PM
    > > To: ips
    > > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
    > >
    > > Jim,
    > >
    > > I agree that FC has tried (unsuccessfully) to address this command queue
    > > allocation problem.
    > >
    > > However, the "login bb credit" mechanism in FC does not address the command
    > > queue depth issue at all.  BB credit is used to receive commands and/or
    > > data,
    > > and the target has no clue in advance what is coming.  BB credit is just
    > > there
    > > to ensure that there is a lowest layer buffer available to receive the FC
    > > frame
    > > (as opposed to dropping it on the floor if there is no "mac" buffer, like
    > > ethernet does).  It does not mean the command queue has any room for the
    > > frame.
    > >
    > > At one time, there was a big push to have "data" credits and "command"
    > > credits
    > > to take care of this problem, but it couldn't be made to work and be
    > > "backwards
    > > compatible".
    > >
    > > > The issue of buffer space allocation for multiple initiators has a long
    > > and
    > > > troubled history in SCSI.  We have never been able to come up with a good
    > > > answer.
    > > >
    > > > Fibre Channel tried to fix this with the notion of "login BB credit" -
    > > when
    > > > you login you get a minimum number of credits you are always guaranteed
    > > when
    > > > you start data transfers.  The problem with this is that storage devices
    > > had
    > > > no realistic ability to discriminate between initiators or to change the
    > > > login BB credit.  In addition, the expectation is that all possible
    > > > initiators would get these credits on login.  So storage devices vendors
    > > > have played it safe and kept this number low (at 0 until recently, now
    > > > around 2).  For iSCSI the number of initial credits you need to "prime the
    > > > pump" until normal data flow is established is probably large (given the
    > > > latencies are higher than in Fibre Channel, especially FC-AL), and the
    > >
    > > How is the latency low on FC-AL, given that you need to arbitrate and win
    > > the loop, then receive BB credit, before you can send anything?
    > >
    > > -Matt
    > >
    > > >
    > > > number of potential initiators larger than in Fibre Channel, making this a
    > > > whole lot worse for the storage device.
    > > >
    > > > As soon as we allow the devices to start adjusting these credits, then you
    > > > have the protocol problem of making sure people know when their credits
    > > are
    > > > adjusted and the policy problem of how, who, and when to adjust the
    > > credits.
    > > > Changing everyone's credit when you add a new initiator can get into a
    > > > notification nightmare, although it is "fair."  Any policy brings up all
    > > > sorts of nasty issues regarding fairness vs efficient use of the
    > > > transmission media.
    > > >
    > > > Jim
    > > >
    > > > Note: the same problem has plagued other attempts to allocate device
    > > > resources between multiple initiators, like command queue space.  In
    > > general
    > > > policies with respect to multiple initiators are not really standard in
    > > the
    > > > SCSI world.
    > > >
    > > > -----Original Message-----
    > > > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
    > > > Sent: Wednesday, September 06, 2000 2:56 PM
    > > > To: ips
    > > > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
    > > >
    > > > Joshua Tseng wrote:
    > > >
    > > > > James,
    > > > >
    > > > > I agree with others that there may be an issue with the
    > > > > command windowing mechanism in the existing iSCSI spec.  It is like
    > > > > "TCP in reverse", in that the target determines the size of the window,
    > > > and
    > > > > not the initiator as in TCP.  Rather, I believe that everything that
    > > this
    > > > > windowing mechanism is attempting to achieve can be more easily obtained
    > > > > by having the target communicate its buffer size to the initiator at
    > > > > iSCSI login.  It should be the role of the initiator to determine how
    > > > > many commands to put in flight simultaneously, given this input on
    > > > available
    > > > > buffer size from the target.
    > > >
    > > > As more initiators connect to a target, it may need to scale back the
    > > amount
    > > > of
    > > > this buffering it has allocated to each previously logged in initiator (to
    > > > prevent rejecting new logins).
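    [Editor's note: the rebalancing Matt describes can be sketched as a target
    dividing a fixed buffer pool among logged-in initiators. The class, names, and
    equal-split policy below are hypothetical illustrations, not anything from the
    iSCSI spec:]

    ```python
    # Hypothetical sketch: a target with a fixed command-buffer pool shrinks each
    # initiator's share as more initiators log in, instead of refusing logins.

    class Target:
        def __init__(self, total_cmd_buffers: int):
            self.total = total_cmd_buffers
            self.initiators: list[str] = []

        def login(self, name: str) -> int:
            self.initiators.append(name)
            return self.share()

        def share(self) -> int:
            """Equal split; every earlier initiator must be notified of its
            new, smaller window -- the notification problem Jim raises."""
            return self.total // len(self.initiators)

    t = Target(total_cmd_buffers=64)
    print(t.login("init-A"))  # 64
    print(t.login("init-B"))  # 32 -- init-A must also be told about the cut
    ```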
    > > >
    > > > >
    > > > >
    > > > > As far as multiple initiators, could this not be resolved by the target
    > > > > refusing additional logins beyond the number of initiators it can safely
    > > > > support?  Not being a storage expert, this is my best guess/suggestion
    > > > > at how to do it.
    > > >
    > > > I believe John already answered this...
    > > >
    > > > -Matt
    
    --
    
                 Nafea
                    \\|//
                    (o o)
    ~~~~~~~~~~~~oOOo~(_)~oOOo~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
         _/_/_/_/
       _/_/_/_/_/_/
     _/_/_/_/_/_/_/_/
     _/_/_/_/_/_/_/_/
    _/_/_/_/            Nafea Bishara
    _/_/_/              Director, Product definition
    _/_/_/    _/_/_/_/  Galileo Technology Ltd.
    _/_/_/      _/_/_/  Email     -  nafea@galileo.co.il
     _/_/_/     _/_/_/  Snail Mail-  D.N. Misgav 20184, Moshav Manof, ISRAEL.
     _/_/_/_/_/_/_/_/   Tel       -  Manof + 972 4 9999555 ext. 0
       _/_/_/_/_/_/                  Haifa + 972 4 8225046 ext. 417
         _/_/_/_/                    Mobile+ 972 54 995417
                        FAX       -  + 972 4 8326420
    
     Check our Web site: http://www.galileoT.com
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    
    

