RE: iSCSI: Out of order commands
Rod,
I have stated repeatedly that this causes deadlock and adds nothing in terms of
performance.
The ordering rule is explicitly stated for unsolicited data, and it was just an
oversight that it was never made explicit for commands.
Julo
"Rod Harrison" <rod.harrison@windriver.com>
07-11-01 06:16
Please respond to "Rod Harrison"
To: "Santosh Rao" <santoshr@cup.hp.com>, <cbm@rose.hp.com>
cc: Julian Satran/Haifa/IBM@IBMIL, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Out of order commands
It seems to me that if a target offers a command window greater than
one it is buying into the complexity associated with supporting that
window.
There is very little difference at the target between buffering a
command that arrives out of order and buffering a write command that
arrives without all the payload. In both cases other commands may
arrive which cannot be committed to the SCSI layer. If a target
doesn't want to be in the business of command queuing it has the
option of offering a command window of one.
Implementing a command queue is a much simpler proposition at the
target than at the initiator. The target is in complete control of the
command window and can therefore simply use a static array to hold
command descriptors. The initiator has no such luxury since it has no
a priori knowledge of the command window size; indeed, the command
window size can change dynamically. For the initiator to support
ordered command queuing it must use an ordered list, which can be
expensive, especially when we consider the CPU power that will
typically be available to an iSCSI HBA. Negotiating the command window
size as part of login would make this more palatable for an initiator,
but I suspect targets wouldn't want to commit to an unchanging command
window.
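
A minimal sketch of the target-side point above, assuming a fixed window chosen by the target: a preallocated array indexed by CmdSN modulo the window size is enough to hold commands that arrive early until the gap before them is filled. The names CommandWindow and receive are hypothetical, and CmdSN wraparound is deliberately ignored.

```python
class CommandWindow:
    """Fixed-size reorder window for a target (illustrative only)."""

    def __init__(self, exp_cmd_sn, window):
        self.exp_cmd_sn = exp_cmd_sn      # next CmdSN to hand to the SCSI layer
        self.window = window              # number of commands the target accepts
        self.slots = [None] * window      # static array of command descriptors

    def receive(self, cmd_sn, descriptor):
        """Store one command; return the descriptors now deliverable in order."""
        if not (self.exp_cmd_sn <= cmd_sn < self.exp_cmd_sn + self.window):
            raise ValueError("CmdSN outside the advertised window")
        self.slots[cmd_sn % self.window] = descriptor
        delivered = []
        # Drain every consecutive command starting at the expected CmdSN.
        while self.slots[self.exp_cmd_sn % self.window] is not None:
            idx = self.exp_cmd_sn % self.window
            delivered.append(self.slots[idx])
            self.slots[idx] = None
            self.exp_cmd_sn += 1
        return delivered


w = CommandWindow(exp_cmd_sn=1, window=4)
print(w.receive(2, "READ"))    # [] because CmdSN 1 has not arrived yet
print(w.receive(1, "WRITE"))   # ['WRITE', 'READ']
```

The same structure handles a hole left by a dropped PDU and a hole left by an initiator that sent commands out of order; the target cannot tell the two apart.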
We've been debating the merits of an initiator sending out of order
commands which is perhaps beyond the scope of where we should be. The
cost to a target implementer is negligible and there is a potential
benefit to an initiator implementer, so why should we prohibit this
behaviour?
- Rod
-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
Santosh Rao
Sent: Tuesday, November 06, 2001 6:16 PM
To: cbm@rose.hp.com
Cc: Rod Harrison; Julian Satran; ips@ece.cmu.edu
Subject: Re: iSCSI: Out of order commands
Mallikarjun,
Some comments below.
Regards,
Santosh
"Mallikarjun C." wrote:
>
> Rod and Julian,
>
> This has been an interesting thread of discussion. Some
> comments -
>
> 1.My first reaction was - allowing out-of-order command
> transmission on the same connection deprives targets of
> an implementation choice. Targets which support only
> single-connection sessions and only support session
> recovery (reasonable assumptions in my mind) can no
> longer afford *not to* implement a command scoreboard.
Even a single connection target *MUST* implement a scoreboard. The
reason being that it can see out-of-order arrival of commands due to
commands being dropped on digest errors. In such a case, it must block
further command processing until holes are filled.
Thus, there is no getting away from implementing a sequencer at the
target. Given this, I think it is unreasonable to restrict initiator
implementation flexibility by imposing a strict ordering requirement
within the connection.
> 2.Any end-node efficiency that is sought to be achieved
> by transmitting CmdSNs out-of-order from the initiator
> would be lost on the other end-node, since the target
> now must wait for re-ordering the commands.
It has to handle this situation anyway to deal with holes caused by
digest errors. This scenario occurs even with initiators that issue
commands in order.
>
> 3.The flipside is that out-of-order transmission saves
> link bandwidth (albeit at the expense of end-node efficiency),
> compared to idling the link waiting for outbound DMA.
> We have to determine if this is a reasonable trade-off.
>
> 4.I can see Rod's point that prefetching all immediate
> data can be a burden on the NIC resources. But, two
> questions -
> - could the NIC not use unsolicited separate data
> PDUs in these cases? [ I realize that InitialR2T
> has to be "no" to let it happen... ]
> - could the NIC have a memory architecture that
> allows data prefetching for the next command (so
> this is a non-issue from the protocol perspective)?
> This scheme incurs one DMA delay for every new
> burst of commands.
>
> 5.Another (perhaps radical at this point) option is to do
> away with immediate unsolicited data, to stick only with
> separate unsolicited data. I would personally be okay
> with the choice, particularly if this feature (that
> helps software implementations) starts making hardware
> design complicated/expensive.
>
> So, to summarize -
>
>      option                          immediate       allow
>                                      data in spec?   out-of-order?
>
>  (A) (5) above                       no              no
>  (B) No real reason to do this.      no              yes
>  (C) (4) above                       yes             no
>  (D) pros & cons (1), (2) & (3)      yes             yes
>
> From the arguments I have heard so far, I am leaning towards
> option A, and option C in that order.
>
> Comments?
> --
> Mallikarjun
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668 Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> Rod Harrison wrote:
> >
> > Julian,
> >
> > I don't understand what you are proposing here. What do you mean by
> > "multiplexed" DMA?
> >
> > The problem is that the DMAs take some time; the more there are
> > queued, the longer the last DMAs queued take to complete. Some commands
> > require DMAs to complete before they can be sent, i.e. Writes with
> > immediate data; some commands do not, i.e. Reads and writes with no
> > immediate data. The iSCSI HBA wants to be able to send commands as
> > soon as possible, which for a read after a write can be before the
> > write's DMA has completed. Maintaining an ordered queue for commands
> > to be sent on the HBA is expensive and redundant since the target
> > already knows how to queue commands before committing them to its SCSI
> > layer.
> >
> > The iSCSI HBA and its host driver are not at liberty to change the
> > order of commands from the OS, but the DMAs those commands need are
> > unlikely to complete in the same order, and as I mentioned some
> > commands need no DMA. If the HBA can't send commands out of CmdSN
> > order it has to maintain an ordered queue of commands waiting to be
> > sent, and potentially buffer a lot of data. For an HBA this makes
> > immediate data almost impossible to support.
> >
> > I don't see the problem with allowing out of order commands given
> > that the target already has to deal with very similar problems. I
> > think we are getting into the area of implementation choices here,
> > which is inappropriate for a specification.
> >
> > - Rod
> >
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > Julian Satran
> > Sent: Monday, November 05, 2001 10:06 PM
> > To: ips@ece.cmu.edu
> > Subject: Re: iSCSI: Out of order commands, was current UNH Plugfest
> >
> > Rod,
> >
> > I don't see any reason why DMA operations can't be "multiplexed" with
> > commands.
> > If you have scheduled a long outbound DMA you are doomed regardless of
> > the command ordering.
> > And if you have scheduled DMA operations piecemeal then you can insert
> > your commands in the correct order.
> >
> > Julo
> >
> > "Rod Harrison" <rod.harrison@windriver.com>
> > 05-11-01 20:48
> > Please respond to "Rod Harrison"
> >
> > To: Julian Satran/Haifa/IBM@IBMIL, <ips@ece.cmu.edu>
> > cc:
> > Subject: iSCSI: Out of order commands, was current UNH Plugfest
> >
> > [ Subject changed ]
> >
> > Julian,
> >
> > The ordering difference is introduced between the host side driver
> > and the iSCSI HBA. The host side driver must present SCSI commands to
> > the HBA in the order they are received from the OS to prevent read
> > after write dependency failures. The HBA might reorder the commands
> > depending on when DMA completes. The reordering can't be done ahead of
> > time in the host driver since it doesn't know how long each DMA might
> > take. As long as the HBA assigns CmdSN in the order it receives
> > commands the desired host ordering is preserved.
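
The separation Rod describes here, with CmdSN assigned in host order but PDUs put on the wire as their DMA completes, can be sketched as a toy model. The function name, the tuple layout, and the DMA durations are assumptions made for illustration, and all DMAs are assumed to start at the same moment.

```python
def wire_order(commands, first_cmd_sn=1):
    """commands: list of (name, dma_time) in the order the OS issued them.

    Returns (cmd_sn, name) pairs in the order they would reach the wire if
    each PDU is sent as soon as its DMA finishes.
    """
    # CmdSN is assigned strictly in host order, before any DMA completes.
    numbered = [(first_cmd_sn + i, name, dma)
                for i, (name, dma) in enumerate(commands)]
    # Wire order follows DMA completion time; ties keep host order because
    # sorted() is stable.
    on_wire = sorted(numbered, key=lambda c: c[2])
    return [(sn, name) for sn, name, _ in on_wire]


host_order = [("WRITE with immediate data", 5), ("READ", 0)]
sent = wire_order(host_order)
# The READ leaves first, but its CmdSN still places it after the WRITE.
print(sent)          # [(2, 'READ'), (1, 'WRITE with immediate data')]
print(sorted(sent))  # target-side view after re-sequencing by CmdSN
```

Sorting the received pairs by CmdSN recovers the host order, which is the work the target's command queue already performs.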
> >
> > - Rod
> >
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > Julian Satran
> > Sent: Monday, November 05, 2001 12:35 AM
> > To: ips@ece.cmu.edu
> > Subject: RE: iSCSI: current UNH Plugfest
> >
> > Rod,
> >
> > In all the examples given, the point I find hard to understand is why
> > the ordering on the wire is different from the presentation order to the
> > initiator. You can get as many overlaps as you want by presenting the
> > commands to the initiator in the desired order.
> > What we are considering here is the case in which you want to ship in
> > an order different from the one in which you present the commands.
> >
> > Julo
> >
> > "Rod Harrison" <rod.harrison@windriver.com>
> > Sent by: owner-ips@ece.cmu.edu
> > 04-11-01 04:42
> > Please respond to "Rod Harrison"
> >
> > To: "Barry Reinhold" <bbrtrebia@mediaone.net>, "Dave Sheehy"
> > <dbs@acropora.rose.agilent.com>, "IETF IP SAN Reflector"
> > <ips@ece.cmu.edu>
> > cc:
> > Subject: RE: iSCSI: current UNH Plugfest
> >
> > Barry,
> >
> > In general I agree but I don't think this is as much of a corner case
> > as it at first appears. Targets will have code very similar to that
> > needed to handle out of order commands to deal with digest errors.
> > Targets also need to queue commands whilst waiting for both solicited
> > and unsolicited data to arrive. Queuing out of order commands seems
> > little extra work.
> >
> > From an initiator's point of view there are efficiency, and probably
> > performance, gains to be had from sending commands out of order. Bob
> > Russell gave the example of a read being sent whilst write data DMA is
> > happening, and a similar situation can arise with DMA for writes
> > overtaking that of earlier writes if the initiator has multiple DMA
> > engines. In this case the initiator might be forced to let the wire go
> > idle if it can't send the data from completed DMAs as soon as
> > possible.
> >
> > We already have a command queue at the target to enforce correct
> > serialisation of commands, doing the same thing at the initiator is
> > redundant.
> >
> > Finally, I don't believe we should be writing a standard to work
> > around poor coding and test coverage, especially at the cost of
> > potential efficiency gains.
> >
> > I agree with Dave and Santosh that commands being sent out of order
> > on a single session should be allowed by the standard.
> >
> > - Rod
> >
> > -----Original Message-----
> > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > Barry Reinhold
> > Sent: Friday, November 02, 2001 5:24 PM
> > To: Dave Sheehy; IETF IP SAN Reflector
> > Subject: RE: iSCSI: current UNH Plugfest
> >
> > Using features such as out of order command delivery on a connection
> > tends to be the sort of thing that leads to interoperability problems. It
> > is unexpected and probably going to hit poorly tested code paths even if
> > the standard is written to allow it.
> >
> > >-----Original Message-----
> > >From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
> > >Dave Sheehy
> > >Sent: Friday, November 02, 2001 4:19 PM
> > >To: IETF IP SAN Reflector
> > >Subject: Re: iSCSI: current UNH Plugfest
> > >
> > >
> > >
> > >> 3. Can commands be sent out of order on the same connection?
> > >>
> > >> The behavior of targets is clearly specified in Section 2.2.2.3 on
> > >> page 25 of draft 8, which says:
> > >> "Except for the commands marked for immediate delivery the iSCSI
> > >> target layer MUST deliver the commands for execution in the order
> > >> specified by CmdSN."
> > >>
> > >> Section 2.2.2.3 on page 26 of draft 8 also says:
> > >> "- CmdSN - the current command Sequence Number advanced by 1 on
> > >> each command shipped except for commands marked for immediate
> > >> delivery."
> > >> but the meaning of the term "shipped" is vague, and does not
> > >> necessarily require that the PDUs arrive on the other end of a TCP
> > >> connection in the same order that the CmdSN values were assigned to
> > >> these PDUs.
> > >>
> > >> Some initiators have been designed to send commands out of CmdSN
> > >> order on one connection. Consider the situation where there is only
> > >> one connection and a high-level dispatcher creates a PDU for a SCSI
> > >> command that involves writing immediate data to the target. This PDU
> > >> is enqueued to a lower-level layer which has to set up, start, and
> > >> wait for a DMA operation to move the immediate data into an onboard
> > >> buffer before the PDU can be put onto the wire. While this is
> > >> happening, the dispatcher creates another unrelated PDU for a SCSI
> > >> read command (for example), and when this PDU is passed to the
> > >> lower-level layer it can be sent immediately, ahead of the previous
> > >> write PDU and therefore out of order on this connection.
> > >>
> > >> The standard clearly allows this to happen if the two PDUs were sent
> > >> on different connections, and seems to imply that this can also happen
> > >> when the two PDUs are sent on the same connection.
> > >>
> > >> The suggestion is to put in the standard an explicit statement that
> > >> this is allowed or not allowed, as appropriate.
> > >>
> > >> If this is allowed, such a statement would avoid the erroneous
> > >> assumption being made by some target implementers that within a single
> > >> connection, commands will arrive in order.
> > >>
> > >> If this is not allowed, such a statement would avoid the erroneous
> > >> assumption being made by some initiator implementers that within a
> > >> single connection, commands can be put on the wire out of order.
> > >>
> > >> +++
> > >>
> > >> will add an explicit statement saying that this behaviour is forbidden.
> > >> 2.2.2.1 will contain:
> > >>
> > >> On any given connection, the iSCSI initiator MUST send the commands in
> > >> the order specified by CmdSN.
> > >>
> > >> +++
> > >
> > >Why do you feel this behavior should be forbidden? Targets already have to
> > >order commands across the session. I don't see why it's a problem to extend
> > >that to the connection as well. I, for one, believe we should take a liberal
> > >stance on this.
> > >
> > >Dave Sheehy
> > >
--
##################################
Santosh Rao
Software Design Engineer,
HP-UX iSCSI Driver Team,
Hewlett Packard, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
##################################