
    RE: Avoiding deadlock in iSCSI



    David,
    
    > -----Original Message-----
    > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
    > David Robinson
    > Sent: Monday, September 11, 2000 6:52 PM
    > To: ips@ece.cmu.edu
    > Subject: RE: Avoiding deadlock in iSCSI
    >
    >
    > Thanks for your clarifying comments.
    >
    > > > In general
    > > > I consider that to be a bug and the receiver should just drop the
    > > > data on the floor.
    > >
    > > Not as I understand solicit.
    >
    > This I don't understand: if you get data for which no command has
    > been sent yet, as the receiver you can either wait and hope the
    > command arrives, consuming buffer space but eventually dropping
    > it after some timeout, or drop it on the spot. In either case this
    > seems like a bad design we should try to avoid.
    
    As you cannot predict the relative latency of two TCP connections, there
    would be no assurance of delivery order across them.  A skew buffer would
    seem to be a requirement.
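    The skew buffer mentioned above could be sketched roughly as below: data
    arriving over a second connection may race ahead of its command, so the
    receiver parks early arrivals until the command shows up.  This is only an
    illustrative sketch; the class and names (SkewBuffer, cmd_sn) are my own,
    not taken from any iSCSI draft.

    ```python
    # Illustrative skew buffer: hold data that arrives before its command.
    class SkewBuffer:
        def __init__(self):
            self.pending_data = {}   # cmd_sn -> data segments that raced ahead
            self.known_cmds = set()  # cmd_sn values whose command has arrived

        def on_command(self, cmd_sn):
            """Record a command; release any data that arrived early."""
            self.known_cmds.add(cmd_sn)
            return self.pending_data.pop(cmd_sn, [])

        def on_data(self, cmd_sn, segment):
            """Deliver data immediately if its command is known, else buffer it."""
            if cmd_sn in self.known_cmds:
                return [segment]
            # Command not yet seen: buffer (or, per the discussion, drop).
            self.pending_data.setdefault(cmd_sn, []).append(segment)
            return []
    ```

    The alternative David describes, dropping on the spot, would simply replace
    the buffering branch with a discard.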
    
    > > > My first assumption is that the sender would not send commands
    > > > C1 and C2 and data D2 and D1 on the same connection. Doing that
    > > > creates nasty ordering problems we want to avoid.
    > >
    > > Order on the wire can not be controlled. Only the ULP can avoid such.
    >
    > Yes it can, this is exactly the advantage of using a reliable stream
    > protocol like TCP, the session layer never sees out of order packets.
    > With multiple data connections and the appropriate ordering constraints
    > we have no deadlock or buffer management issues.
    
    Each TCP stream can deliver data sequentially, but ordering is not
    preserved across multiple streams.  As such, sequential delivery goes out
    the window once connections are aggregated.  As far as the wire is
    concerned, TCP is part of the Upper Layer Protocol.
    
    > > Resources are held until associated data is received to
    > complete operations.
    > > If the resource limit is not the data buffer nor freed by
    > content already
    > > within the data buffer, this will result in discarding commands.
    >
    > But with a reliable stream no commands need to be discarded, the
    > transport flow controls so the commands are held at the sender.
    
    The balance between commands and data is not within the control of the
    target via flow control.  As such, either a data or a command resource may
    become exhausted.  At some point, either data or commands may have to be
    stopped without necessarily stopping TCP.  The means for stopping a
    command is Check Condition, and for data, discarding.
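    The two escape valves described above can be sketched as follows.  The
    limits and names here are illustrative assumptions, not values from any
    spec: when command resources run out the target rejects with Check
    Condition, and when buffer space runs out it discards data, in both cases
    without touching the TCP connection itself.

    ```python
    # Illustrative target-side resource policy: stop commands or data
    # without stopping TCP.
    CHECK_CONDITION = "CHECK CONDITION"

    class Target:
        def __init__(self, max_cmds, buf_bytes):
            self.max_cmds = max_cmds      # command resource limit (assumed)
            self.buf_bytes = buf_bytes    # data buffer limit (assumed)
            self.active_cmds = 0

        def accept_command(self):
            if self.active_cmds >= self.max_cmds:
                return CHECK_CONDITION    # stop commands, TCP keeps flowing
            self.active_cmds += 1
            return "OK"

        def accept_data(self, nbytes):
            if nbytes > self.buf_bytes:
                return "DISCARD"          # stop data, TCP keeps flowing
            self.buf_bytes -= nbytes
            return "OK"
    ```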
    
    >
    > > > With multiple data connections, some may flow
    > > > control but the active command will be able to make progress on
    > > > one connection. This may not be the most efficient mechanism but
    > > > it is "safe".
    > >
    > > One connection per LUN or one connection per command, safe but
    > expensive?
    >
    > Define "expensive", not in terms of performance as one TCP
    > connection can saturate the link layer or not in terms of memory
    > as the mux/demux state has to be held either in the transport or
    > the session layer.  I am not advocating a connection per command
    > as that is just a bad datagram protocol, but either a connection per
    > LUN or per target should work just fine.
    
    Should there be a TCP connection per LUN, the number of TCP connections
    would be large if a controller is sitting on 48 LUNs.  With asymmetrical
    connections, that would imply up to 96 TCP sessions per client, plus
    fail-over connections.  On the network, TCP shares bandwidth on a
    per-session basis.  This would mean a device with a single connection on
    the same network would then enjoy only about 1% of the bandwidth.  This
    says nothing of TCP overhead.
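    A back-of-the-envelope check of that fairness arithmetic: with TCP sharing
    per connection, a host holding 96 sessions competes against a
    single-connection device at roughly 96:1.  The numbers below just restate
    the figures in the paragraph.

    ```python
    # Fair-share arithmetic for TCP's per-connection bandwidth sharing.
    luns = 48
    sessions_per_host = luns * 2          # asymmetrical connections: 96
    total_sessions = sessions_per_host + 1  # plus one single-connection device
    single_share = 1 / total_sessions     # the lone device's fair share
    print(f"{single_share:.1%}")          # roughly 1% of the link
    ```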
    
    > > As a means for freeing resources, data is to be discarded
    > within the iSCSI
    > > architecture.  As such, even unsolicited data may be requested by the
    > > target.
    >
    > I don't understand this statement. Short of target errors or connection
    > errors, why does data need to be discarded? The sender should never
    > send data without a command, and on a given connection the data MUST
    > always be sent after the corresponding command and data from two
    > commands must always be sent in order if on the same connection.
    
    The iSCSI means of limiting the amount of data presented in an unsolicited
    fashion is to discard it.  For a given amount of buffer space, the number
    of commands associated with that space is unknown.  Staging these commands
    adds overhead on a per-command basis, not a per-data basis.  As such, it
    would be like adding water to a box of rice: within a margin that stays
    out of trouble, how much of the buffer would you be wasting to handle all
    situations?  And what if you err due to the latency of responding?
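    One way to see the provisioning problem above: because staging overhead is
    paid per command rather than per byte, the usable fraction of a fixed
    buffer depends on the (unknown) mix of command sizes.  All the numbers and
    the helper name below are illustrative assumptions.

    ```python
    # Usable fraction of a fixed buffer when each command carries a fixed
    # staging overhead in addition to its data.
    def usable_fraction(buf_bytes, per_cmd_overhead, avg_data_per_cmd):
        cmds = buf_bytes // (per_cmd_overhead + avg_data_per_cmd)
        return cmds * avg_data_per_cmd / buf_bytes

    # Many small commands waste proportionally more buffer on overhead
    # than a few large ones, for the same total space.
    small = usable_fraction(1 << 20, 512, 4096)    # 4 KiB commands
    large = usable_fraction(1 << 20, 512, 65536)   # 64 KiB commands
    print(small, large)
    ```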
    
    Doug
    
    



Last updated: Tue Sep 04 01:07:21 2001