RE: Avoiding deadlock in iSCSI

To: "'csapuntz@cisco.com'" <csapuntz@cisco.com>, ips@ece.cmu.edu
Subject: RE: Avoiding deadlock in iSCSI
From: Jim McGrath <Jim.McGrath@quantum.com>
Date: Mon, 11 Sep 2000 18:37:37 -0700
Content-Type: text/plain;charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu


Note that SCSI targets do not stop reading from the interconnect when their
command queue fills.  If more commands are received, they respond with a
QUEUE FULL status.  If data is received, they accept it without regard for
the status of the command queue (as long as it is data for an already queued
command).  This eliminates the potential for deadlock between command and
data queues.
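
As a rough illustration of that rule, here is a minimal sketch in Python.
The class name, the method names and the tag-based lookup are assumptions
made for the example and do not come from any SCSI or iSCSI specification.

```python
QUEUE_FULL = "QUEUE FULL"
ACCEPTED = "ACCEPTED"


class Target:
    """Hypothetical target that never stops reading from the interconnect."""

    def __init__(self, queue_depth):
        self.queue_depth = queue_depth
        self.queued = {}  # command tag -> list of data chunks received so far

    def on_command(self, tag):
        # A full queue rejects the new command; it does not stop the reader.
        if len(self.queued) >= self.queue_depth:
            return QUEUE_FULL
        self.queued[tag] = []
        return ACCEPTED

    def on_data(self, tag, chunk):
        # Data for an already queued command is accepted no matter how
        # full the command queue is.
        if tag not in self.queued:
            return False
        self.queued[tag].append(chunk)
        return True
```

Because on_command() and on_data() are decided independently, a full
command queue never blocks data that a queued command is waiting for.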

Potentially you can have a situation where multiple commands already at the
target have only part of their data transmitted, with the remainder still at
the initiator(s), and the target then runs out of buffer space for data.  If
the target uses a credit model to pace the reception of data, it can make
sure this never happens.  Unsolicited data, even for commands already queued,
can end up creating this deadlock, which is why unsolicited data systems
either have to have a tight limit on the resources they can use (e.g. low
login BB credit in Fibre Channel terms) or some sort of clean (i.e. not IO
terminating) rejection mechanism from target to initiator (like in USB).

If data is not received by the target, then it thinks the initiator has more
credits than the initiator thinks it does.  If the data was delayed in
transit, then the target still thinks the credit is outstanding to the
initiator, and so should not reuse it yet for that or any other initiator.
You may get a pause in the data transfer, but you should not get a deadlock.
If the data was lost then some sort of timeout catches the problem.  
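
A minimal sketch of such credit bookkeeping on the target side, in Python.
The names (CreditPool, grant, expire and so on), the one-buffer-per-credit
accounting and the explicit clock values are all assumptions made for the
illustration.

```python
class CreditPool:
    """Hypothetical target-side credit accounting, one buffer per credit."""

    def __init__(self, buffers):
        self.free = buffers       # buffers not promised to anyone
        self.outstanding = {}     # (initiator, tag) -> deadline

    def grant(self, initiator, tag, now, timeout):
        # Never promise more data than there is buffer space for.
        if self.free == 0:
            return False
        self.free -= 1
        self.outstanding[(initiator, tag)] = now + timeout
        return True

    def data_arrived(self, initiator, tag):
        # The credit is consumed; its buffer now holds data.
        return self.outstanding.pop((initiator, tag), None) is not None

    def release_buffer(self):
        # Call once the buffered data has been written out.
        self.free += 1

    def expire(self, now):
        # Data that never showed up: reclaim the credit after a timeout.
        lost = [k for k, deadline in self.outstanding.items() if deadline <= now]
        for key in lost:
            del self.outstanding[key]
            self.free += 1
        return lost
```

While a credit is outstanding its buffer is not offered to any other
initiator, so delayed data only causes a pause; expire() is the timeout
that catches lost data.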

In sum, if existing SCSI rules are used there should not be a problem except
for receiving data before its command or getting delayed data.  The target
can distinguish between the two cases.  In the first case you probably
should drop the data (but notify the initiator so recovery can be done). In
the second, you just wait until the data catches up - as long as the delay
is small, things are manageable.  Otherwise drop the data and notify the
initiator (i.e. handle just like the case of data before a command).
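
The two cases could be told apart roughly as in the Python sketch below.
The function names, the use of a command tag to match data to commands, and
the max_wait threshold are assumptions made for the illustration.

```python
ACCEPT = "accept"
WAIT = "wait"
DROP_AND_NOTIFY = "drop and notify initiator"


def on_data(tag, queued_tags):
    # Case 1: data whose command has not been seen yet is dropped, and
    # the initiator is told so that recovery can be done.
    return ACCEPT if tag in queued_tags else DROP_AND_NOTIFY


def on_missing_data(waited, max_wait):
    # Case 2: the command is queued but its data is late. Wait while the
    # delay is small, then handle it like case 1.
    return WAIT if waited <= max_wait else DROP_AND_NOTIFY
```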

Obviously the initiator should not try to generate these problems for the
target, but as someone pointed out, the target has to have some defined
behavior if the initiator (or fabric) introduces the problem.

Jim



-----Original Message-----
From: csapuntz@cisco.com [mailto:csapuntz@cisco.com]
Sent: Monday, September 11, 2000 3:03 PM
To: ips@ece.cmu.edu
Cc: csapuntz@cisco.com
Subject: Avoiding deadlock in iSCSI



The problem:

iSCSI, as currently spec'ed, allows SCSI commands and data to be
interleaved fairly freely on a TCP connection. A target that stops
reading from a TCP connection to avoid reading more command packets
also prevents itself from reading data packets.  Those data packets
may be critical to making progress on the currently executing
command.

Note the issue appears with one TCP connection for control and data
and even appears in many of the multiple connection schemes.

Data in iSCSI comes in two forms:

	1) solicited - data requested by target via RTT
	             - data requested by initiator via a SCSI command
	2) unsolicited - data sent by initiator without having received an
	                 RTT

The analysis below assumes that unsolicited data travels over the same
TCP connection as SCSI commands. Otherwise, you run the risk of receiving
unsolicited data before the relevant SCSI command (thus making
implementations more complex).

Four solutions:

1) Don't overflow the command queue (i.e. use credits)
	- and what do you do if a misbehaving initiator overflows
        your command queue anyway? Drop the connection?
	
	- requires you to reserve resources per initiator. some people
        may want to overcommit

2) Allow dropping of SCSI commands when queue fills
	- how do you clean up after a dropped SCSI command?
	    - there may be other commands in the pipeline
	
	One approach: On command drop, the target enters an error
	state. While in the error state, all newly received commands
	terminate with an error until the initiator explicitly clears
	the error state using a "clear error state" message.

	You might think that TASK SET FULL and ACA mechanisms from SCSI
        could be used to attack this problem. However, TASK SET FULL errors
	don't trigger ACA (in my reading of the SAM). Also, ACA is only
	triggered by the current enabled command, not by random commands
	entered into the task set.

3) Put solicited data on a dedicated TCP connection. Require that
unsolicited data MUST follow the command, ideally in the same iSCSI
PDU.

4) (Do it like NFS) Make all transfers from initiator to target
unsolicited. Make sure unsolicited data follows the command
immediately.
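
The error-state approach described under solution 2 might look roughly like
the Python sketch below. The class name, the method names and the returned
strings are assumptions made for the illustration; the draft defines no
such message yet.

```python
class ErrorStateTarget:
    """Hypothetical target that enters an error state on command drop."""

    def __init__(self, queue_depth):
        self.queue_depth = queue_depth
        self.queue = []
        self.in_error = False

    def on_command(self, tag):
        # While in the error state, every new command terminates with an
        # error, including commands that were already in the pipeline.
        if self.in_error:
            return "error"
        if len(self.queue) >= self.queue_depth:
            self.in_error = True  # dropping a command enters the error state
            return "dropped"
        self.queue.append(tag)
        return "queued"

    def on_clear_error(self):
        # The initiator's explicit "clear error state" message.
        self.in_error = False
```

Commands sent behind the dropped one fail until the initiator clears the
state, so nothing executes out of order after a drop.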
   

Of all the options, #1 and #4 sound the easiest to implement. #2 is more
sophisticated than #1. #3 is just plain clever but that's rarely a good
thing. :)  #4 has large ramifications on current SCSI target designs.

-Costa
