RE: RE: RE: Re: multiple connections

To: ips@ece.cmu.edu
Subject: RE: RE: RE: Re: multiple connections
From: somesh_gupta@hp.com
Date: Fri, 8 Sep 2000 13:22:02 -0700
Sender: owner-ips@ece.cmu.edu

I apologise in advance for being confused about the direction of
flow implied in parts of the message; please help clarify it
so I can understand better.

> -----Original Message-----
> From: rsnively@Brocade.COM [mailto:rsnively@Brocade.COM]
> Sent: Friday, September 08, 2000 8:07 AM
> To: somesh_gupta@hp.com; ips@ece.cmu.edu; rsnively@Brocade.COM;
> matt_wakeley@agilent.com
> Subject: FW: RE: RE: Re: multiple connections
> 
> 
> 
> Something is missing here.  A window based outbound flow control is
> useless to a RAID.  The problem is that the data buffer structure is
> filled from two sides.  Information is flowing outbound for write
> operations.

-- Does outbound here mean outbound from initiator to target ("write"
   seems to imply that)? If so, this would be controlled by the window
   size indicated by the target to the initiator.

> Information is flowing into the data buffer in the inbound direction
> as a result of receiving and executing the small command packets.

-- Here is where I start to get confused. Does inbound imply from the
   target to the initiator (as opposed to outbound)? However, you
   say "receiving and executing the small command packets", which seems
   to imply from initiator to target.

> As a result, the available buffering for outbound operations may be
> zero at any time in a manner scheduled by the target, not the host.
> In addition, during the maintenance of a RAID device's internal
> redundancy, additional buffer space is used up that is unknown to
> both the read and write operations.
> 
> The only ways I see to avoid a buffer resource deadlock are:
> 
> 	1)  Send a transfer ready indication for each outbound transfer.
> 		(This is the universal SCSI solution).
> 
> 	2)  Make the rather ridiculous and unenforceable rule that 
> 		the sum of all the SCSI buffer areas
> 		for all initiators be less than the buffer space
> 		of each logical unit in a RAID.
> 
> Note that communication with each initiator about the available target
> buffering may also not be sufficient, since any single initiator could
> use it all up in an instant.

-- I think (as discussed in one of the other threads) that three things
   together should provide the protection: an indication of command
   queue depth to manage the buffers used up by commands, window
   based flow control, and recovery in the worst case when the buffers
   overflow (which requires command numbering etc). Sending write data
   without RTT (especially for small writes), together with the
   reasonable streaming that queue depth indications and windows allow,
   should provide very good performance.
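
-- To make the combination above concrete, here is a minimal sketch in
   Python of a target-side admission check. It is only an illustration:
   the names (TargetPort, queue_depth, window_bytes, cmd_sn) and the
   reject reasons are hypothetical and not taken from any iSCSI draft.
   It assumes the target advertises a queue depth and a byte window per
   connection, and that commands carry a sequence number so a rejected
   write can be resent.

```python
class TargetPort:
    """Toy model of one initiator-to-target connection (hypothetical names)."""

    def __init__(self, queue_depth, window_bytes):
        self.queue_depth = queue_depth    # max outstanding commands advertised
        self.window_bytes = window_bytes  # buffer bytes advertised for write data
        self.outstanding = 0              # commands accepted but not completed
        self.buffered = 0                 # write data bytes currently held
        self.expected_cmd_sn = 1          # next command number the target accepts

    def submit_write(self, cmd_sn, length):
        """Accept a write sent without transfer-ready, or say why it must be resent."""
        if cmd_sn != self.expected_cmd_sn:
            return "out-of-order"         # command numbering detects gaps
        if self.outstanding >= self.queue_depth:
            return "queue-full"           # queue depth indication exceeded
        if self.buffered + length > self.window_bytes:
            return "buffer-overflow"      # worst case: initiator must recover
        self.outstanding += 1
        self.buffered += length
        self.expected_cmd_sn += 1
        return "accepted"

    def complete(self, length):
        """Release the command slot and buffer space once the write is on disk."""
        self.outstanding -= 1
        self.buffered -= length


port = TargetPort(queue_depth=2, window_bytes=8192)
results = [port.submit_write(sn, 4096) for sn in (1, 2, 3)]
print(results)                      # third write is refused; the queue is full
port.complete(4096)                 # one write finishes, freeing a slot
print(port.submit_write(3, 4096))   # the resend of command 3 now fits
```

   A rejected command does not advance expected_cmd_sn, so the initiator
   can resend the same command number once space frees up.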

-- It might be worthwhile to do an analysis based on some "real"
   numbers to see how much memory would be needed and what that
   would cost, what window sizes and queue depths would be
   communicated, and what data rate they would sustain. If a
   couple of people are willing to work with me offline, I
   would be game.
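
-- As a starting point for such an analysis, the usual relation is that
   a window can sustain at most (window size / network round-trip time)
   bytes per second, and the worst-case target memory is the sum of the
   windows extended on all connections. Below is a rough Python sketch
   of that arithmetic. The function names are hypothetical and every
   number in it is a made-up placeholder, not a measurement. Note that
   "round trip" here means network round-trip time, not the SCSI
   ready-to-transfer (RTT) discussed elsewhere in this message.

```python
def window_for_rate(rate_bytes_per_s, round_trip_us):
    """Window (bytes) needed to keep a link busy at the given rate."""
    return rate_bytes_per_s * round_trip_us // 1_000_000


def sustainable_rate(window_bytes, round_trip_us):
    """Upper bound on bytes/s that one window can sustain."""
    return window_bytes * 1_000_000 // round_trip_us


def worst_case_buffer(window_bytes, connections):
    """Target memory needed if every connection fills its window at once."""
    return window_bytes * connections


# Placeholder inputs, chosen only to show the arithmetic.
rate = 100_000_000     # 100 MB/s per connection (assumed)
round_trip_us = 1_000  # 1 ms network round trip (assumed)
window = 64 * 1024     # 64 KiB window (assumed)
connections = 1_000    # connections into one array (assumed)

print(window_for_rate(rate, round_trip_us))     # bytes of window needed
print(sustainable_rate(window, round_trip_us))  # bytes/s one window allows
print(worst_case_buffer(window, connections))   # bytes of target memory
```

   In practice the array would rely on statistical sharing rather than
   the worst case, so the last figure is an upper bound, not a sizing
   recommendation.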

> 
> >  > >  
> >  > >  2. WRITES - This is the really bad one in my opinion. For me,
> >  > >  avoiding RTTs in iSCSI would just by itself make iSCSI a superior
> >  > >  "transport" for SCSI. So assuming RTTs are not being used, the
> >  > >  host would (changing the posting order from READs), first post
> >  > >  the write command to the connection on which the data is to be
> >  > >  sent, and then post the write buffers to the connection which is
> >  > >  to be used for sending data.
> >  > >  One case is where for whatever reason, the target gets the data
> >  > >  before it gets the command, and has no clue what to do with the
> >  > >  data. Let us assume that the target does get the command before
> >  > >  it gets the data. The target gets a command indicating that data
> >  > >  being written to whatever lun and whatever location is going to
> >  > >  arrive on some other connection. First the target has taken an
> >  > >  extra event from the adapter. Then, when data arrives on another
> >  > >  NIC (and the target gets another event), the target goes through
> >  > >  the list of outstanding WRITE commands to match the command with
> >  > >  the data and then go about the business of processing the data.
> >  > >  
> >  > >  So in this case the work has increased for the initiator as well
> >  > >  as the target.
> >  > 
> >  > 
> >  > And to top it off, the target is active with a large number of
> >  > initiators on behalf of a large number of logical units for a large
> >  > number of queued commands, so there is absolutely no guarantee that
> >  > any buffer exists for the data that was received, even if the
> >  > command had been received first.
> >  > 
> >  > Every SCSI command execution is managed by the target for this
> >  > reason among many others.  Thus RTT has been a part of every SCSI
> >  > protocol for write operations.
> >  > 
> >  
> >  TCP window size should provide a reasonable flow control mechanism,
> >  so an additional flow control should not be needed. There will be
> >  some statistical determination of the amount of memory needed in
> >  a large array vs the window size extended on each connection.
> >  
> >  I don't know if arrays like to keep commands in separate memory
> >  from data, in which case a command queue depth may have to be
> >  communicated separately (assuming most of the window would typically
> >  be used for data).
> >  
> >  
>

Follow-Ups:
- Enhancements for the iSCSI
  - From: "Nelson Nahum" <nnahum@store-age.com>
