RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"

To: "'Michael Krause'" <krause@cup.hp.com>, Stephen Bailey <steph@cs.uchicago.edu>
Subject: RE: iSCSI ERT: data SACK/replay buffer/"semi-transport"
From: Robert Snively <rsnively@Brocade.COM>
Date: Fri, 6 Apr 2001 13:17:13 -0700
Cc: ips@ece.cmu.edu
Sender: owner-ips@ece.cmu.edu

Michael,

Let me explain why I feel that the iSCSI environment is
different from the college-student music-downloading
environment studied by Stone and Partridge.

All iSCSI targets are new designs.  They do not ride on
obsolete consumer TCP/IP stacks, but rather on proprietary
TCP/IP stacks, hardware assisted TCP/IP stacks, and 
robust embedded Unix TCP/IP stacks.

Most iSCSI initiators are new designs.  While some may ride
on present host TCP/IP stacks, it is likely that most
will be new designs or hosted by robust and up-to-date
TCP/IP stacks.

All storage applications are debugged in an end-to-end
manner impossible for most other applications.  The 
"write/read/compare" and performance testing required of 
storage eliminates almost all the bugs that were contributing
to the higher numbers found in the Stone and Partridge
environment.

That leaves only the residual error rate, which may allow
TCP/IP delivery of segments with undetected errors less
than one time in ten billion, not 1 in 16 million.
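
To put those two rates side by side, here is a rough back-of-envelope
sketch.  The 1460-byte segment size and the fully loaded 1 Gbit/s link
are my own assumptions for illustration, not measurements; only the
"1 in 16 million" and "1 in ten billion" figures come from the
discussion above.

```python
# Back-of-envelope: mean time between undetected-error segments.
SEGMENT_BYTES = 1460        # assumed TCP payload per segment
LINK_BITS_PER_SEC = 1e9     # assumed 1 Gbit/s link, fully utilised

segments_per_sec = LINK_BITS_PER_SEC / 8 / SEGMENT_BYTES


def seconds_between_escapes(one_in: float) -> float:
    """Mean seconds between escapes if one segment in `one_in` slips through."""
    return one_in / segments_per_sec


stone_partridge = seconds_between_escapes(16e6)   # 1 in 16 million
residual = seconds_between_escapes(10e9)          # 1 in ten billion

print(f"1 in 16 million:  one escape about every {stone_partridge / 60:.1f} minutes")
print(f"1 in ten billion: one escape about every {residual / 3600:.1f} hours")
```

Under those assumptions the gap is a few minutes versus more than a
day between escapes on a saturated link, which is why it matters which
figure applies.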

The most interesting problem is that most of the additional
verification, including the packet CRC, is calculated through the
same hardware stack that calculates the TCP/IP checksum.  It is
therefore susceptible to many of the same errors, which are then
blessed by both a valid CRC value and a valid TCP/IP checksum.
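
A minimal sketch of that failure mode, assuming a hypothetical sender
that computes its digest over whatever is in the buffer at send time.
The send() and verify() helpers are made up for illustration, and
zlib.crc32 is only a stand-in for whatever CRC would be chosen.

```python
import zlib
from typing import Tuple


def send(buffer: bytes) -> Tuple[bytes, int]:
    """Hypothetical sender: the CRC covers the buffer as it is right now."""
    return buffer, zlib.crc32(buffer)


def verify(payload: bytes, digest: int) -> bool:
    """Hypothetical receiver check: recompute the CRC and compare."""
    return zlib.crc32(payload) == digest


intended = b"write this block"
# Assumed fault: one bit flips in the buffer BEFORE the CRC is computed.
corrupted = bytes([intended[0] ^ 0x01]) + intended[1:]

wire_payload, wire_digest = send(corrupted)
passes_check = verify(wire_payload, wire_digest)   # digest matches the bad data
matches_intent = wire_payload == intended          # but the data is wrong

print(passes_check, matches_intent)
```

The receiver's check passes even though the payload is not what the
application meant to write; the digest only protects data from the
point where it was computed.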

The net I draw from this is that careful design is key to
success and that a CRC or positionally dependent checksum
on the iSCSI data packets is probably a good idea.  However,
retry of iSCSI data packets may not be necessary.
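
As a rough illustration of what "positionally dependent" buys, the
sketch below swaps two 16-bit words in a payload and compares three
checks.  The payload and the swap are hypothetical; Fletcher-16 and
zlib.crc32 are stand-ins I picked for the example, not a proposal for
what iSCSI should use.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum of 16-bit words, the style TCP uses."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fletcher16(data: bytes) -> int:
    """Fletcher-16: the second running sum makes the result order dependent."""
    s1 = s2 = 0
    for b in data:
        s1 = (s1 + b) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1


original = b"BLOCK-0001BLOCK-0002"
# Hypothetical corruption: the first two 16-bit words change places.
swapped = original[2:4] + original[0:2] + original[4:]

same_tcp = internet_checksum(original) == internet_checksum(swapped)
same_fletcher = fletcher16(original) == fletcher16(swapped)
same_crc = zlib.crc32(original) == zlib.crc32(swapped)

print("TCP-style sum unchanged:", same_tcp)
print("Fletcher-16 unchanged:  ", same_fletcher)
print("CRC-32 unchanged:       ", same_crc)
```

The plain sum cannot see the reordering because addition does not care
about word order; the other two checks do change for this swap.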

Bob 

>  -----Original Message-----
>  From: Michael Krause [mailto:krause@cup.hp.com]
>  Sent: Thursday, April 05, 2001 10:10 AM
>  To: Stephen Bailey
>  Cc: ips@ece.cmu.edu
>  Subject: Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"
>  
>  
>  At 09:12 AM 4/3/2001 -0400, Stephen Bailey wrote:
>  > > The Stone and Partridge paper is mostly not applicable to an iSCSI
>  > > environment.  The principal failure mechanisms were major software
>  > > bugs in the driver stack of PC-oriented machines.
>  
>  People make mistakes in all implementations.  Examination of other similar
>  packet processing technology for mistakes is applicable to any effort and
>  one should perform a risk assessment as to the probability of the mistakes
>  being repeated here.  The fact that the mistakes were in PC-oriented
>  machines is basically irrelevant and storage is not immune from having
>  similar mistakes (have seen storage implementations that were just as poor
>  in terms of quality as any other segment of the industry).
>  
>  
>  >I'm in complete agreement with Bob.
>  >
>  >I haven't seen a good analysis of TCP checksum escapes which resulted
>  >from intermediary manipulation (I haven't read the papers, but
>  >hopefully soon), but my hunch is that it's incredibly rare.
>  >
>  >An endpoint-precipitated TCP checksum `escape' would also escape a CRC
>  >or any other similar integrity check.  That is why I think all this
>  >additional integrity checking (on iSCSI headers & data), is an
>  >incredible amount of extra work (not just in computing the CRCs, but
>  >also in designing the SACK mechanism and recovery for digest failures)
>  >for no real gain.
>  
>  I agree that some of the recovery is overkill but disagree that error
>  detection is as well.  At a minimum, one needs to have a strong end-to-end
>  error detection mechanism.  Many believe a 16-bit checksum is not adequate
>  to protect their data and given the importance of this data to our
>  customers, most feel the specification must define such a mechanism (with
>  some having strong feelings that this mechanism should NOT be
>  optional).  Now whether we need to have 2 CRCs, etc. is a separate debate
>  but they need to be there and most of us will require that they be used in
>  any product / solution delivered to the customer.
>  
>  >The real loss is that it's immensely slowing time-to-market for iSCSI
>  >(both in the front end specification and the back end implementation).
>  
>  A fast TTM solution that is not the highest quality (prevents silent data
>  corruption) will lead to customer distrust and a repeat of the FC adoption
>  rate - only 10 years later has it really started to penetrate customer
>  solutions.
>  
>  
>  >A straw-man proposal (very unpopular given where we are, I know) would
>  >be to specify iSCSI without additional integrity checks (other than
>  >what you can get through security mechanisms, which is probably not
>  >visible to iSCSI anyway), and if that `fails' (I'm sure it won't), we
>  >can put an integrity shim between iSCSI and the transport.
>  >
>  >One example of how to do this would be Julian's TAF.  Another would be
>  >the WARP RDMA layer.
>  
>  If another layer is put in place that provides data integrity, then it is
>  redundant to do this at the iSCSI layer as well and this is one place where
>  an option can be used, i.e. one negotiates the underlying framing mechanism
>  (e.g. WARP) and if it is present, then iSCSI does not activate the CRC
>  services.  If it is not, then it does, thereby ensuring that there is always
>  end-to-end data integrity present in the solution.
>  
>  
>  >We don't have to specify how to do this now
>  
>  If this is to be supported then it should be specified now (can be done
>  rather opaquely by just setting a "transport services" attribute for strong
>  end-to-end data integrity protection).
>  
>  >, and the point is that
>  >it's hard to do so, because we really don't know what problem we're
>  >solving with it.  We're OK as long as we have a way to address it in
>  >the future without completely chucking what already exists.
>  >
>  >The other point to remember is that iSCSI still has to make the
>  >ID->Proposed->Draft->Internet traversal, and anybody that thinks it's
>  >going to do that on the first try is kidding themselves.  It's more
>  >important to get SOMETHING out there that exposes the implementation
>  >holes than to design a cathedral on paper.
>  
>  Nothing is perfect the first time out but in the tightening economy and
>  increasing customer quality demands from the get-go, the trade-off between
>  quality / reliability and TTM is not something people should rush to
>  make.  The market is not what it used to be where good enough was alright;
>  customers expect more today and with good cause.
>  
>  Mike
>  
>
