
    TCP checksum escapes and iSCSI error recovery design



    All,
    In designing the iSCSI error recovery mechanisms, there has been
    considerable focus on the general deficiency of the TCP checksum.  It
    seems that what iSCSI
    error recovery should really be based upon is the overall profile of TCP
    checksum escapes for the networks that will carry iSCSI traffic ("checksum
    escapes" being defined as those cases where corrupted data is passed through
    TCP and delivered to the upper layer protocol as good).  The design of iSCSI
    error recovery needs to start with a clear and agreed upon set of
    assumptions regarding the profile for TCP checksum escapes across space and
    time. Thus, the question is not simply "How often is corrupted data passed
    through TCP and delivered to iSCSI", but more along the lines of  "What is
    the distribution of TCP checksum escapes across network paths and through
    time?"
    
    iSCSI error recovery should be neither over-designed nor under-designed,
    and its design will be fundamentally different depending on whether the
    assumption is that checksum escapes (or bursts of them) occur every 5
    minutes, every 5 hours, or every 5 days, and on whether the assumption is
    that there are a few bad paths or that all paths are bad sometimes.
    
    It would make sense for iSCSI to have minimal error recovery if the spatial
    profile (discussed below) of TCP errors were bi-modal (i.e. there are good
    paths and bad paths). IOW, if the checksum escapes happen deterministically
    always or not at all, iSCSI can simply drop the connection on a digest
    failure, because a digest failure indicates the path has switched into
    error mode.  OTOH, we cannot adopt this simplistic approach if TCP can be
    in the shades of gray between the two modes.  The email discussion included
    below indicates that load levels and memory usage models along the path
    play a key role in determining this aspect.  Employing an iSCSI-level
    digest/CRC appears to be the wise approach regardless of the TCP
    operational model (bimodal or otherwise), to detect escapes when they
    happen.  IOW, the principle of "trust but verify" applies here.  The
    question is what action is appropriate for iSCSI to take if the
    verification detects data corruption.
    
    To start this discussion toward a set of assumptions regarding TCP checksum
    escape profiles and an appropriate iSCSI error recovery design, this email
    includes responses from several individuals working in this area (Vern
    Paxson, Craig Partridge, Jonathan Stone), along with links to their original
    papers.
    
    Regards,
    Jim Wendt
    Networked Storage Architecture
    Network Storage Solutions Organization
    Hewlett-Packard Company / Roseville, CA
    Tel: +1 916 785-5198
    Fax: +1 916 785-0391
    Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>
    
    
    ----------------------------------------------------------
    Here is a copy of my original email to Vern Paxson, Craig Partridge,
    Jonathan Stone, and Jeff Chase:
    
    Hi All,
    I'm e-mailing this to Vern, Craig, Jonathan, and Jeff in hopes of gathering
    information around TCP checksum failure profiles and Internet data failure
    characteristics in general.  I'm Jim Wendt and I've recently started working
    on both the iSCSI error recovery and RDMA/TCP-Framing activities. I'm
    emailing you directly because of your specific research and experience
    regarding TCP checksum failures and behavior.
    
    My immediate interest is in regards to TCP checksum failure rates
    (undetected errors) and how iSCSI should handle these undetected errors.  As
    you probably already know, the iSCSI protocol is a mapping of SCSI device
    protocols onto TCP (or SCTP eventually).  I have read the three papers (When
    the CRC and TCP Checksum Disagree, Performance of Checksums and CRCs over
    Real Data - both versions, End-to-End Internet Packet Dynamics). 
    
    The iSCSI WG has come to the conclusion that the TCP checksum provides an
    insufficient level of data protection, and that a CRC-32 will be used for
    data integrity verification on each iSCSI PDU (the specific CRC polynomial
    to employ is being studied now). Thus, the assumption is that if corrupted
    payload data does pass through TCP and is undetected by the TCP checksum,
    then the corruption will be detected at the iSCSI PDU level via the CRC-32
    check.
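
    As a rough illustration of the check described above -- not the actual
    iSCSI wire format -- here is a minimal Python sketch of an iSCSI-style
    data digest appended to each PDU and verified on receipt.  zlib.crc32
    (the IEEE 802.3 polynomial) is used purely as a stand-in, since the
    specific polynomial is still under study; the framing and function names
    are hypothetical.

    import struct
    import zlib

    def append_digest(pdu_payload: bytes) -> bytes:
        """Append a CRC-32 data digest to an outgoing payload (illustrative)."""
        digest = zlib.crc32(pdu_payload) & 0xFFFFFFFF
        return pdu_payload + struct.pack("!I", digest)

    def verify_digest(pdu_with_digest: bytes) -> bool:
        """Return True if the trailing CRC-32 matches the payload.

        A False result is exactly the "checksum escape" case: data that TCP
        delivered as good but that the upper-layer digest shows is corrupted.
        """
        payload, trailer = pdu_with_digest[:-4], pdu_with_digest[-4:]
        (expected,) = struct.unpack("!I", trailer)
        return (zlib.crc32(payload) & 0xFFFFFFFF) == expected

    if __name__ == "__main__":
        pdu = append_digest(b"example SCSI write data")
        assert verify_digest(pdu)                    # clean delivery
        corrupted = pdu[:5] + b"\x00" + pdu[6:]      # corrupt one payload byte
        assert not verify_digest(corrupted)          # caught at the iSCSI level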
    
    What we are considering now is what level and philosophy of error handling
    and recovery should be put into iSCSI. It seems to me that fundamentally different
    error handling behaviors would be put into iSCSI based on a known rate or
    profile of occurrence of bad PDUs (those PDUs that are passed up through TCP
    as being OK, but for which the iSCSI level CRC-32 indicates a data error).
    Thus, iSCSI error handling might be defined differently if the expected PDU
    error rate is one every 5 minutes as opposed to one every 5 hours (or 5
    days). Also, the error handling might be different if errors are bursty
    (i.e. a sequence of bad PDUs) rather than evenly spread through time.
    
    I would appreciate hearing your thoughts (and any supporting data references
    I could employ) regarding the nature of TCP checksum failure profiles across
    time and space (i.e. network paths).   More specifically:
    
    * How does the Internet work in the face of TCP error escapes?  If there
    truly is a relatively high level of application level data corruption
    (perhaps 1 in 16M packets), then how does the Internet and associated
    applications manage to function?
    
    * What is the distribution of errors across network paths?  Are most network
    paths very good (low TCP error escapes) with only a few paths that are
    really bad, or are TCP error escapes more evenly distributed across all
    network paths?  If the spatial profile is that there are good and bad paths,
    then do the error escape rates on these two classes of paths correspond to
    the high/low bounds for TCP error escapes (1 in 16 million / 1 in 10
    billion)?
    
    * What is the distribution of errors through time?  Do TCP error escapes
    occur individually and randomly through time, or are TCP error escapes more
    bi-modal where most of the time there are no errors and occasionally there
    is a clump or burst of TCP error escapes?  If the temporal profile for TCP
    error escapes is that there are good periods and infrequent but severe bad
    periods, then what is the duty cycle for these periods (how bad/good for how
    long), and what are the error escape rates during these periods?
    
    * What about a stronger TCP checksum?  I don't believe that anyone ever
    actually employed RFC 1146 (TCP Alternate Checksum).  Has there been any
    recent thinking about actually improving TCP's end-to-end data integrity
    checking?  I suppose that the existing middle box infrastructure won't allow
    for this. However, I'm considering submitting a draft and starting to push
    for a stronger TCP checksum or CRC, but I would like to get feedback from
    all of you on the technical feasibility and possible acceptance of this
    proposal before taking it to the public forums.
    
    * What about a stronger SCTP checksum? SCTP currently uses the Adler-32
    checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
    Has there been any thinking in this direction? Again, I am considering
    pursuing this with the SCTP folks but would appreciate any feedback you have
    to offer first.
    
    Thanks,
    Jim
    
    Jim Wendt
    Networked Storage Architecture
    Network Storage Solutions Organization
    Hewlett-Packard Company / Roseville, CA
    Tel: +1 916 785-5198
    Fax: +1 916 785-0391
    Email: jim_wendt@hp.com <mailto:jim_wendt@hp.com>
    
    
    ----------------------------------------------------------
    Craig Partridge writes:
    
    Hi Jim:
    
    Here's my quick set of answers, which Vern, Jonathan and others can
    refine.
    
    >* How does the Internet work in the face of TCP error escapes?  If there
    >truly is a relatively high level of application level data corruption
    >(perhaps 1 in 16M packets), then how does the Internet and associated
    >applications manage to function?
    
    The answer is that applications appear to function because people don't
    notice the errors or resort to backups, or whatever.  It isn't clear
    exactly how we're managing to survive.  (Back in the old days when NFS
    had these problems, people used to assume a disk failure had occurred when
    in fact the network had trashed their data).
    
    >* What is the distribution of errors across network paths?  Are most
    >network paths very good (low TCP error escapes) with only a few paths that
    >are really bad, or are TCP error escapes more evenly distributed across
    >all network paths?  If the spatial profile is that there are good and bad
    >paths, then do the error escape rates on these two classes of paths
    >correspond to the high/low bounds for TCP error escapes (1 in 16 million /
    >1 in 10 billion)?
    
    The answer is that there are good and bad paths.  On good paths you'll
    probably see fewer escapes than 1 in 10 billion -- you can probably treat
    them as essentially error free.  If you start seeing a modest number of
    errors, then either (a) there's a broken router in your path or (b)
    there's a broken end system.  If we had a better way of identifying the
    source of errors, I'd say that if you ever see an error from a host,
    declare it broken.
    
    >* What is the distribution of errors through time?  Do TCP error escapes
    >occur individually and randomly through time, or are TCP error escapes more
    >bi-modal where most of the time there are no errors and occasionally there
    >is a clump or burst of TCP error escapes?  If the temporal profile for TCP
    >error escapes is that there are good periods and infrequent but severe bad
    >periods, then what is the duty cycle for these periods (how bad/good for
    >how long), and what are the error escape rates during these periods?
    
    They're not random, but I don't think we know enough about the timing to
    say.  My hunch is that a lot are based on end-system load (high loads
    on network cards tend to encourage certain classes of bus errors).
    
    >* What about a stronger TCP checksum?  I don't believe that anyone ever
    >actually employed RFC 1146 (TCP Alternate Checksum).  Has there been any
    >recent thinking about actually improving TCP's end-to-end data integrity
    >checking?  I suppose that the existing middle box infrastructure won't
    >allow for this. However, I'm considering submitting a draft and starting
    >to push for a stronger TCP checksum or CRC, but I would like to get
    >feedback from all of you on the technical feasibility and possible
    >acceptance of this proposal before taking it to the public forums.
    
    That's a tricky question and not one I've thought enough about to have
    a strong view -- except to say that it isn't all the checksum's fault --
    Jonathan's done work showing that putting the checksum at the end
    dramatically improves its efficacy in certain situations.
    
    >* What about a stronger SCTP checksum? SCTP currently uses the Adler-32
    >checksum. Perhaps an optional stronger CRC-32 should be defined for SCTP.
    >Has there been any thinking in this direction? Again, I am considering
    >pursuing this with the SCTP folks but would appreciate any feedback you
    >have to offer first.
    
    There's considerable reason to believe that Adler-32 is a lousy checksum --
    it uses more bits (32 vs. 16) and will probably detect fewer errors than
    either Fletcher-16 or the TCP checksum.  If one is going to invest energy
    in fixing a checksum, fixing SCTP's checksum is probably the first priority.
    
    Craig
    
    ----------------------------------------------------------
    Vern Paxson writes:
    
    Just a couple of follow-on points:
    
    	- I suspect the Internet survives because the bulk of the
    	  traffic isn't all that critical (Web pages, particularly
    	  large items like images and perhaps video clips), so when
    	  they're corrupted, nothing breaks in a major way.
    
    	- One point to consider is that if you use IPSEC, then you
    	  get very strong protection from its integrity guarantees
    
    - Vern
    
    ----------------------------------------------------------
    Jonathan Stone writes:
    
    In message <200104092135.f39LZVS21092@daffy.ee.lbl.gov>Vern Paxson writes
    
    Jim,
    
    Craig's answer is an excellent summary.  Fixing the SCTP checksum is a
    priority; I have a chapter in my thesis addressing some of the issues
    there, and I believe Craig and I plan to write a paper.
    Stronger TCP checksums are an interesting idea, but the lag on
    deploying new TCP features seems to be 5 years or more.
    
    
    If I re-did the study, I would try to log headers of ``good'' packets
    (those where the checksum matches) as well as the entire packets with
    checksum errors.  With data only on the `bad' packets (a restriction
    partly imposed to meet privacy concerns, which were a big hurdle for us),
    it's hard to give good answers to your questions about the time or path
    dependency of bad checksums.
    
    One thing we can say is that there appear to be certain `bad' hosts
    which emit very high rates of packets with incorrect checksums.  One
    host we noticed in the Stanford trace sent 2 packets every second for
    around three hours, totalling about 20% of the total packets in that
    trace.  We can't say whether that's a host problem or a path problem;
    but either way that error rate would worry me, if I were using iSCSI
    on that host (or path).  That said, we did find what looks to be a
    router with a bad memory bit, which is clearly path-dependent
    (though hitting that particular memory word may be time- and load-dependent
    as well).
    
    Further, some of the bad checksums appear to be due to software
    bugs in specific OS releases.  As the Internet evolves (old OS revs
    replaced by newer ones), the rate of those specific errors will
    evolve over time.
    
    
    Last, the DORM trace in our SIGCOMM 2000 paper is the only trace where
    our monitoring point was directly adjacent to end-hosts, with no
    intervening IP routers.  On that Ethernet segment, I saw about 1 in
    80,000 packets with a valid frame CRC, but a bad IP checksum --
    often a deletion of a 16-bit word from the IP address.  I tried
    to `repair' some  IP headers (e.g., by inserting what seemed
    to be a missing 16 bits of an IP source address).
    After the repair, the IP checksum seemed to be correct.
    
    That points to a problem occurring after the sending IP layer computes
    its IP header checksum, but before the frame-level CRC is computed:
    that suggests an error either in the NIC device driver, in its DMA
    engine, or somewhere on the NIC itself (dropped from a FIFO?).
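
    The 'repair' test described above boils down to recomputing the standard
    16-bit ones-complement Internet checksum (RFC 1071) over the candidate
    header and seeing whether it verifies.  A small Python sketch of that
    arithmetic follows; the sample header bytes are made up for illustration,
    and only the checksum math reflects the real algorithm.

    import struct

    def internet_checksum(data: bytes) -> int:
        """RFC 1071 ones-complement sum of 16-bit words."""
        if len(data) % 2:
            data += b"\x00"                       # pad odd-length input
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        while total >> 16:                        # fold carries back in
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def header_checksum_ok(header: bytes) -> bool:
        """A valid IPv4 header (checksum field included) sums to zero."""
        return internet_checksum(header) == 0

    if __name__ == "__main__":
        # Build a minimal 20-byte IPv4 header with a zeroed checksum field,
        # then fill the checksum in; verifying a 'repaired' header is just
        # header_checksum_ok() on the candidate bytes.
        hdr = bytearray(struct.pack("!BBHHHBBH4s4s",
                                    0x45, 0, 20, 0, 0, 64, 6, 0,
                                    bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])))
        struct.pack_into("!H", hdr, 10, internet_checksum(bytes(hdr)))
        assert header_checksum_ok(bytes(hdr))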
    
    That led Craig and me to suggest that where possible, checksums be
    computed early and verified late.  That's a caution about a potential
    downside of outboard checksumming: it covers less of the path to an
    application buffer than is covered by software checksums.  Software
    checksums can catch errors which occur between the driver and the NIC
    (bus errors, DMA, FIFO overruns, what-have-you); but outboard hardware
    checksums do not.  (IEN-45 mentions this issue, but the two-pass
    scheme suggested there doubles bus utilization for a given I/O rate.)
    
    Putting middle boxes in the path just exacerbates this.
    
    Whether or not to use outboard checksumming is entirely up to
    NIC designers. We merely raise the issue that it checks less of
    the datapath than is covered by software checksums.
    
    
    >Just a couple of follow-on points:
    >
    >	- I suspect the Internet survives because the bulk of the
    >	  traffic isn't all that critical (Web pages, particularly
    >	  large items like images and perhaps video clips), so when
    >	  they're corrupted, nothing breaks in a major way.
    >
    >	- One point to consider is that if you use IPSEC, then you
    >	  get very strong protection from its integrity guarantees
    
    For a software IPSEC implementation.
    
    A hardware implementation with outboard crypto hardware could
    potentially fall foul of the same kinds of local-to-source-host
    errors (DMA errors or whatever the ultimate cause is) which our data
    indicates some NICs suffer from.  If the data has already been
    curdled by the time the encrypting accelerator sees it,
    I don't see how IPsec actually enhances integrity.
    
    ----------------------------------------------------------
    Vern Paxson writes:
    
    > >	- One point to consider is that if you use IPSEC, then you
    > >	  get very strong protection from its integrity guarantees
    > 
    > For a software IPSEC implementation.
    
    Good point!
    
    		Vern
    
    ----------------------------------------------------------
    Craig Partridge writes:
    
    I'd like to clarify my statement to Jim about Adler-32 to be a bit more
    clear about what the issues are.  I've also added Mark Adler to the cc list
    as I'd promised Mark to get him results when we had them, and while the
    results aren't quite cooked, if Jim is going to circulate a discussion,
    Mark should see it first.
    
    If you're designing a checksum, there are certain features you'd like it
    to have.  Here's a starting point of a formal definition.  Given the set
    V of all possible bit vectors and a checksum function C(), what we'd like
    is:
    
    	prob(C(v1) == C(v2)) is 1/2**(sizeof C())
    
    that is, given any two different elements v1 and v2 of V, the chance that
    their checksums will collide is the best possible, namely 1 over 2 raised
    to the power of the bit width of the result of C().
    
    Three sub points:
    
        1. This is not quite the same as what the cryptographic checksum folks
        want.  They actually want it to be very hard [for some computational
        approximation of hard], given C(v1), to find a v2 such that
        C(v1) == C(v2).  For network checksums, we don't care, as we're
        protecting from errors, not attacks.

        2. If we do not pick v1 and v2 at random, but according to some
        distribution rule of likely packet sizes, packet contents, etc., we'd
        still like the equation to be true.  We don't want to be vulnerable to
        certain traffic patterns, etc.
    
        3. You can compare the effectiveness of checksums by how close they
        come to this ideal -- that is, how effectively do they use their
        range of values?
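
    Point 3 can be read operationally: draw many random inputs, count the
    pairs that collide under a checksum, and compare with the ideal
    n*(n-1)/2 / 2**width.  Below is a toy Python sketch of that comparison
    for a 16-bit ones-complement sum (the TCP checksum family).  It is
    illustrative only -- sub-point 2 is exactly the warning that real
    traffic is not uniformly random, as this toy assumes.

    import random
    import struct
    from collections import Counter

    def internet_checksum16(data: bytes) -> int:
        """16-bit ones-complement sum, i.e. the TCP/IP checksum family."""
        if len(data) % 2:
            data += b"\x00"
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def observed_collisions(checksum, n: int = 2000, length: int = 64) -> int:
        """Count colliding pairs among n random byte strings."""
        rng = random.Random(0)                # Random.randbytes needs Python 3.9+
        counts = Counter(checksum(rng.randbytes(length)) for _ in range(n))
        return sum(c * (c - 1) // 2 for c in counts.values())

    if __name__ == "__main__":
        n = 2000
        print("ideal    :", n * (n - 1) / 2 / 2 ** 16)   # about 30.5 pairs
        print("observed :", observed_collisions(internet_checksum16, n))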
    
    OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
    improve the performance of the Fletcher checksum by summing modulo a prime
    number, rather than 255 or 256.
    
    However, it sums bytes (8-bit quantities) into 16-bit fields.  As a result,
    the high bits of the 16-bit fields take some time to fill (they only get
    filled by propagating carries from lower bits) and until the packet
    is quite big (thousands of bytes) you don't get enough mixing in the high
    bits.  Two problems: (a) we're not fully using the 16-bit width, so for
    smaller packets the chance of collision is much greater than 1/2**sizeof(C)
    simply because some bits in the checksum are always (or with very
    high probability) set to 0; and (b) it looks like (and Jonathan's still
    working on this) the law of large numbers will cause the values to
    cluster still further [I think of this behavior as the result of looking
    at all the bits instead of just the low-order bits mod a prime: we're
    vulnerable to the fact that the sums are not evenly distributed, by
    the law of large numbers].
    
    We're still working on whether the core idea behind Adler-32 (namely working
    modulo a prime) is as powerful as it seems, but it is clear that to have
    a hope of making it comparable to the TCP checksum or Fletcher, you have to
    sum 16-bit quantities into 16-bit fields.
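
    To make the two sums in that description concrete, here is a
    from-the-definition Adler-32 sketch in Python (per RFC 1950): s1 is the
    commutative byte sum and s2 the running, position-dependent sum, both
    modulo the prime 65521.  The quick experiment at the end illustrates the
    short-input problem: for a 40-byte input s1 can never exceed
    1 + 255*40 = 10201, so the top bits of that half of the checksum are
    never used.

    import os
    import zlib

    MOD_ADLER = 65521              # largest prime below 2**16

    def adler32(data: bytes) -> int:
        s1, s2 = 1, 0
        for byte in data:
            s1 = (s1 + byte) % MOD_ADLER    # commutative sum of bytes
            s2 = (s2 + s1) % MOD_ADLER      # running sum -> position-dependent
        return (s2 << 16) | s1

    if __name__ == "__main__":
        for n in (0, 1, 40, 1480):
            buf = os.urandom(n)
            assert adler32(buf) == zlib.adler32(buf)

        # Short-input bias: over 10,000 random 40-byte inputs, the top two
        # bits of the s1 half are always zero (s1 <= 10201 < 2**14).
        top_bits = {adler32(os.urandom(40)) & 0xC000 for _ in range(10_000)}
        print("top two bits of s1 over random 40-byte inputs:", top_bits)  # {0}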
    
    Craig
    
    ----------------------------------------------------------
    Jonathan writes:
    
    In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
    
    >I'd like to clarify my statement to Jim about Adler-32 to be a bit more
    >clear about what the issues are.  I've also added Mark Adler to the cc
    >list as I'd promised Mark to get him results when we had them, and while
    >the results aren't quite cooked, if Jim is going to circulate a
    >discussion, Mark should see it first.
    >
    >If you're designing a checksum, there are certain features you'd like it
    >to have.  Here's a starting point of a formal definition.  Given the set
    >V of all possible bit vectors and a checksum function C(), what we'd like
    >is:
    >
    >	prob(C(v1) == C(v2)) is 1/2**(sizeof C())
    >
    >that is given any two v1, v2 being different elements of V, the chance that
    >their checksum will collide is the best possible, namely 1 over 2 raised to
    >the power of the bitwidth of the result of C().
    
    Jim, Craig: 
    
    Just to be picky, as I'm working right now on definitions of some of
    these issues for my thesis:
    
    One can list other desirable properties, like wanting each bit of the
    checksum field to have informational entropy 1/2.  Craig's aggregate
    definition falls out from that, with a few extra assumptions.
    
    Also, less formally, desiring each bit of the input data to contribute
    equally to flipping the final state of each output bit.  That is where
    Adler-32 runs into trouble when given short inputs.
    
    
    >Three sub points:
    >
    >    1. This is not quite the same as what the cryptographic checksum folks
    >    want.  They actually want it to be very hard [for some computational
    >    approximation of hard], given C(v1), to find a v2 such that
    >    C(v1) == C(v2).  For network checksums, we don't care, as we're
    >    protecting from errors, not attacks.
    
    There are two versions of formal cryptographic invertibility; the
    other criterion is that it be computationally intractable to find *any*
    v1 and v2 such that C(v1) = C(v2).  Crypto folks would generally like
    both.
    
    
    >    2. If we do not pick v1 and v2 at random, but according to some
    >    distribution rule of likely packet sizes, packet contents, etc., we'd
    >    still like the equation to be true.  We don't want to be vulnerable
    >    to certain traffic patterns, etc.
    >
    >    3. You can compare the effectiveness of checksums by how close they
    >    come to this ideal -- that is, how effectively do they use their
    >    range of values?
    >
    >OK, so let's talk about Adler-32.  Adler-32 is a neat idea -- it seeks to
    >improve the performance of the Fletcher checksum by summing modulo a prime
    >number, rather than 255 or 256.
    >
    >However, it sums bytes (8-bit quantities) into 16-bit fields.  As a result,
    >the high bits of the 16-bit fields take some time to fill (they only get
    >filled by propogating carries from lower bits) and until the packet
    >is quite big (thousands of bytes) you don't get enough mixing in the high
    >bits.  Two problems: (a) we're not fully using the 16-bit width, so for
    >smaller packets the chance of collision is much greater than 1/2**sizeof(C)
    >simply because some bits in the checksum are always (or with very
    >high probability) set to 0; and (b) it looks like (and Jonathan's still
    >working on this) that the law of large numbers will cause the values to
    >cluster still further [I think of this behavior as the result of just
    >looking at all the bits instead of just the low order bits mod a prime,
    >we're vulnerable to the fact that the sums are not evenly distributed, by
    >the law of large numbers]
    
    To be fair, and to give the whole picture, there are a couple of
    points here that should be expanded.
    
    The first is that for large input data lengths, we can show that the
    distribution of both 16-bit halves of the Adler-32 sum should actually
    be well distributed.  That holds true for addition mod M of any
    repeated independent observations of a random variable.  (A proof
    appears in the appendix of the 1998 ToN paper you cited already,
    although that version of the proof may not formally state
    all the necessary assumptions about independent observations.)
    
    However, for networking in general and SCTP in particular, there are
    fairly modest hard upper bounds on the maximum input length.
    SCTP forbids fragmentation.  For the increasingly pervasive Ethernet
    frame length of 1500 bytes, that means SCTP checksums have no more
    than 1480 input bytes.
    
    The data we have -- and I am still working on it -- say that's too
    short to get good coverage of either of the two sums in SCTP -- the
    purely additive commutative sum, or the higher-order running sum of the
    commutative sum (which is position-dependent).
    
    Craig's description gives a good intuition for the first, commutative
    sum. For the position-dependent sum (the running sum of the first sum),
    another good intuition for the computational modelling I've done is to
    imagine a hash-table where we hash, independently, all possible values
    at all possible offsets in a 1480-byte SCTP packet.  The intuition
    isn't so much "law of large numbers" as that SCTP draws its per-byte
    values from one particular corner of a two-dimensional space (the
    hash-table vs. all possible bytes); so it ends up with an uneven
    coverage of the space of all hash values.
    
    
    >We're still working on whether the core idea behind Adler-32 (namely
    >working modulo a prime) is as powerful as it seems, but it is clear that
    >to have a hope of making it comparable to the TCP checksum or Fletcher,
    >you have to sum 16-bit quantities into 16-bit fields.
    
    Just so the reasoning is clear, another alternative is to sum 8-bit
    inputs into 8-bit accumulators, modulo 251.  Given 32 bits of
    available checksum field, the 16-bit sums are preferable.
    
    Again, this is for short inputs.  For large inputs of several tens of
    Kbytes, the Adler-32 sum should do much better (at 64 Kbytes it should
    give very uniform coverage).  At those input sizes, the comparison comes
    down to Adler-32 having 15 pairs of values which are congruent, mod
    65521, whereas a Fletcher sum with 16-bit inputs would have only one;
    versus better `stirring' from a prime modulus.
    
    I don't know what the distribution of sizes of zlib-compressed files
    is.  If they are generally large, then our work may not be applicable
    to Adler-32 in its originally designed purpose.
    
    
    I also don't know the history of why SCTP chose the Adler-32 sum
    rather than, say, CRC-32. The gossip I hear from an IETF-going friend in
    the Bay Area is that there was concern about the performance of
    CRC-32; a direct bit-by-bit shift-and-add was seen as too slow.  I
    hear there was also a supposition that an efficient table-lookup
    version would require very large tables (128 kbits?) and that
    tables that large were prohibitive for PDAs and small handheld devices.
    
    I am *not* suggesting that is an accurate report; it probably isn't.
    But if there's any grain of truth in it, both four-bit and eight-bit
    table-lookup algorithms for CRC32 exist.  Table lookup size need not
    be an issue.  Perhaps we should draw the SCTP authors -- C. Sharp? --
    into this discussion as well.
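
    For what it's worth, the byte-at-a-time table-lookup CRC-32 is compact:
    256 entries of 32 bits is 1 KiB (8 kbits), and the four-bit variant needs
    only 16 entries.  Here is a self-contained Python sketch of the usual
    reflected IEEE 802.3 polynomial (the same one zlib uses; whether iSCSI or
    SCTP would adopt that particular polynomial is a separate question).

    import zlib

    POLY = 0xEDB88320                       # reflected IEEE 802.3 polynomial

    TABLE = []
    for i in range(256):                    # build the 256-entry table once
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        TABLE.append(crc)

    def crc32(data: bytes) -> int:
        crc = 0xFFFFFFFF
        for byte in data:
            crc = TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
        return crc ^ 0xFFFFFFFF

    if __name__ == "__main__":
        msg = b"123456789"
        assert crc32(msg) == zlib.crc32(msg) == 0xCBF43926   # standard check value
        print("table size:", len(TABLE) * 4, "bytes")        # 1024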
    
    ----------------------------------------------------------
    Jonathan writes:
    
    A small terminological correction here:
    
    In message <200104101505.f3AF5BZ23145@aland.bbn.com>Craig Partridge writes
    >
    
    [...]
    
    >
    >so for
    >smaller packets the chance of collision is much greater than 1/2**sizeof(C)
    >simply because some bits in the checksum are always (or with very
    >high probability) set to 0; and (b) it looks like (and Jonathan's still
    >working on this) that the law of large numbers will cause the values to
    >cluster still further [I think of this behavior as the result of just
    >looking at all the bits instead of just the low order bits mod a prime,
    >we're vulnerable to the fact that the sums are not evenly distributed, by
    >the law of large numbers]
    
    I think it's actually a central limit theorem, not the law of large
    numbers.  For addition modulo M, the "central limit theorem" says that
    summation of many independent identically-distributed random variables
    will tend to a uniform distribution, not a normal distribution
    (as the weighted sum does).
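
    A toy numeric illustration of this point (an exact convolution, not a
    proof, and with the simplifying assumption of uniformly random
    independent bytes): the distribution of a sum of i.i.d. bytes reduced
    modulo M flattens toward uniform as the number of terms grows, even
    though the unreduced sum piles up into a bell curve.

    M = 251                                   # a prime modulus, as in Adler-32

    # exact distribution of a single uniform byte, reduced mod M
    single = [0.0] * M
    for b in range(256):
        single[b % M] += 1 / 256

    def add_one_term(dist):
        """Distribution of (X + B) mod M for B a fresh uniform byte."""
        out = [0.0] * M
        for r, p in enumerate(dist):
            for s, q in enumerate(single):
                out[(r + s) % M] += p * q
        return out

    def tv_from_uniform(dist):
        """Total-variation distance between dist and the uniform distribution."""
        return 0.5 * sum(abs(p - 1 / M) for p in dist)

    if __name__ == "__main__":
        dist, terms = single[:], 1
        for target in (1, 2, 4, 8, 16):
            while terms < target:
                dist = add_one_term(dist)
                terms += 1
            print(f"{terms:2d} terms: distance from uniform = {tv_from_uniform(dist):.1e}")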
    
    ----------------------------------------------------------
    Jonathan writes:
    
    Jim,
    
    I forwarded my previous message to Chip Sharp.  (Is the RFC 2960
    address, chsharp@cisco.com, still valid?)
    
    He may wish to comment on the SCTP issues as well.
    
    
    ----------------------------------------------------------
    ----------------------------------------------------------
    Links to papers:
    
    End-to-End Internet Packet Dynamics, Vern Paxson
    <http://citeseer.nj.nec.com/cache/papers2/cs/11598/ftp:zSzzSzftp.ee.lbl.govzSzpaperszSzvp-pkt-dyn-ton99.pdf/paxson97endtoend.pdf>

    When the CRC and TCP Checksum Disagree, Jonathan Stone, Craig Partridge
    <http://www.acm.org/sigcomm/sigcomm2000/conf/paper/sigcomm2000-9-1.pdf>

    Performance of Checksums and CRCs over Real Data, Jonathan Stone, Michael
    Greenwald, Craig Partridge, Jim Hughes
    <http://citeseer.nj.nec.com/cache/papers2/cs/1909/ftp:zSzzSzftp.dsg.stanford.eduzSzpubzSzpaperszSzsplice-paper.pdf/performance-of-checksums-and.pdf>
    
    ----------------------------------------------------------
    
    
    

