
    RE: iSCSI Data Integrity - Digests



    Michael,
    
    Your highest and second-highest preferences represent a change to the TCP
    protocol, in that they mandate segment alignment of the PDU with the Ethernet
    frame.  The impact of changing TCP into a datagram protocol may not be well
    received.  If such a requirement exists, use SCTP.  If you wish to see two
    frame checks, one for the iSCSI header and one for the SCSI payload, I
    would suggest that the variables used as handshakes within the iSCSI
    header could use an even simpler scheme than a 16-bit CRC: a 32-bit XOR.
    If done in software, both the TCP checksum and this header checksum will
    require updating.  A single CRC field will represent a time constraint in
    that a pipeline delay of iSCSI feedback will exist. Allowing these
    calculations to take place prior to being placed for delivery within an
    interrupt routine is the advantage of allowing a simple feedback update.
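
To make the "simple feedback update" idea concrete, here is a minimal Python sketch of a 32-bit XOR check over a header, with an incremental update when one aligned word changes. The 48-byte header size, the word offset, and the function names are assumptions made for illustration only; nothing here is taken from an iSCSI draft.

```python
import struct

def xor32(header: bytes) -> int:
    """XOR together all big-endian 32-bit words of the header."""
    if len(header) % 4:
        raise ValueError("header length must be a multiple of 4 bytes")
    check = 0
    for (word,) in struct.iter_unpack(">I", header):
        check ^= word
    return check

def xor32_update(check: int, old_word: int, new_word: int) -> int:
    """Adjust an existing check after one aligned 32-bit word changes.

    XOR is its own inverse, so the old word is removed and the new word
    added without touching the rest of the header.
    """
    return check ^ old_word ^ new_word

# Hypothetical 48-byte header with a "handshake" variable at offset 16.
header = bytearray(48)
header[16:20] = struct.pack(">I", 7)
check = xor32(bytes(header))

# Late update of the variable: only the check word needs adjusting.
header[16:20] = struct.pack(">I", 8)
check = xor32_update(check, 7, 8)
assert check == xor32(bytes(header))
```

The update costs two XOR operations regardless of header length, which is the property that lets it be applied late. A CRC over the same bytes would normally be recomputed in full. The trade-off is that a plain XOR is a much weaker check than a CRC.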
    
    Doug
    
    
    The following describes three alternatives for iSCSI data integrity.
    
    Mike Krause
    HP
    
    Requirements:
    Use existing / proven CRC algorithms and techniques to provide fast market
    enablement while avoiding a "reinvention of the wheel" exercise.
    Provide strong end-to-end data integrity for both iSCSI PDU header and data
    payload.
    CRC is required for all implementations,  i.e. strong end-to-end data
    integrity is not an option as customers will not adopt solutions without
    such guarantees.
    Alternative 1 (Highest preference):
    An iSCSI PDU shall be restricted to a single TCP segment.  Multiple iSCSI
    PDU may be present within the same TCP segment but none shall span multiple
    segments.
    Each iSCSI PDU is protected by a trailing 32-bit CRC (Ethernet polynomial),
    i.e. a single CRC covers the entire iSCSI header and data.
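
As an illustration of a single trailing CRC covering header and data, the following Python sketch uses `zlib.crc32`, which implements the same CRC-32 polynomial as Ethernet. The function names and the big-endian placement of the CRC are assumptions for the example; the proposal does not specify byte order.

```python
import struct
import zlib

def seal_pdu(header: bytes, data: bytes) -> bytes:
    """Append a 32-bit CRC computed over the whole header and data."""
    body = header + data
    return body + struct.pack(">I", zlib.crc32(body))

def open_pdu(pdu: bytes) -> bytes:
    """Verify the trailing CRC and return header + data, or raise."""
    if len(pdu) < 4:
        raise ValueError("PDU too short to carry a CRC")
    body, trailer = pdu[:-4], pdu[-4:]
    (expected,) = struct.unpack(">I", trailer)
    if zlib.crc32(body) != expected:
        raise ValueError("PDU CRC mismatch")
    return body

pdu = seal_pdu(b"hypothetical-header", b"scsi payload")
assert open_pdu(pdu) == b"hypothetical-header" + b"scsi payload"
```

Because the CRC trails the PDU, a sender can compute it while streaming the bytes out and a receiver can check it while streaming them in, without buffering the PDU first.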
    
    Assumption:
    The iSCSI PDU header is not modified during transmission.  While there has
    been some discussion of a desire to provide such capabilities in the future,
    there are no current requirements that would justify the specification
    taking this into account at this time. Should this become a requirement in
    the future, an intermediate endnode supporting iSCSI header modification
    would need to guarantee strong data integrity within its implementation
    using any of the well-known / deployed techniques.
    
    Benefits:
    Strong end-to-end data integrity using a well-known, proven technology.
    Low-cost, high-speed hardware implementations with readily available
    hardware cores can be created with minimal design complexity.
    Only one CRC is required, and it can be implemented in software, mitigating
    the performance impact that iSCSI data integrity would impose.
    
    Ability to accelerate software iSCSI implementations using a slightly
    modified NIC to perform the CRC calculation / verification for both inbound
    and outbound data streams.  This modified NIC would only require minor
    understanding of the iSCSI header, i.e. to identify it and locate the CRC
    within the data stream. The CRC can be verified while coming in off the
    "wire" or inserted while being placed on the "wire".   This technique is
    well understood since it is very similar to what is implemented by TCP
    checksum off-load implementations in use today.
    
    Note: A NIC implementing this functionality could combine the verification
    of the TCP checksum into a "one-stop" verification operation and silently
    drop invalid packets or tag them as "bad" for ULP processing.
    
    Solves the framing problem while eliminating the need for future support of
    "chunking" / RDMA technology.  Each PDU header contains sufficient
    information required for direct data placement providing the same benefits
    attributed to chunking / RDMA.  This will also allow simplified "bridge"
    solutions to be constructed, e.g. iSCSI-to-InfiniBand, iSCSI-to-SRP, etc.
    Eliminates the need to maintain intermediate CRC results (both inbound and
    outbound) reducing implementation cost / complexity.
    
    Eliminates bandwidth waste by reducing the number of bytes required to
    guarantee end-to-end data integrity while supporting multiple small PDU per
    segment (compaction).
    Provides improved QoS arbitration control / management - if a PDU were
    allowed to span multiple segments, then an implementation would need to
    transmit segments back-to-back (or very close) to deliver strong end-to-end
    performance / transaction throughput.  This may be implementation-specific
    but is still a tangible benefit for customers.
    
    If an intermediate endnode performs re-segmentation, a PDU may span
    multiple segments.  This would be detected by a PDU CRC error, providing a
    simple detection mechanism that allows implementations to recover either at
    the connection or session level.
    
    Constraints:
    iSCSI implementations must be able to determine each connection's MSS and
    create iSCSI PDU that fit within the MSS.  Such functionality is available
    in a variety of TCP implementations today and for hardware implementations.
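
The MSS constraint can be sketched as a simple packing step: PDUs are placed into segments in order, several per segment where they fit, and a new segment is started whenever the next PDU would cross the boundary. This Python sketch is only a model of the rule; `pack_segments` is a hypothetical helper, and a real sender would obtain the MSS from its TCP implementation.

```python
from typing import Iterable, List

def pack_segments(pdus: Iterable[bytes], mss: int) -> List[bytes]:
    """Group PDUs into segment payloads so that no PDU spans two segments."""
    segments: List[bytes] = []
    current = bytearray()
    for pdu in pdus:
        if len(pdu) > mss:
            # The sender is expected to build PDUs that fit within the MSS.
            raise ValueError("PDU larger than MSS")
        if current and len(current) + len(pdu) > mss:
            # The next PDU would cross the boundary: close this segment.
            segments.append(bytes(current))
            current = bytearray()
        current += pdu
    if current:
        segments.append(bytes(current))
    return segments

# Two small PDUs share a segment; later ones each start a new segment.
sizes = [len(s) for s in pack_segments(
    [b"a" * 600, b"b" * 600, b"c" * 300, b"d" * 1460], mss=1460)]
assert sizes == [1200, 300, 1460]
```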
    
    For the send-side retransmission problem (i.e. how to delineate packets
    within a byte stream), a hardware implementation is straightforward to
    support since it provides the PDU-segment correlation.
    For a software implementation, the mbuf / mblk encompassing the iSCSI PDU
    would be marked to indicate whether the associated buffer should be sent
    within a separate segment or not.  This is not common to any TCP
    implementations to date but is not difficult to implement.  It should also
    be noted that this is an implementation issue, not a TCP protocol issue.
    If a layer 4 intermediate endnode glues together two TCP streams and is not
    iSCSI aware, the send-side retransmission is a problem.  However, it is
    unclear whether this usage model must be transparently supported by iSCSI,
    i.e. such an intermediate endnode should be required to be iSCSI aware.
    This is not unreasonable, as most layer 4 intermediate endnodes are providing
    some value-add service as a function of layer 4; why wouldn't such an
    endnode provide iSCSI value-add and thus be layer 5 aware?
    
    Alternative 2 (middle preference)
    An iSCSI PDU shall be restricted to a single TCP segment.  Multiple iSCSI
    PDU may be present within the same TCP segment but none shall span multiple
    segments.
    
    Each iSCSI PDU is protected by two CRCs - one invariant and one variant.
    The invariant CRC (ICRC) is a 32-bit CRC covering the PDU data and invariant
    header fields (e.g. address).  The variant CRC (VCRC) is either a 16 or
    32-bit CRC that covers the entire PDU header, data, and invariant CRC.  PDU
    layout would be: header, data, ICRC, VCRC.
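
The header, data, ICRC, VCRC layout can be sketched in Python as follows. The position of the variant field (bytes 4 to 7), the choice of a 32-bit VCRC, the masking of variant bytes with 0xFF before computing the ICRC, and all function names are assumptions made for this example; the proposal does not define which header fields are variant.

```python
import struct
import zlib

# Hypothetical layout: bytes 4..7 of the header are the only variant field.
VARIANT_START, VARIANT_END = 4, 8

def icrc(header: bytes, data: bytes) -> int:
    """CRC over data and invariant header fields (variant bytes masked)."""
    masked = bytearray(header)
    masked[VARIANT_START:VARIANT_END] = b"\xff" * (VARIANT_END - VARIANT_START)
    return zlib.crc32(bytes(masked) + data)

def build_pdu(header: bytes, data: bytes) -> bytes:
    """Lay out the PDU as header, data, ICRC, VCRC."""
    if len(header) < VARIANT_END:
        raise ValueError("header too short for the assumed layout")
    body = header + data + struct.pack(">I", icrc(header, data))
    return body + struct.pack(">I", zlib.crc32(body))

def rewrite_variant(pdu: bytes, new_value: bytes) -> bytes:
    """Intermediate endnode: change the variant field, recompute VCRC only."""
    if len(new_value) != VARIANT_END - VARIANT_START:
        raise ValueError("variant field has a fixed width")
    body = bytearray(pdu[:-4])
    body[VARIANT_START:VARIANT_END] = new_value
    return bytes(body) + struct.pack(">I", zlib.crc32(bytes(body)))

def verify_pdu(pdu: bytes, header_len: int) -> bool:
    """Final endnode: check the VCRC, then the end-to-end ICRC."""
    body = pdu[:-4]
    (vcrc,) = struct.unpack(">I", pdu[-4:])
    if zlib.crc32(body) != vcrc:
        return False
    header, data = body[:header_len], body[header_len:-4]
    (stored_icrc,) = struct.unpack(">I", body[-4:])
    return icrc(header, data) == stored_icrc

pdu = build_pdu(bytes(range(16)), b"payload")
assert verify_pdu(rewrite_variant(pdu, b"\xaa\xbb\xcc\xdd"), header_len=16)
```

In this sketch an intermediate endnode never touches the ICRC, so any change to an invariant field or to the data is still caught at the final endnode even though the VCRC has been regenerated along the way.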
    
    Note: This scheme is conceptually the same as what is used in InfiniBand
    providing customers and the industry with a single paradigm and improved
    technology integration for both compute and storage endnodes.
    
    
    Benefits relative to Alternative 1:
    Supports an intermediate endnode updating iSCSI header fields while
    supporting strong end-to-end data integrity of all invariant header fields
    and data.  It is critical that all invariant header fields such as target
    address be protected at all times to avoid silent data corruption / illegal
    memory access since these fields are used to DMA the data into / from target
    memory.
    
    Note: This problem does not exist in IP-based applications today since such
    implementations do not expose addresses across the wire but use look-up
    techniques as a function of the header.  iSCSI implementations may choose to
    use a similar technique but at the cost of increased resources / complexity.
    
    Limits the complexity / overhead required to support a separate header CRC,
    e.g. intermediate byte-stream CRC injection / verification.  This simplifies
    the hardware implementation for a full off-load solution and provides
    the ability to create simplified CRC acceleration as described in
    alternative 1 for software-based iSCSI implementations.
    
    Use of two trailer CRCs does not impact overall end-to-end performance or
    endnode hardware resources.  Implementations are gated more by the memory
    subsystems / cache coherency overheads than by external wire speed
    transmission, i.e. the packet will, in general, arrive before one could
    complete the first few cache line fetch operations.   As such, given the
    single-segment operation, the data can be verified as it comes in off the
    wire and the memory operations initiated with minimal latency (most
    operations will be pipeline operations within a few cycles).
    
    An intermediate endnode can provide data integrity checks while data is
    in-flight and stomp the CRC should it detect an error.  This allows packet
    flow-through to be supported while providing fault isolation and a signal
    for subsequent endnodes to drop invalid packets if they desire.
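
One way to picture CRC stomping is the sketch below, in which an intermediate endnode that detects a mismatch replaces the trailing CRC with the bitwise inverse of the CRC it computed. Using the inverse as the stomp value is an assumption borrowed from common practice in other interconnects; the proposal does not say how a stomped CRC is encoded, and the function names are hypothetical.

```python
import struct
import zlib

def forward(pdu: bytes) -> bytes:
    """Intermediate endnode: pass good PDUs through, stomp bad ones."""
    body, trailer = pdu[:-4], pdu[-4:]
    computed = zlib.crc32(body)
    (received,) = struct.unpack(">I", trailer)
    if received == computed:
        return pdu
    # Assumed convention: a stomped CRC is the inverse of the computed CRC.
    return body + struct.pack(">I", computed ^ 0xFFFFFFFF)

def classify(pdu: bytes) -> str:
    """Downstream endnode: tell good, stomped, and freshly corrupt PDUs apart."""
    body, trailer = pdu[:-4], pdu[-4:]
    computed = zlib.crc32(body)
    (received,) = struct.unpack(">I", trailer)
    if received == computed:
        return "good"
    if received == computed ^ 0xFFFFFFFF:
        return "stomped"
    return "corrupt"

good = b"header+data" + struct.pack(">I", zlib.crc32(b"header+data"))
damaged = b"Header+data" + good[-4:]  # first byte altered in flight
assert classify(forward(good)) == "good"
assert classify(forward(damaged)) == "stomped"
```

Under this assumed convention a "stomped" result tells the downstream endnode that the error was detected upstream of the stomping node, while a "corrupt" result points at the most recent hop, which is the fault isolation the text refers to.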
    
    Constraints:
    Invariant header fields must be identified and included within the ICRC
    calculation adding minor complexity to the overall implementation.
    
    Alternative 3 (least preferred):
    Allow a PDU to span multiple TCP segments.
    Implement two CRCs: a header CRC and a data CRC.
    Do not allow intermediate endnodes to modify the iSCSI header.
    
    Constraints / Disadvantages:
    Increased implementation complexity and overhead.  The header CRC must occur
    following the header requiring injection / removal within the endnodes.
    This complexity is compounded for variable header protocols such as iSCSI
    and is why such a solution has been rejected in other high-speed
    technologies.
    
    Requires intermediate CRC state to be maintained for both inbound and
    outbound requests.
    Increased QoS scheduling complexity for strong end-to-end application
    throughput.
    Does not solve the framing problem, which perhaps necessitates a
    chunking / RDMA solution.  This increases solution complexity and creates
    interoperability / support issues for customers, i.e. options are bad for
    developers and bad for customers.
    
    Severely limits creating high performance iSCSI software-based
    implementations perhaps making them impractical as a general purpose
    implementation.  This will limit the potential market for iSCSI solutions.
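
The intermediate CRC state listed among the constraints above can be illustrated with a running CRC that is carried from one TCP segment to the next. This Python sketch uses the optional starting-value argument of `zlib.crc32`; the class name and the idea of tracking the remaining PDU length per connection are assumptions for the example.

```python
import zlib

class SpanningPduCrc:
    """Per-connection state needed when a PDU may span several segments."""

    def __init__(self, pdu_len: int) -> None:
        self.remaining = pdu_len  # bytes of the current PDU still expected
        self.crc = 0              # running CRC carried between segments

    def feed(self, segment: bytes) -> bytes:
        """Consume this PDU's bytes from a segment; return any leftover."""
        take = min(self.remaining, len(segment))
        self.crc = zlib.crc32(segment[:take], self.crc)
        self.remaining -= take
        return segment[take:]

    @property
    def complete(self) -> bool:
        return self.remaining == 0

pdu = bytes(range(256)) * 10  # a 2560-byte PDU
state = SpanningPduCrc(len(pdu))
leftover = b""
for seg in (pdu[:1000], pdu[1000:2000], pdu[2000:] + b"next"):
    leftover = state.feed(seg)
assert state.complete and leftover == b"next"
assert state.crc == zlib.crc32(pdu)
```

One such object would be needed for every PDU in flight in each direction, and it has to survive between segment arrivals. The single-segment alternatives avoid this bookkeeping because each CRC is started and finished within one segment.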
    
    Note: If an intermediate endnode is allowed to modify the PDU header, then
    there exists a possibility of silent data corruption since the invariant
    portions no longer have end-to-end data integrity.  This will be a major
    issue for customers in terms of their ability to adopt iSCSI across a
    variety of solution spaces, i.e. if there is the potential for silent data
    corruption, then customers will not deploy iSCSI and will turn to
    alternatives that provide stronger end-to-end data integrity.
    
    


Last updated: Tue Sep 04 01:05:38 2001