
    Re: iSCSI: Markers



    I'm new to this list, so I should introduce myself.
    
    My name is Stuart Cheshire; I'm the author of Consistent Overhead Byte 
    Stuffing (COBS), the framing technique from which COWS derives. I'm not 
    working on any iSCSI product, but if COBS can contribute to iSCSI, then 
    I'm happy to offer a little of my time, as much as I can spare, to help 
    clarify what COWS does and does not do, to help people make an informed 
    decision whether or not COWS is the right solution for iSCSI.
    
    ----
    
    Assumption: A high-performance receiver is harder than a high-performance 
    sender.
    
    This is because the sender is in control. It knows where the data is 
    coming from in memory, and where it is going to on the network. The 
    sending host knows and can control all aspects of the communication: what 
    order iSCSI messages are delivered onto the wire, how big each one is, 
    and at what time they are sent. If the sender wants to do some kind of 
    housekeeping that prevents it from sending packets for a few 
    milliseconds, then it has the option of doing that without terrible 
    consequences.
    
    The receiver has a much harder time. It never knows what packet is going 
    to arrive next, or how big it will be, or where it will be from, or where 
    it will have to go to in memory. Packet loss/corruption/reordering makes 
    things even more unpredictable. A receiver doesn't have the luxury of 
    being able to not receive packets for a few milliseconds if it is busy 
    with something else.
    
    For this reason, it makes sense to see what the sender can do to make the 
    receiver's life a little easier. If the receiver could receive each TCP 
    segment and process it in isolation, determining where to place it in 
    memory solely from information within that TCP segment, without reference 
    to data from other TCP segments (which may not have arrived yet), then it 
    would be easier to make a high-performance receiver.
    
    What can we do to enable independent segment processing and idempotent 
    direct data placement at the receiver?
    
    My first choice would be to add a couple of extra bits to the TCP header; 
    a "start of message" bit and an "end of message" bit. The "start of 
    message" bit indicates that the first byte of TCP data in the segment is 
    also the first byte of an iSCSI message; the "end of message" bit 
    indicates that the last byte of TCP data in the segment is also the last 
    byte of an iSCSI message. When a receiver receives a TCP segment with 
    both bits set, it knows with certainty that it has one (or more) complete 
    iSCSI messages in the TCP segment and can immediately decode enough of 
    the iSCSI message header(s) to determine where in memory to place the 
    data.
    
    Unfortunately, adding extra bits to the TCP header is not viable. From a 
    political point of view, trying to change the TCP on-the-wire protocol is 
    a non-starter. From a practical point of view, there are too many routers 
    and firewalls and similar devices that will throw away TCP packets with 
    bits they don't understand.
    
    Given that out-of-band framing using header bits is not possible, the 
    alternative is in-band framing using only information in the TCP data 
    stream itself.
    
    If we can design our sender to normally send exactly one iSCSI message 
    per TCP segment, and we have a way for our receiver to reliably verify 
    that the received TCP segment contains exactly one iSCSI message, then 
    the receiver can implement idempotent direct data placement for each TCP 
    segment as it is received, without reference to state from previous TCP 
    segments on that connection (which may not have arrived yet).
    
    The problem left to solve is how the receiver can reliably verify that 
    the received TCP segment contains exactly one iSCSI message. It can do 
    this by checking to see whether the TCP segment data begins with some 
    special marker pattern, as long as it knows that this special marker 
    pattern cannot appear anywhere within the body of valid iSCSI message 
    data. This necessarily entails processing ("stuffing") the body of the 
    iSCSI message to eliminate inadvertent occurrences of the special marker 
    pattern before sending, and then reversing this transformation to restore 
    the original data after reception.
    
    If the receiver finds that the segment does not begin with the special 
    marker pattern, then it knows that the sender segmentation has not been 
    maintained (or it is talking to an old TCP sender that doesn't support 
    sender segmentation) and it has to fall back to treating the TCP data 
    stream as a raw unstructured byte stream, with message boundaries 
    indicated by occurrences of the marker pattern. The important thing 
    is that the receiver still works correctly, even though the performance 
    will be lower.
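    
    As an illustration only, the two receive paths described above can be 
    sketched in Python. The marker value, the 4-byte word size, and both 
    function names are hypothetical and not taken from any COWS 
    specification. The sketch assumes the stream starts word-aligned and 
    that every message is a whole number of words; the last chunk returned 
    by the fallback path may still be an incomplete message.

```python
import struct

WORD = 4                                  # assumed word size in bytes
MARKER = struct.pack(">I", 0xC0575EED)    # hypothetical marker pattern


def starts_with_marker(segment):
    """Fast-path test: does this TCP segment begin on a message boundary?"""
    return segment[:WORD] == MARKER


def split_on_markers(stream):
    """Fallback path: treat the data as a raw byte stream and cut it
    before every word-aligned occurrence of the marker."""
    starts = [i for i in range(0, len(stream) - WORD + 1, WORD)
              if stream[i:i + WORD] == MARKER]
    ends = starts[1:] + [len(stream)]
    return [stream[s:e] for s, e in zip(starts, ends)]
```

    A segment for which starts_with_marker() is false would be appended to 
    the connection's reassembly buffer and handled by split_on_markers() 
    once the missing bytes arrive.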
    
    This prefer-sender-segmentation-but-verify approach is important. If the 
    outgoing data is not processed to guarantee that the special marker 
    pattern cannot occur, then malicious users might be able to subvert the 
    protocol by putting contrived patterns in their data. Remember the days 
    when you could make a user's modem hang up by sending them an email 
    containing the text "+++ATH"? (Apologies to anyone reading this via modem 
    who just had their telephone line hang up.)
    
    Another benefit of using in-band framing like this is that we can deploy 
    it immediately using unmodified TCP stacks. In the future we can use 
    enhanced sender TCP implementations that take steps to maintain segment 
    boundaries, and smart receivers will get a performance boost from that, 
    but it is a compatible upgrade that changes only the implementation, not 
    the on-the-wire protocol.
    
    Of course, we don't get anything for free. If we want the receiver to be 
    able to determine with 100% certainty that it has received a complete 
    iSCSI message in one TCP segment, then the sender will have to do some 
    work to enable that. This is the cost of COWS. It gives 100% framing 
    certainty, but at the cost of checking the outgoing data for inadvertent 
    occurrences of the special marker pattern, and eliminating them. There's 
    no way for a sender to tell whether the outgoing data contains 
    inadvertent occurrences of the special marker pattern if the sender is 
    not willing to look at the data.
    
    On the plus side, the cost of COWS encoding is modest compared to some 
    alternatives. COWS-encoding adds a little header but otherwise doesn't 
    change the size of the outgoing data, ever. No matter how many 
    occurrences of the framing marker pattern are found, the encoded output 
    length is always exactly the same: the length of the input plus the 
    length of the fixed-size framing header (typically two words). If the 
    framing marker pattern is chosen to be something that is rare in normal 
    (non-malicious) data, then in the common case the encoding step will be a 
    read-only operation: scan the data, determine that it contains no framing 
    markers, set the COWS header to indicate that the data contains no 
    framing markers, and send it.
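    
    The properties claimed above (fixed two-word header, unchanged body 
    length, read-only scan in the common case) can be illustrated with a 
    simplified word-stuffing sketch in Python. This is not the actual COWS 
    encoding. It assumes 32-bit big-endian words, a hypothetical marker 
    value, and a header made of the marker followed by the 1-based index of 
    the first stuffed word (0 if there is none). Each stuffed word holds the 
    index of the next one, so the marker never appears in the body as long 
    as the message is shorter than the marker value in words.

```python
import struct

WORD = 4
MARKER = 0xC0575EED   # hypothetical marker word, assumed rare in real data


def cows_like_encode(payload):
    """Replace each marker word by a link to the next occurrence."""
    if len(payload) % WORD:
        raise ValueError("payload must be a whole number of words")
    n = len(payload) // WORD
    words = list(struct.unpack(f">{n}I", payload))
    next_hit = 0                      # 0 means "no further occurrence"
    for i in range(n - 1, -1, -1):    # walk backwards to build the chain
        if words[i] == MARKER:
            words[i] = next_hit
            next_hit = i + 1          # 1-based index of this occurrence
    return struct.pack(f">{n + 2}I", MARKER, next_hit, *words)


def cows_like_decode(frame):
    """Follow the chain of links and restore the original marker words."""
    if len(frame) < 2 * WORD or len(frame) % WORD:
        raise ValueError("bad frame length")
    marker, link, *words = struct.unpack(f">{len(frame) // WORD}I", frame)
    if marker != MARKER:
        raise ValueError("frame does not begin with the marker")
    prev = 0
    while link:
        if link <= prev or link > len(words):
            raise ValueError("corrupt link chain")
        prev = link
        link, words[prev - 1] = words[prev - 1], MARKER
    return struct.pack(f">{len(words)}I", *words)
```

    When the payload contains no marker words, the encoder changes nothing 
    in the body and simply prepends the marker and a zero link.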
    
    In contrast, when using Fixed Interval Markers, if a marker happens to 
    fall in the middle of the data you are sending, then it creates a 'hole' 
    in the middle of data that used to be contiguous, and the block of 
    outgoing data changes size. On the receiving side, the 'hole' created by 
    the marker has to be repaired in the process of transferring the data 
    into memory. When using Fixed Interval Markers, when a receiver gets a 
    TCP segment that contains no marker, it cannot reliably determine what it 
    is supposed to do with that segment (where to put it in memory) without 
    referring to the state from the previous TCP segments of that connection. I 
    don't believe that FIM can provide efficient idempotent direct data 
    placement for inbound TCP segments, because you can't rely on any given 
    received segment containing a marker via which the receiver can verify 
    that the segment contains a complete iSCSI message.
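    
    The 'hole' effect can be shown with a toy Python model of fixed-interval 
    marking. The interval, the marker length, and the placeholder marker 
    contents are all hypothetical; a real marker would carry framing 
    information rather than filler bytes. The point of the sketch is that 
    both sender and receiver need the block's offset within the stream, 
    which is state that lives outside the segment itself.

```python
INTERVAL = 16                  # hypothetical: payload bytes between markers
MARKER_LEN = 4                 # hypothetical marker size
PLACEHOLDER = b"\xff" * MARKER_LEN   # stand-in for real marker contents


def insert_fixed_markers(data, stream_offset):
    """Sender side: a marker is inserted wherever the stream position
    reaches a nonzero multiple of INTERVAL, splitting contiguous data."""
    out = bytearray()
    for i, byte in enumerate(data):
        pos = stream_offset + i
        if pos and pos % INTERVAL == 0:
            out += PLACEHOLDER
        out.append(byte)
    return bytes(out)


def remove_fixed_markers(wire, stream_offset):
    """Receiver side: repair the holes. This needs stream_offset, which
    comes from earlier segments on the connection."""
    out = bytearray()
    pos, payload = 0, stream_offset
    while pos < len(wire):
        if payload and payload % INTERVAL == 0:
            pos += MARKER_LEN                 # skip over the marker
        run = INTERVAL - payload % INTERVAL   # bytes until the next marker
        chunk = wire[pos:pos + run]
        out += chunk
        pos += run
        payload += len(chunk)
    return bytes(out)
```

    A 20-byte block that starts at stream offset 10 grows to 24 bytes on the 
    wire, with the marker sitting after its sixth byte.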
    
    In summary:
    
    My first choice would be to modify the TCP protocol to support 
    preservation of upper-level message boundaries.
    
    Given that this is not possible, I think COWS provides a good alternative.
    
    Stuart Cheshire <cheshire@apple.com>
     * Wizard Without Portfolio, Apple Computer
     * Chairman, IETF ZEROCONF
     * www.stuartcheshire.org
    
    
    


Last updated: Sat Jan 12 18:17:54 2002