 
Re: A question on Zero Copy

Hello,

I take the opportunity of this thread to say that the iSCSI protocol needs alignment.

Why? To save a lot of CPU cycles in the "software implementations" when they process iSCSI PDUs arriving from the network. By software implementations I mean the ones that use a legacy Ethernet adapter or a generic TOE (TCP offload engine) adapter. These implementations cannot place the incoming data directly at its final location. The data has to be copied from the anonymous buffers that receive the incoming iSCSI traffic to the final location, and this copy is done by the CPU.

To ease this painful task, iSCSI should let the CPU copy the maximum amount per cycle, which is 8 bytes on current CPUs. Efforts have been made to keep the header size a multiple of 8 bytes; however, the PDU headers and data may still be unaligned. Even if the data transferred is, most of the time, a multiple of the block size and the header size is a multiple of 8 bytes, this does not guarantee alignment. Alignment can be lost because of a command with parameters, sense data, and so on. We actually saw that in prototypes. The problem is that once alignment is lost, the data flow can, in the worst case, stay unaligned forever.

This is an important penalty for:
- the initiators (hosts) using a TOE card or a regular Ethernet card, as the CPU has to copy all the inbound data to the final location.
- a less important penalty is that the headers must be copied too, because fields in the header have to be accessed as multibyte integers, and the CPU requires integers to be aligned on a multiple of the integer size (else panic).

It can impact the targets too, depending on how they are implemented. For example, a target that uses a CPU to copy data from the "recirculation buffer" to the cache will be impacted.
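To illustrate how alignment, once lost, stays lost, here is a minimal sketch. The PDU lengths are hypothetical values chosen for illustration (48-byte headers, 512-byte data blocks, and one 18-byte payload standing in for sense data); they are not taken from the specification or from our prototypes.

```python
def pdu_start_offsets(pdu_lengths):
    """Return the TCP stream offset at which each PDU starts (first PDU at 0)."""
    offsets = []
    position = 0
    for length in pdu_lengths:
        offsets.append(position)
        position += length
    return offsets


# Hypothetical PDU lengths in bytes (header + payload). The third PDU carries
# an 18-byte payload, which is not a multiple of 8.
lengths = [48, 48 + 512, 48 + 18, 48 + 512, 48 + 512]

offsets = pdu_start_offsets(lengths)
misaligned = [offset % 8 != 0 for offset in offsets]

print(offsets)     # start offset of each PDU in the byte stream
print(misaligned)  # True where a PDU does not start on an 8-byte multiple
```

In this sketch every PDU that follows the odd-sized one starts off an 8-byte boundary, even though its own length is a multiple of 8.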
To get the best performance, the copy must use instructions that copy 8 bytes at a time (double word). That takes one eighth of the CPU cycles of a byte-by-byte copy. These instructions require the 8-byte source to be aligned on an address that is a multiple of the double word size (8 bytes). The same holds for the 8-byte destination.

Assume the copy is optimized, that is, it checks the alignment to limit the number of instructions (copy double word if possible, else copy word, else copy halfword, else copy byte). Assume also that the alignment is uniformly distributed. Then the average size that can be copied in one CPU cycle is only:

8*1/8 + 4*1/8 + 2*1/4 + 1*1/2 = 2.5 bytes.

If double word alignment were guaranteed, the size copied in each cycle would be 8 bytes. Hence the performance penalty is that about 3 times more CPU cycles (8/2.5 = 3.2) are needed to copy the data to its final location than are necessary. To this we have to add the penalty of checking the alignment (negligible) and the penalty of copying the headers when they are not aligned.

The destination address alignment is not a problem. The buffers receiving the data (final location) are aligned at least on a double word address. The problem comes from the incoming TCP byte stream, where the iSCSI headers and data are not aligned on an 8-byte multiple.

"8-byte alignment" here means the following: if the first byte of the first PDU received on the connection is numbered 0, then to be "8-byte aligned" is to sit, in the TCP payload, at an offset that is a multiple of 8 bytes from the first byte of payload received on the connection.

If the iSCSI headers and data are 8-byte aligned, the driver/adapter can be programmed so that the beginning of the iSCSI headers and data is DMAed into memory on a double word boundary. Hence the copy can run at 8 bytes per cycle.
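The 2.5-byte average can be checked with a short sketch. It uses the same simplified model as the estimate (the widest copy usable at a given source offset, with all eight offsets modulo 8 equally likely); the function name `widest_copy` is made up for this illustration.

```python
from fractions import Fraction


def widest_copy(offset):
    """Widest copy (in bytes) whose alignment requirement the offset satisfies."""
    for width in (8, 4, 2, 1):
        if offset % width == 0:
            return width


# Each of the 8 possible offsets modulo 8 is assumed equally likely.
average = sum(Fraction(widest_copy(offset), 8) for offset in range(8))

# Cycle penalty relative to a guaranteed 8-byte aligned copy.
penalty = Fraction(8) / average

print(float(average), float(penalty))
```

Under these assumptions the average is 5/2 bytes per copy and the penalty is 16/5, which matches the rough factor of 3.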
To "align", the following rules must be added to the specification:
- the first byte of each iSCSI PDU header must be aligned on a multiple of 8 bytes (starting from the first PDU). This guarantees that the headers are aligned and that most of the data is aligned, because the data PDU and the SCSI response have a fixed-size (48 bytes) header.
- in the case of immediate data, with a command/parameters > 16 bytes, the immediate data may not be aligned. Maybe we need another rule specifying that in this case the data must be aligned on an 8-byte multiple. I am not sure this rule is worth the gain.

The first rule changes nothing else in the specification; only one line stating the rule must be added. The transmitter only needs to pad (with anything) up to the next 8-byte multiple before sending the next PDU. The receiver, at the end of a PDU, only needs to jump to the next 8-byte multiple before interpreting the byte stream.

This modification is cheap and saves a lot of cycles in the "software implementations".

Regards,
Pierre
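A minimal sketch of the first rule, for illustration only. The function names are hypothetical, the PDUs are opaque byte strings, and the receiver is simply handed each PDU length here, whereas a real receiver would take it from the header.

```python
ALIGNMENT = 8


def pad_length(offset, alignment=ALIGNMENT):
    """Number of pad bytes needed to move offset up to the next multiple."""
    return (-offset) % alignment


def build_stream(pdus, alignment=ALIGNMENT):
    """Transmitter side: pad after each PDU so the next one starts aligned."""
    stream = bytearray()
    for pdu in pdus:
        stream += pdu
        stream += bytes(pad_length(len(stream), alignment))  # pad with zeros
    return bytes(stream)


def split_stream(stream, pdu_lengths, alignment=ALIGNMENT):
    """Receiver side: after each PDU, jump to the next aligned offset."""
    pdus, starts, position = [], [], 0
    for length in pdu_lengths:
        starts.append(position)
        pdus.append(stream[position:position + length])
        end = position + length
        position = end + pad_length(end, alignment)
    return pdus, starts


# Hypothetical PDUs of 48, 66 and 560 bytes; 66 is not a multiple of 8.
sent = [bytes([1]) * 48, bytes([2]) * 66, bytes([3]) * 560]
stream = build_stream(sent)
received, starts = split_stream(stream, [len(pdu) for pdu in sent])

print(starts, len(stream))
```

The pad content is arbitrary (zeros here), since the receiver skips it without interpreting it.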
 
Last updated: Tue Sep 04 01:06:10 2001