
    Re: Choice of ESP alg. for IPS/IPSec - 3DES-CBC vs. 3DES-CBC-I



    In our analysis of algorithms, we have been constrained by the transforms 
    existing or under development by IPsec WG. In general, the IPsec WG takes its 
    lead from NIST/ANSI, which looks not only at performance and 
    implementability in hardware and software, but also security and 
    intellectual property issues. To reference a given algorithm in the IPS 
    security draft, we need to be able to reference an IPsec WG transform 
    document, and that in turn tends to be dependent on standardization of the 
    algorithms and modes in question.
    
    Thus, the algorithms that are specified in the draft all correspond to 
    previous or current algorithms under consideration by NIST, for which IPsec 
    transform documents exist or are currently under development. You will 
    notice that these algorithms are not necessarily the optimal ones (at least 
    judged by software performance metrics). For example, AES-OCB has a 
    considerably lower cycles-per-byte cost in software than AES-CTR mode + 
    CBC-MAC with XCBC extensions, though I'm told that they're roughly 
    equivalent in hardware.
    
    Another example of non-optimal algorithm selection occurs in the 
    authentication algorithms, where we have HMAC-SHA1 as a MUST and CBC-MAC 
    with XCBC extensions as a SHOULD. As I am sure you are aware, authentication 
    algorithms such as PMAC are MUCH more efficient to implement in hardware, 
    and algorithms such as UMAC are MUCH more efficient to implement in software 
    than the algorithms that we chose. The problem was that our understanding 
    was that neither PMAC nor UMAC was far enough along in the NIST process.
    
    Ultimately, the algorithms that end up in the final security document will 
    largely be gated by what IPsec transform documents can be standardized in 
    the necessary timeframe. We chose 3DES-CBC and HMAC-SHA1 as MUST implement 
    because they were already widely implemented and IPsec transform documents 
    exist which we can reference, although the performance of both algorithms is 
    less than ideal for 1+ Gbps operation. The argument was that everyone could 
    at least implement these algorithms, warts and all.
    
    Given that 3DES-CBC-I has already been standardized by ANSI, it may be 
    feasible to get an IPsec transform document written and adopted as a work 
    item by IPsec WG. If this can happen, then it would be possible to argue the 
    merits of this algorithm versus the other ones under consideration. Given 
    the prevalence of 3DES-CBC, however, I suspect that the argument would be 
    over whether 3DES-CBC-I would become a MAY or a SHOULD implement, rather 
    than a MUST.
    
    
    
    
    
    >From: "Mukund, Shridhar" <Shridhar_Mukund@adaptec.com>
    >To: ips@ece.cmu.edu
    >CC: "Mukund, Shridhar" <Shridhar_Mukund@adaptec.com>
    >Subject: Choice of ESP alg. for IPS/IPSec - 3DES-CBC vs. 3DES-CBC-I
    >Date: Fri, 30 Nov 2001 18:15:15 -0800
    >
    >
    >Hello,
    >
    >   Re: Choice of ESP alg. in
    >http://www.ietf.org/internet-drafts/draft-ietf-ips-security-06.txt
    >
    >   Question:
    >        As noted, we need an algorithm implementable in hardware at speeds
    >        of up to 10Gbps, as well as being efficient for implementation in
    >        software at speeds of 100Mbps or slower. AES-CTR is an excellent
    >        solution. But then it will take time to get approved and further
    >        time to get "time tested" before being adopted. Even after
    >        adoption of AES-CTR, 3DES-CBC will need to co-exist for many years
    >        to come.
    >
    >        3DES-CBC does not gracefully scale to 10Gbps for two reasons:
    >        1. Frequent rekeying at 10Gbps: This issue is discussed in depth in
    >            the draft. Although very inconvenient, state-of-the-art IKE
    >            stacks (esp. when running on an off-load processor) can deal
    >            with it.
    >        2. Lack of pipeline-ability: The feedback loop dictated by CBC
    >            prohibits pipelined high-speed VLSI implementation of the
    >            3DES-CBC engine.
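
To put a number on the rekeying point above, here is a rough back-of-the-envelope sketch. It assumes a key is retired after about 2**32 blocks, the usual birthday-bound rule of thumb for a 64-bit block cipher in CBC mode. That limit and the function name are assumptions made for illustration; the limit the draft actually recommends is not reproduced here.

```python
# Rough rekey-interval estimate for a 64-bit block cipher in CBC mode.
# ASSUMPTION: a key is retired after about 2**32 blocks (birthday bound
# for 64-bit blocks). The draft's own recommendation may differ.

BLOCK_BYTES = 8              # 3DES block size: 64 bits
BIRTHDAY_BLOCKS = 2 ** 32    # assumed ceiling on blocks encrypted per key


def seconds_until_rekey(line_rate_bps: float,
                        blocks_per_key: int = BIRTHDAY_BLOCKS) -> float:
    """Time to exhaust one key when the link runs at full line rate."""
    bits_per_key = blocks_per_key * BLOCK_BYTES * 8
    return bits_per_key / line_rate_bps


if __name__ == "__main__":
    for rate in (100e6, 1e9, 10e9):
        secs = seconds_until_rekey(rate)
        print(f"{rate / 1e9:5.1f} Gbps: rekey at least every {secs:8.1f} s")
```

Under that assumption a fully loaded 10Gbps connection uses up a key in under half a minute, which is why rekeying load matters at this speed.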
    >
    >        The ANSI standard X9.52-1998, which specifies 3DES-CBC (TCBC), also
    >        specifies an equally standard variant called TCBC-I (say,
    >        3DES-CBC-Interleaved) with the same security properties. The effort
    >        required to enhance existing software and VLSI implementations of
    >        3DES-CBC to 3DES-CBC-I is "minor". 3DES-CBC can be realized simply
    >        thru' a degenerate usage of the 3DES-CBC-I module. On the positive
    >        side, it brings "substantial" savings in multi-gig VLSI
    >        implementation.
    >        Was the candidate ESP algorithm 3DES-CBC-I (superset of 3DES-CBC)
    >        considered for the SHOULD implement option? Eventually something
    >        like AES-CTR will pervade, but for the interim this is indeed a
    >        low-cost option to get to speeds up to 10Gbps.
    >
    >   Comments on the VLSI implementation:
    >        A 3DES (not 3DES-CBC) engine by itself is highly pipeline-able and
    >        can pump 10Gbps even on an FPGA. However, for 3DES-CBC, one has to
    >        wait for 3DES to be completed on a given 64-bit symbol before
    >        commencing 3DES on the next symbol. As a result, the max throughput
    >        of a "single" 3DES-CBC engine is somewhere above 1Gbps, depending
    >        on the process technology.
    >
    >        As usual, there is a brute-force solution to the problem which
    >        requires use of multiple 3DES-CBC units. These engines take up
    >        significant silicon real estate. The implementation complexity is
    >        not just due to the multiplicity of 3DES-CBC units but more so due
    >        to all the "incidental" kitchen-sinks and bath-tubs that get thrown
    >        into the cauldron to support the multiplicity: scheduler, buffers
    >        per engine (think jumbo frames), keeping track of contexts (10Gbps
    >        traffic could all belong to the one connection or multiple
    >        connections), latency, power, ...
    >
    >        3DES-CBC-I partitions the symbol stream into three sub-streams so
    >        that a single engine with three pipeline stages can pump 3X
    >        throughput and hence bring about a 3X reduction in the kitchen-sink
    >        count and complexity.
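
The partitioning described above can be sketched in a few lines. This is only an illustration of the chaining structure, not an implementation of X9.52. `toy_encrypt` is a hash-based placeholder for one 3DES block operation (it is not invertible and not a cipher). The rule that block i belongs to sub-stream i mod 3, with one IV per sub-stream, is an assumption drawn from the description in this message rather than from the standard's text.

```python
import hashlib

BLOCK = 8  # bytes; 64-bit blocks, as in 3DES


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Placeholder for one 3DES block encryption. NOT a real cipher."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_interleaved(key: bytes, ivs: list, blocks: list) -> list:
    """CBC over len(ivs) independent chains.

    ASSUMPTION: block i belongs to chain i % len(ivs). Each chain feeds
    back only into itself, so up to len(ivs) blocks can be in flight in
    a pipelined engine at the same time.
    """
    prev = list(ivs)  # last ciphertext seen on each chain
    out = []
    for i, plain in enumerate(blocks):
        lane = i % len(prev)
        cipher = toy_encrypt(key, xor(plain, prev[lane]))
        prev[lane] = cipher
        out.append(cipher)
    return out


def cbc(key: bytes, iv: bytes, blocks: list) -> list:
    """Plain CBC as the degenerate one-chain case of the interleaved form."""
    return cbc_interleaved(key, [iv], blocks)


if __name__ == "__main__":
    key = b"k" * 24
    ivs = [bytes([n]) * BLOCK for n in range(3)]
    blocks = [bytes([n]) * BLOCK for n in range(7)]
    for c in cbc_interleaved(key, ivs, blocks):
        print(c.hex())
```

With a single IV the same routine reduces to ordinary CBC, which mirrors the remark that 3DES-CBC can be realized as a degenerate use of the 3DES-CBC-I module.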
    >
    >        Furthermore: At the time 3DES-CBC-I was conceived, multi-gig
    >        throughput at the network end-point was probably not anticipated
    >        (my guess). As a result, they stopped at tri-partitioning or
    >        3-levels of interleaving (my guess). After all, it is only the IP
    >        Storage application that is pioneering multi-gig IPSec throughput
    >        at the end point. If we used 8-levels of interleaving we can pump
    >        all 10Gbps of throughput through a single engine using current
    >        process technologies. No kitchen-sinks, no bath-tubs!
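
The engine-count argument can be checked with a small model. The message only says that a single CBC chain runs "somewhere above 1Gbps", so the figure of 1.25 Gbps below is an assumption chosen for illustration, as are the function names. The model also assumes that each independent chain can occupy exactly one pipeline stage.

```python
import math

# ASSUMPTION: one CBC chain sustains 1.25 Gbps; the message only says
# "somewhere above 1Gbps, depending on the process technology".
SINGLE_CHAIN_GBPS = 1.25


def engine_throughput_gbps(interleave: int, pipeline_stages: int,
                           single_chain_gbps: float = SINGLE_CHAIN_GBPS) -> float:
    """Throughput of one pipelined engine.

    Each independent chain keeps one pipeline stage busy, so the speedup
    is capped by whichever is smaller: chains or stages.
    """
    return single_chain_gbps * min(interleave, pipeline_stages)


def engines_needed(target_gbps: float, interleave: int,
                   pipeline_stages: int) -> int:
    """Number of parallel engines required to reach the target rate."""
    per_engine = engine_throughput_gbps(interleave, pipeline_stages)
    return math.ceil(target_gbps / per_engine)


if __name__ == "__main__":
    for levels in (1, 3, 8):
        n = engines_needed(10.0, interleave=levels, pipeline_stages=levels)
        print(f"{levels}-way interleave: {n} engine(s) for 10 Gbps")
```

Under these assumptions plain CBC needs eight engines for 10Gbps, three-way interleaving needs three, and eight-way interleaving needs one, which matches the "single engine" claim as long as the per-chain rate really is at least 1.25 Gbps.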
    >
    >Thoughts, Comments, Concerns ?
    >
    >-Shridhar Mukund
    >
    >
    
    
    
    


Last updated: Mon Dec 03 11:17:39 2001