    RE: Comments to Comments!



    My turn to reply to both Julian and Costa - apologies
    for the delay involved.  I hope folks have a chance
    to read this prior to Adelaide, and I apologize for
    the fact that both the timing (last week of the
    calendar quarter) and distance from here make it
    impossible for me to attend the BOF in Adelaide.
    This email includes input from EMC folks in addition
    to yours truly.
    
    -- DNS:
    
    > Parties who wish to use IP addresses for extra reliability may encode the
    > IP addresses in the target names (e.g. 10.0.4.5/dvd). After all, domain
    > names can represent IP addresses. These type of domain names do not
    > require DNS for resolution.
    
    Keep in mind that the storage side of this is a closed box that may not be
    fully configurable.  Allowing the storage to use hostnames may force the
    initiator to perform host name resolution whether it wants to or not, and
    vice versa.  Host name resolution issues should be confined to hosts, with
    the storage kept out of this.  This objection is primarily about Open Data
    Connection;
    the use of DNS names may be defensible for third party copy.  Moving to
    a combined data/control connection model would go a long way
    towards resolving this.
    
    > It was felt that we should be completely independent of IP addresses
    > because of firewall and IP masquerading issues with setting up new TCP
    > connections. IP addresses /can/ be used, but only in dotted decimal
    > notation.
    
    But this doesn't solve the fundamental problem with NAT/NAPT (of which IP
    masquerade is an example) because there's no way of knowing whether
    the namespaces and conventions for name resolution match on both sides
    of the NAT/NAPT.  For example, if I pass "foo" as an identifier, it may
    resolve to foo.emc.com here and foo.eng.cisco.com there - FQDNs eliminate
    this simple example but cause other problems because DNS need not be a
    globally uniform namespace - in general one is now at the mercy of all
    sorts of peculiar DNS configuration oddities.  Non-DNS resolution
    mechanisms are not a magic cure - different YP/NIS domains and
    out of sync /etc/hosts files are capable of causing problems.  OTOH,
    the use of a combined control/data connection, and avoiding passing
    endpoint addresses in the payload makes this problem vanish, except for
    Third Party Copy, which is a much more involved story that probably
    requires a discussion of ALGs and some serious SHOULD NOTs.
    
    The notion of passing IP addresses as text strings reproduces one of the
    most irritating (in 20/20 hindsight) design mistakes of FTP.  The problem is
    that remapping an IP address may change the length of the text string,
    causing all sorts of complications (e.g., what if the packet is at MTU and
    the string gets longer?).  This won't work in a NAT/NAPT, and makes writing
    an ALG to put in such a box unnecessarily painful.
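
    To make the length problem concrete, here is a minimal Python sketch.  The
    helper names and sample payloads are made up for illustration; nothing here
    is taken from the draft or from a real ALG.  It compares rewriting an
    address carried as text with rewriting one carried as a fixed 4-byte field:

```python
import ipaddress


def rewrite_text_address(payload: bytes, old: str, new: str) -> bytes:
    """Replace a dotted-decimal address embedded as text (hypothetical ALG step)."""
    return payload.replace(old.encode("ascii"), new.encode("ascii"))


def rewrite_binary_address(payload: bytes, old: str, new: str) -> bytes:
    """Replace a packed 4-byte IPv4 address; the field length cannot change.

    A naive byte search is used only to keep the sketch short; a real
    rewriter would know the field offset.
    """
    old_b = ipaddress.IPv4Address(old).packed
    new_b = ipaddress.IPv4Address(new).packed
    return payload.replace(old_b, new_b)


# Text form: the private address is 8 characters, the mapped one is 15.
text_before = b"target=10.0.4.5/dvd"
text_after = rewrite_text_address(text_before, "10.0.4.5", "192.168.100.200")
text_growth = len(text_after) - len(text_before)

# Binary form: two made-up header bytes followed by the packed address.
binary_before = b"\x00\x01" + ipaddress.IPv4Address("10.0.4.5").packed
binary_after = rewrite_binary_address(binary_before, "10.0.4.5", "192.168.100.200")
binary_growth = len(binary_after) - len(binary_before)

print(text_growth, binary_growth)
```

    The text rewrite changes the payload length, so the box in the middle has
    to cope with a segment that no longer fits and with TCP sequence numbers
    that are now off for the rest of the connection.  The fixed-size rewrite
    has neither problem.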
    
    > On the plus side, domain names decouple the iSCSI protocol from the
    > underlying addressing architecture. The potential deployment of IPv6 will
    > not require any changes to the iSCSI protocol.
    
    Good Grief!  Costa can't be serious about this as a reason.  Designing
    a variable length address field that accommodates IPv4 and IPv6 addresses
    is so easy ... and besides, TCP requires no changes for IPv6 and it has
    no clue about host names.
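
    As a rough illustration of how little is involved, here is one possible
    encoding, sketched in Python.  The layout (a one-byte length followed by
    the packed address) and the function names are my own assumptions, not
    anything from the iSCSI draft:

```python
import ipaddress
import struct


def encode_address(addr: str) -> bytes:
    """Encode an IPv4 or IPv6 address as <1-byte length><packed address>."""
    packed = ipaddress.ip_address(addr).packed  # 4 bytes for IPv4, 16 for IPv6
    return struct.pack("!B", len(packed)) + packed


def decode_address(field: bytes):
    """Decode a field produced by encode_address()."""
    (length,) = struct.unpack_from("!B", field, 0)
    if length not in (4, 16):
        raise ValueError("unsupported address length: %d" % length)
    raw = field[1:1 + length]
    if len(raw) != length:
        raise ValueError("truncated address field")
    return ipaddress.ip_address(raw)


v4_field = encode_address("10.0.4.5")
v6_field = encode_address("2001:db8::1")
print(len(v4_field), decode_address(v4_field))
print(len(v6_field), decode_address(v6_field))
```

    The same parser handles both address families, and no name resolution is
    involved at either end.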
    
    -- Parameter negotiation
    
    > An implementation can ignore all free-form text/value pairs and still
    > operate just fine.
    
    Provided that all the defaults are acceptable.  For example, Section 3.7
    says: 
    
         In order to allow write operations without RTT, the initiator and
         target must have agreed to do so by both sending the AllowNoRTT:yes
         key-pair attribute to each other (either during Login or through
         the Text Command/Response mechanism).
    
    In this case, the default (RTT required on write) appears to be correct; the
    bad news is that implementations that want to negotiate it away buy into the
    text processing, in contrast to the small number of bits that are negotiated
    in FCP.  One of the more important defaults that must be accepted is No
    Authentication, but I think the key:value pairing is an ok way to support
    authentication - I'm concerned that it's overkill for the small number of
    bits that SCSI needs to negotiate.
    
    In general, this sort of arbitrary extensibility can be both a virtue and a
    vice because while extensions don't require on-the-wire format changes, they
    do require complicated rules about what key:value pairs are supposed to be
    in which message and how to deal with situations in which some of them are
    missing.
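
    A small Python sketch of the AllowNoRTT case shows where the text
    processing and the missing-key rules come in.  The wire format assumed
    here (one "key:value" pair per line) and the function names are
    illustrative guesses; only the AllowNoRTT key and the both-sides-must-say-
    yes rule come from the Section 3.7 text quoted above:

```python
# Default when a side never mentions the key: RTT is required on writes.
DEFAULTS = {"AllowNoRTT": "no"}


def parse_pairs(text: str) -> dict:
    """Parse 'key:value' lines; malformed lines are silently skipped."""
    pairs = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            pairs[key.strip()] = value.strip()
    return pairs


def writes_without_rtt(from_initiator: str, from_target: str) -> bool:
    """True only if BOTH sides explicitly sent AllowNoRTT:yes."""
    initiator = {**DEFAULTS, **parse_pairs(from_initiator)}
    target = {**DEFAULTS, **parse_pairs(from_target)}
    return initiator["AllowNoRTT"] == "yes" and target["AllowNoRTT"] == "yes"


print(writes_without_rtt("AllowNoRTT:yes", "AllowNoRTT:yes"))
print(writes_without_rtt("AllowNoRTT:yes", "SomeVendorKey:42"))
```

    Even this one boolean needs a parser, a default table, and a rule for the
    case where one side stays silent, which is the overhead being objected to.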
    
    -- CRC
    
    > A CRC at the TCP layer or higher?
    
    Higher.  The read and write data need to be covered by a real CRC.  TCP and
    IP checksums are too weak to be acceptable, and routers strip/regenerate
    layer 2 checksums, leaving no CRC to cover corruption in the router.
    Restricting the CRC to data only avoids any requirement that an ALG
    recalculate it, as CRCs are much more difficult to adjust than the 1's
    complement TCP and IP checksums.
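
    One way to see the weakness, sketched in Python: the 16-bit one's
    complement sum does not depend on the order of the words, so data whose
    16-bit words get swapped passes the TCP/IP-style checksum.  CRC-32 from
    zlib stands in here for "a real CRC"; which polynomial iSCSI should use is
    a separate question, and the sample bytes are arbitrary:

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum of the kind used by TCP and IP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd\x00\x01\xff\xfe"
# Simulated corruption: the first two aligned 16-bit words trade places.
swapped = original[2:4] + original[0:2] + original[4:]

same_checksum = internet_checksum(original) == internet_checksum(swapped)
same_crc = zlib.crc32(original) == zlib.crc32(swapped)
print(same_checksum, same_crc)
```

    The checksum cannot tell the two buffers apart, while the CRC can.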
    
    -- Ping
    
    > > * What value does the ability to do an iSCSI ping add to the existing
    > > ability to do an ICMP ECHO?  If little or none, this should be omitted,
    > > see section 3.15.
    > 
    > It makes sure the iSCSI device server is still alive and kicking.
    
    I'm not sure about this one, as a SCSI Inquiry command seems to do about
    the same thing, and verifies that the iSCSI engine can actually do something
    SCSI, as opposed to just answer a ping.  One complication is that Inquiry
    can return a check condition that requires further action, in contrast to a
    self-contained ping.  
    
    -- Combined control and data connections
    
    > If we multiplex LUNs (as we do in the current draft) keeping to a short TCP
    > frame will leave us open to all sorts of troubles (possible deadlocks) due
    > to the limited TCP window and our lack of control over the data source and
    > sink. Separating the control and data stream we could resort to selective
    > resets to get out of trouble - while with a common connection we might have
    > to resort to radical means (e.g., closing connections).
    
    Multiplexing LUNs is the right decision to avoid massive proliferation of
    TCP session state.  Separating the control and data streams is not the only
    way out of "all sorts of troubles".  With combined control and data
    connections, holding another control connection open works for selective
    resets, with the possible exception of Abort Task (which may have to be
    issued on the connection that the task was initiated on).
    
    > In addition in a "permissive" environment (like a video server) we might
    > require CRC on the control connection while leaving the data connections up
    > to the user.
    
    But there's more than enough flexibility to negotiate this behavior.  In any
    case, EMC lives in a part of the world where omitting CRCs is a generally
    bad idea, even for video data.
    
    -- Killing all I/Os
    
    There are two important cases here.  In the first case, if the control
    connection times out and closes (or closes for any other reason), then
    clearly all I/Os have to be killed.  The problem of concern is in the second
    case: opening up a new control connection CAUSES the old one to be closed as
    a side effect.  That seems unnecessary, especially because resets of various
    forms can be issued down the new connection to cause the device to clean up
    and get into a known state.  This interacts with the issue above about
    combining data/control onto one connection and allowing multiple connections
    between an initiator and responder pair.
    
    -- Target Name from Initiator
    
    > The target can ignore any key:value pairs sent by the initiator, so it need
    > not receive its name from the initiator. This feature is useful in case the
    > target is actually a front end for many machines and/or disks, in which
    > case the initiator can specify to which target it really wants to interact
    > with.
    
    I wonder if going there is a good idea, vs. something like a front end
    simply exporting each machine and/or disk on a different TCP port.
    The problem with handing the target its address inband is that the
    connection address no longer fully specifies what the initiator is
    talking to, and that seems wrong.
    
    --David
    
    ---------------------------------------------------
    David L. Black, Senior Technologist
    EMC Corporation, 42 South St., Hopkinton, MA  01748
    +1 (508) 435-1000 x75140, FAX: +1 (508) 497-6909
    black_david@emc.com  Cellular: +1 (978) 394-7754
    ---------------------------------------------------
    



Last updated: Tue Sep 04 01:08:16 2001