
    IP Storage (ips) San Diego Minutes



    IETF IP Storage (ips) Working Group Meeting Minutes
    San Diego IETF Meeting, December 11-12, 2000
    
    ---------- Monday December 11, 2000
    
    EMC will be sending out an IPR notice regarding a patent related to iSCSI
    and FCIP to the mailing list.
    
    Interim meeting being scheduled for week of January 15, to coincide with
    T10 in Orlando - Grosvenor resort.
    
    -- Framework document - Mark Carlson (Sun)
    	Describes environments for IP Storage.  Includes terms, background on
    	various protocols.  This is a living document.
    	Currently more of a survey.
    	This document will coordinate with Naming and Discovery.
    	Looking for more co-authors, please contact Mark if you are interested.
    
    -- Framing discussion -- Randy Haagens (HP) and Allyn Romanow (Cisco)
    - Allyn and Randy were asked to compose this presentation by the ADs.
    	Purpose was to try to clarify the problem and present a range of
    	solutions.
    - Framing is a common challenge for iSCSI and FCIP, as well as for non-IPS
    	documents.  While framing is not explicitly required, a solution for
    	a more effective iSCSI specification is highly desirable.  The focus
    	of the presentation was understanding the requirements of framing
    	(i.e. the problem).  Reaching consensus on a solution was not one of
    	the goals of the presentation.  Allyn started the presentation by
    	pointing out that this topic will also be discussed on Monday night
    	in the TSVWG.
    - The problem: TCP reassembly can be costly, and in some instances not
    	feasible.  Also, there is limited host memory and host bus bandwidth,
    	so one wants to avoid manipulating the data more than once.  Best
    	would be one use of the bus and memory - zero copy.  Note:  This is
    	not the same as TCP zero copy.  TCP typically waits for all the data
    	to arrive, and then copies the data to host.
    - In outbound direction, data can be transferred directly from memory to
    	the protocol controller and out onto the wire.  In the inbound
    	direction, when received out of order, data has to be put in a
    	reassembly buffer until all data is received.
    - One solution: Direct Memory Placement (Payload steering; data steering;
    	RDMA) -- In order to conserve host memory bandwidth, CPU cycles and
    	reduce on-board memory requirements, it is desirable to deliver iSCSI
    	data directly to host buffers, avoiding the overhead of TCP
    	reassembly buffers.  The TCP reassembly buffer can be 250MB for a
    	10Gbps link with 200ms round-trip time.  At 1Gbps, reassembly is
    	possible but very costly.  But at 10Gbps speeds or above, reassembly
    	is no longer feasible.  So, the goal is to get rid of a separate TCP
    	reassembly buffer.  Can decode ULP (iSCSI) headers and place payload
    	directly in host memory without intermediate buffers.  This would
    	not be a conventional NIC card; instead it would be very iSCSI aware,
    	but it would not necessarily process the iSCSI headers, but just use
    	them to determine where to place the data.  As in TCP, the iSCSI
    	stream is presented to the iSCSI protocol processor in-order.
    - In this solution, must address loss of ULP sync - when a segment
    	containing a ULP header is dropped or delayed, ULP sync is lost.
    	Direct data placement cannot continue; data must be diverted to a
    	reassembly buffer.  Goal is to recover ULP sync at the next ULP
    	header.  There are both TCP aware and TCP unaware solutions to
    	recovering ULP sync.
    - TCP unaware approaches:
    	a) SCTP - issues with this include lack of widespread deployment
    	b) Special Characters - requires byte-by-byte processing
    	c) Fixed length ULP messages - Inefficient for short ULP messages
    	d) Periodic Marker - Best solution for this class of approaches.
    		Sublayer of a framing protocol.  Manageable; relatively easy
    		to implement in hardware.  Marker is a 4 byte field giving
    		the number of ULP bytes remaining in the current PDU.  Marker
    		inserted and removed by framing protocol; e.g. iSCSI.  After
    		loss of sync, locate next marker; use it to locate the next
    		ULP PDU.  Markers are transmitted twice in a row; ensures
    		markers cannot be split by stream fragmentation/segmentation.
    - TCP aware Approaches
    	a) URGent pointer - disallowed
    	b) PSH bit - disallowed
    - Another TCP aware approach can be considered by the TSV working group.
    	The TSV working group works on small items in the transport area
    	that do not need a full working group as well as TCP/UDP transport
    	issues.
    - Allyn Romanow presented a technique for demarcating message boundaries
    	using a TCP option.  This consists of using one of the reserved bits
    	in the TCP header to extend TCP to support this type of framing.
    	Then can add up to 40 bytes before the TCP payload.  Problem is that
    	these reserved bits are a scarce resource; need to evaluate the need
    	for the change.  Also any time a change to TCP is proposed, there is
    	tension, e.g. tension between the need to update TCP and stability
    	of TCP.
    - Procedure for standardizing a TCP option consists of
    	a) The IESG has to approve new work items for the TSV wg.
    	b) Ask the Transport Services (TSV) working group to adopt this as a
    		WG item.
    	c) Pros and cons will be discussed on the TSV wg mailing list.  If it
    		is supported, hopefully the spec will be wrapped up at the
    		next IETF (roughly 3 month time frame).  If no support, it's
    		dead.  The advantage of the TSV wg is that transport experts
    		will be able to contribute feedback.
    	d) If supported, will be adopted at next IETF meeting.
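    
    The 250MB figure quoted in the presentation for a 10Gbps link with a 200ms
    round-trip time is consistent with a bandwidth-delay product, i.e. the
    amount of data that can be in flight during one round trip.  A minimal
    sketch of that arithmetic, assuming the worst-case reassembly buffer
    equals the bandwidth-delay product (the minutes do not say how the figure
    was derived; the function name is illustrative):

```python
def reassembly_buffer_bytes(link_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes.

    link_bps * (rtt_ms / 1000) gives bits in flight during one round trip;
    dividing by 8 converts bits to bytes.  Integer math avoids rounding.
    """
    return link_bps * rtt_ms // 8000


for gbps in (1, 10):
    size = reassembly_buffer_bytes(gbps * 10**9, 200)
    print(f"{gbps} Gbps x 200 ms -> {size // 10**6} MB")
```

    Under this assumption the 10Gbps case gives 250 MB and the 1Gbps case
    25 MB, which matches the "costly at 1Gbps, not feasible at 10Gbps"
    argument made in the presentation.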
    
    Advantage is that people who are experts in transport will be able to
    contribute, and that this will not be an iSCSI specific solution.  IPS
    should follow this process and contribute.  Make sure that the solution
    (since not iSCSI specific) meets the needs of this group.
    
    This is a very common problem, that is worthy of consideration at the
    transport layer.  Addresses areas beyond IPS.  The TCP option is not the
    only approach.  TCP header bits could potentially be used for framing.
    
    Message Boundary Option
    	Two approaches.  Not in drafts yet:
    - Flag approach --  Costa has written up; will post as draft.
    	The flag approach may send many packets that are less than MSS.
    	This is potentially a risky change to TCP.  ULP header is aligned
    	with first byte of TCP payload.
    - Offset Approach -- 4 bytes.  2 byte offset indicates offset into TCP
    	payload of first ULP header in the segment.  Write-up forthcoming.
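    
    The offset approach can be pictured as an ordinary TCP option carrying a
    2 byte offset.  The sketch below is an illustration only: the minutes give
    just the sizes (4 bytes in total, 2 byte offset), so the kind/length
    layout of the option, the option kind value and the function names are
    all assumptions, and no such option had been written up at the time.

```python
import struct
from typing import Optional

# Hypothetical option kind; no kind number is given in the minutes.
MSG_BOUNDARY_KIND = 253


def build_option(offset: int) -> bytes:
    """Encode the assumed layout: kind (1), length (1), offset (2, big-endian)."""
    return struct.pack("!BBH", MSG_BOUNDARY_KIND, 4, offset)


def first_ulp_header_offset(options: bytes) -> Optional[int]:
    """Walk a TCP options field and return the offset into the TCP payload
    of the first ULP header, or None if the option is absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed length, stop parsing
        if kind == MSG_BOUNDARY_KIND and length == 4:
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


opts = b"\x01\x01" + build_option(1300)   # two NOPs, then the option
print(first_ulp_header_offset(opts))
```

    A receiver that has lost ULP sync could apply first_ulp_header_offset()
    to the options of the next segment that arrives and resume direct
    placement at that offset in the payload.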
    
    Discussion - Led by Steve Bellovin
    	Steve requested the group concentrate on requirements.  The discussion
    	raised the following points:
    - Another option for alignment - periodic alignment instead of periodic
    	marker.  There could be a requirement in iSCSI that an upper-layer
    	header appear every n kbytes in the TCP stream.  Padding could be
    	used to make sure this happens.
    - This is not the first time that this issue has arisen, and there is
    	value in a general solution that is applicable to other protocols,
    	even though this may take longer to deploy.  The consensus in the
    	room was that a general approach is preferable to one specific to
    	iSCSI.
    - Multiple message boundaries in a single TCP segment are not a problem.
    	Once the first boundary is found, the rest are found by examining
    	the iSCSI (or other ULP) headers.
    - If there is a large gap between message boundaries, the data in the gap
    	will need buffering.  Implementations may wish to consider this in
    	setting maximum data size for a PDU.
    - RDMA is different but related to this topic.  Any RDMA protocol will
    	either incorporate or assume framing.  It may make sense to spec a
    	generalized RDMA protocol on top of this framing mechanism.
    - Implementation of this sort of framing would be optional.
    - A generic data framing protocol may also be a good place to put in a
    	stronger CRC than the 16-bit Internet checksum.  Drafts making
    	specific proposals are welcome.
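    
    The periodic alignment idea raised in the discussion (an upper-layer
    header every n kbytes, enforced with padding) can be sketched as follows.
    This is one possible reading, not a proposal from the meeting: it assumes
    no PDU is longer than the interval, that the sender pads whenever the
    next PDU would straddle a boundary, and an interval of 8 kbytes; the
    names are illustrative.

```python
ALIGN = 8 * 1024  # hypothetical value for the "n kbytes" interval


def pad_before(stream_offset: int, pdu_len: int, align: int = ALIGN) -> int:
    """Padding bytes the sender emits before a PDU so that no PDU straddles
    an alignment boundary; a header then starts at every boundary."""
    if pdu_len > align:
        raise ValueError("PDU is longer than the alignment interval")
    room = align - (stream_offset % align)   # bytes left before the boundary
    return 0 if pdu_len <= room else room


def next_boundary(stream_offset: int, align: int = ALIGN) -> int:
    """Stream offset at which a receiver that lost sync can expect the next
    upper-layer header (ceiling to a multiple of align)."""
    return -(-stream_offset // align) * align


offset = 0
for length in (3000, 3000, 3000):
    pad = pad_before(offset, length)
    offset += pad
    print(f"pad {pad:4d} bytes, header at stream offset {offset}")
    offset += length
```

    With three 3000 byte PDUs the third one does not fit before offset 8192,
    so it is preceded by 2192 bytes of padding and its header lands exactly
    on the boundary.  The cost of the scheme is that padding, which grows as
    PDU sizes approach the interval.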
    
    Steve Bellovin asked for a hum of the room on whether to solve the
    "framing problem" in an iSCSI-specific way or whether to pursue a
    mechanism to add to TCP.  The hum in the room was to do it in TCP.
    
    -- iSCSI document review - presented by Julian Satran.
    
    - Rough consensus has been reached on the session model - Symmetric with
    	optional multiple connections.
    - Login Session context - good understanding.
    - Login Security context - more work needed.
    - Commands, messages, tasks, and tags almost complete.  Items open -
    	coding, some layout.
    - Response numbering scheme is well understood; complete.
    - The data numbering scheme has received no consensus.  It may be removed.
    	Julian's personal opinion is that it's optional and low cost with
    	advantages.
    - For recovery, command restart and status well understood.  No consensus
    	on data recovery.  Digest not well understood; needs to be
    	readdressed.
    - Text commands - negotiation mechanisms done.
    - Mapping moved to T10 (aliasing).  Dropped from iSCSI.
    - RDMA/Sync, Security/Authentication - all are still open issues.
    - Authentication - login phase must provide authentication.  This was the
    	consensus at the last meeting.  Every iSCSI PDU must provide data
    	integrity and authentication.
    - A mechanism should enable optional end2end data
    	protection/authentication.  Would like to use TCP recovery in
    	presence of error.  Digests can be activated at a higher level.
    	Need a mechanism that can be activated on demand, ideally at login.
    - The current digest scheme needs to be changed.  Julian suggested using
    	IPSec for data integrity; since all the above mechanisms are provided
    	by IPSec, it is a best fit for what is needed and very cheap if only
    	what is needed is used.  Can insert own policies, including policies
    	that will verify integrity versus provide security but use same
    	mechanisms.  Policies will be addressed in next two weeks.
    - David:  IPSec does negotiation securely.  What is currently in the draft
    	is most likely vulnerable to man-in-the-middle attack.
    - Steve Bellovin indicated that the IPSec WG would be extremely opposed to
    	any insecure non-cryptographic algorithm being defined for IPSec.
    	Silicon must support SHA-1 or MD5 in order to do key negotiation.
    	There are active discussions/proposals on how to do high speed
    	encryption/negotiation.  Early in process; drafts not yet standards,
    	but worth looking at this.
    - Mark Bakke really wants to maintain the separate iSCSI header/iSCSI
    	payload digests.  This separation is lost by moving to IPSec.
    	Gained data integrity is only as good as the group is willing to
    	pay.  Good integration with encryption.
    - Can use IPSec in transport mode, which will provide end2end protection.
    	Integrity is required end2end, but security may not be.  Security
    	may need to be removed at the firewall/gateway, but need to still be
    	able to verify integrity at the endpoints.  Can have multiple layers
    	of IPSec if needed.  Comment from audience - not recommended.
    - David Peterson of Cisco asked whether ACA will be mandated by the draft.
    	The consensus, after the discussion, is that iSCSI must support ACA
    	but that a device need not support ACA (Ralph Weber pointed out that
    	few initiators use ACA today).  There was some grumbling because ACA
    	is needed for reliable pipelining of ordered commands in the face of
    	errors.
    
    - There was a question on whether asynchronous event notification (AEN)
    	was mandatory to implement in iSCSI.  Again, iSCSI transports must
    	support asynchronous events but iSCSI devices need not.  Somebody
    	pointed out that SCSI mode pages can be used to regulate whether a
    	device generates AENs.
    
    - Ralph Weber (T10 secretary) praised iSCSI for trying to advance the
    	state of the art in SCSI.
    
    -- iSCSI requirements --- presented by Marjorie Krueger (HP)
    
    T10 work on authorization will not be integrated into iSCSI; to the extent
    that SCSI provides authorization, that's T10's domain.  The fact that IP
    networks are less secure than typical SCSI environments have been in the
    past introduces additional issues that need to be addressed here in iSCSI.
    T10 work will be used and referenced where applicable.
    
    It was noted that the point of iSCSI authentication and authorization was
    to control who was able to get to a target.
    
    -- Bootstrapping  -- presented by Prasenjit Sarkar (IBM)
    
    This document contains guidelines for how iSCSI boot clients connect to an
    iSCSI boot server.  Includes a description of how to use existing
    techniques.  iSCSI boot clients need: IP address; iSCSI boot server
    service delivery port name; LUN (default LUN = 0); iSCSI initiator
    software.
    
    Boot process steps:
    	Client software stage
    		Use PXE or related bootp/tftp protocol to get iSCSI
    		initiator software
    	DHCP stage
    		Use DHCP to configure client IP address
    		Use new DHCP option to configure iSCSI boot server service
    		delivery port name
    	Discovery server stage
    		Use "to be defined" iSCSI delivery service to get iSCSI
    
    There was a question on whether the boot client had to have IPsec, in
    light of the integrity proposal by Julian and security proposals by
    others.  It is not required; bootp is sufficient.
    
    The absence of security requirements for boot was pointed out.  The
    current goal of the boot document is to remain neutral on security
    (neither mandate nor disallow).
    
    There was some question on what to do with the iSCSI session once a
    bootstrap program was done with it.  It was noted that it was probably
    simplest to close it and have the loaded program establish a new iSCSI
    session, but this is up to implementations.
    
    -- MIB presentation - Mark Bakke (Cisco)
    
    A group is forming to work on iSCSI MIB.  The scope is management of iSCSI
    as opposed to SCSI.  If necessary, a separate SCSI MIB (if one does not
    already exist) would be addressed separately.  The original MIB structure
    in the current draft is not adequate, and is being redone.  These
    revisions will also bring the MIB up to date with the current iSCSI draft.
    
    A question was raised about how FC-style zoning works with the MIB.  It's
    not clear how zoning fits into the iSCSI architecture.
    
    The MIB could be implemented on anything running iSCSI including
    initiator, target, and gateway.
    
    The FC HBA API available from SNIA might be of interest to this group.  It
    has a complete list of things management tools want to be able to see out
    of an initiator.
    
    
    ----- Tuesday, December 12, 2000
    
    -- Naming and Discovery Requirements - Mark Bakke (Cisco)
    
    Naming and discovery will specify target discovery but would leave LUN
    discovery to SCSI mechanisms, such as REPORT LUNs.  There was a bit of
    debate on this; why not go all the way and support LUN discovery in the
    naming system?  The counter-argument is based on layering: "Leave unto
    SCSI that which is SCSI's".
    
    Scaling requirements include both small and large environments.
    Find targets by querying SNS.  Small environments do not require SNS.
    Hierarchical format, with Naming Authority.
    
    World Wide Unique Identifier
    Address composed of IP addr+TCP port+Target Name, URL like.
    Plan to apply for well known port for TCP.  In such a case, an address w/o
    TCP specified would default to this well known port.
    
    Format includes info on naming authority, including support for 'local'
    naming authority.
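    
    The address described above (IP addr + TCP port + Target Name, URL like,
    with the port defaulting to a well known value when omitted) could be
    handled along these lines.  This is only a sketch: the "iscsi://" scheme,
    the placement of the target name in the path, the example names and the
    default port value are all assumptions, since the minutes define no
    concrete syntax and no port had been assigned.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Placeholder: the minutes only say a well known TCP port will be applied for.
DEFAULT_PORT = 3260


class TargetAddress(NamedTuple):
    host: str
    port: int
    target_name: str


def parse_target_address(address: str) -> TargetAddress:
    """Split a hypothetical iscsi://host[:port]/target-name address."""
    parts = urlsplit(address)
    if parts.scheme != "iscsi" or not parts.hostname:
        raise ValueError(f"not an iscsi:// address: {address!r}")
    target_name = parts.path.lstrip("/")
    if not target_name:
        raise ValueError(f"missing target name: {address!r}")
    # An address without a TCP port falls back to the well known port.
    port = parts.port if parts.port is not None else DEFAULT_PORT
    return TargetAddress(parts.hostname, port, target_name)


print(parse_target_address("iscsi://192.0.2.10:5000/com.example.disk1"))
print(parse_target_address("iscsi://192.0.2.10/com.example.disk1"))
```

    The second call shows the defaulting rule from the notes: with no port in
    the address, the parsed result carries the well known port.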
    
    Character set to be allowed?  Unicode?
    Recommend UI schemes for naming authority.
    Need to look at security issues.
    
    T10 issues - reservations, reset, LUN naming.
    Target reset discussion.  Noted that T10 is thinking of making target
    reset optional.
    
    Is breaking of a connection in iSCSI equivalent to a target reset?
    Consensus is no: the end of a session was equivalent to a target reset and
    would also cause any persistent reservations to be released.
    
    Naming scheme will allow multiple port and multiple initiator/target
    discovery.  Will give list of targets + all paths to that target.
    
    Draft currently an individual submission - consensus (hum) taken, to be
    adopted as working group document.  No opposition hums.
    
    -- iSNS document presented by Josh Tseng, Nishan
    
    iSNS describes a scalable information facility for registration, discovery
    and management of networked facilities.
    
    iSNS follows a client/server architecture.  If a client registers with the
    name server, it allows itself to be managed by the name server.
    
    Why needed? Simplifies storage management implementations.  Allows greater
    scalability over broadcast/multicast discovery methods.  Supports zoning.
    
    Next step - incorporate requirements/suggestions from IPS working group.
    Extend document for FCIP
    
    Access control - what is name server role?  Targets upload public key to
    name server.  Enforced at the end node/target.  Supports both soft and
    hard zoning.
    
    How does it fit into discovery?  Naming and discovery team will look at
    this to see how well it fits.  Should this be maintained as a separate
    document vs incorporated into naming/discovery team?  Yes, this is a
    separate document because it supports more than just iSCSI.
    
    In reading the draft, there is a reliance on WWN.  This draft would need
    to be redone to support the WWUI of the N&D requirements.
    
    Is the direction one that the naming and discovery team approves of?  Yes,
    close.
    
    Is there working group consensus to take this as a base document, working
    w/ the NDT group to produce a revised document, aligned with N&D, which
    would then be adopted as an official wg document?  Rough consensus - next
    revised version of document will become an official working group
    document.  Not unanimous.
    
    -- FCIP - Status and progress of FCIP. - Raj Bhagwat (LightSand)
    
    Current status - difference from previous presentation
    Solution for bridging remote FC SAN islands.  From FC point of view,
    appears to be entirely an FC network.  Initially did not have congestion
    management (previous presentation).
    
    Draft overhauled to incorporate TCP as transport in order to address
    congestion management and recovery mechanisms.  In rev -00, PSH flag
    incorporated.  Based on feedback from mailing list, this was eliminated
    and in -01, a new frame boundary mechanism introduced.  Topics under
    discussion -- QOS, security, MTU/MSS, Framing/synchronization, order of
    delivery, discovery, error recovery.
    
    Alignment with new project in T11 - FC-BB2.  FC-BB2 focused on issues
    outside the scope of the IETF, including link level issues.  Target date
    for completion - June 2001.
    
    Much FC/IP work is being done on conference calls.  Conference calls are
    design team calls open to design team members and authors.  Public review
    is on the mailing list.
    
    An FCIP device is a gateway between an FC SAN and IP network.  Discovery
    of FCIP gateway (device) and other FCIP gateways is currently via static
    configuration.  Dynamic configuration support is envisioned, perhaps
    using iSNS.
    
    David Black will work with authors on revising the QoS text.
    
    -- iFCP - presented by Charles Monia, Nishan
    
    What is the difference between iFCP and FCIP?
    - FCIP is a tunneling model between FC SANs.  A conduit for FC frames to
    	flow transparently to FC network over IP backbone.
    - iFCP network model extends up to the FC storage device itself.  Uses a
    	session model.  Consolidates FC storage switching and routing
    	functions in the IP fabric.  Reduces total cost of ownership,
    	unifies network and storage management domains and exploits IP
    	technology investment.  Extend SAN over lan/man/wan distances.
    
    Next step -- complete the n_port session model.  Encapsulation changes for
    additional end-to-end error detection.
    
    The authors of iFCP would like to see it considered for adoption as a work
    group item.  Adoption of iFCP as a work group item requires modification
    to the WG charter.  David Black requested input on this be sent to the WG
    chairs.  Revising of the charter requires consultation of the area
    directors and working group chairs.
    
    After the presentation, suggestions were made that iFCP and FC/IP should
    merge since they are so similar.  It was pointed out that the two
    protocols take different approaches.  iFCP works by intercepting FC
    logins (connection requests) and modifying FC frames.  In addition, it
    doesn't run FC routing protocols between FC SANs.
    
    Clarification of FCIP and iFCP - the latter is for FCP protocol mapping
    only, whereas FCIP can transport any FC upper level protocol.
    
    FC/IP works at a lower level than iFCP. It doesn't modify FC frames.
    
    FC/IP requires running FC routing/switching protocols between FC domains.
    
    Some thought that iFCP was a superset of FC/IP.
    
    There was a concern that the iFCP gateway would need to run IP routing
    protocols.  It was eventually decided the iFCP gateway was just an IP
    host and didn't have to run IP routing protocols.
    
    	Other comments need to be sent to mailing list or chairs directly.
    
    -- Adaptation Layer presentation -- Randall Stewart, Cisco
    
    Randall Stewart's presentation introduced how the IPS protocols could be
    architected with an adaptation layer independent of the underlying
    transport (i.e. at least both SCTP and TCP).
    
    To do this, a uniform API boundary between the ULP and transport would
    need to be defined.  This would require many changes to all existing
    drafts.  APIs would need to be a message oriented type of mechanism.
    Critical path would need to be done so that they would be protocol
    agnostic.
    
    Transport interface would need to provide methods for passing buffers
    to/from control of transport, e.g. for zero copy.
    
    	Adaptation layer would need to worry about 
    		Framing
    		Zero copy
    		Parallel paths
    		Message retrieval
    		Notifications
    	
    Must be very careful that this API would not make assumptions about the
    transport being used.  In adaptation model, would need to figure out how
    to overcome the issues.
    
    Randall would be more than glad to help by contributing both advice and/or
    drafts to bring about this sort of adaptation layer.
    
    A concern was expressed that the adaptation layer would add too many
    layers between iSCSI and TCP and that a separate protocol should be done
    for SCTP.
    
    It was suggested that the CAM may be an inspiration for the adaptation
    layer.  Others responded that the CAM is at the wrong layer, above iSCSI.
    
    

