RE: iSCSI Naming and Discovery

To: <Black_David@emc.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI Naming and Discovery
From: "Douglas Otis" <dotis@sanlight.net>
Date: Tue, 3 Oct 2000 17:23:27 -0700
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;charset="iso-8859-1"
Importance: Normal
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0704101052@corpmx9.isus.emc.com>
Sender: owner-ips@ece.cmu.edu

David,

You seem to think there is value in not including a binary address as if
this will interfere with NAT.  As these would be defined by the realm
providing the service, there would be nothing gained in attempting to
replace these addresses with names.

> The emails from Daniel Smith and Josh Tseng provide
> a good start on a naming discussion.  With
> my co-chair hat off, let me try to add to it.
>
> The concepts surrounding NAT (Network Address Translation)
> figure strongly in this discussion.  RFC 2663 is a
> good source of background for anyone who needs it.

RFC 2663 warns against depending on IP addresses within the protocol.  This
is to allow responses which do not depend on the client knowing its own IP.
The client would be unable to know its own IP and there lies the rub.

> An attempt to group and summarize what I see in
> Daniel and Josh's emails:
>
> [1] Internet infrastructure is beyond the control
> of storage, including iSCSI [Daniel - a),b),f),g),h)]

Only two points of access are required.

1) Authentication (LDAP)
2) Portal

> [2] DNS is the naming structure of the Internet, and
> hence hosts exporting storage services have to be
> nameable via DNS [Daniel c)].

Only the Authentication Server should be required to be published on DNS.

> [3] Network Address Translation and Network Address
> Port Translation exist and have to be dealt with [Daniel
> d), Josh (1)].

No work is required with respect to a NAT.  Be sure not to make assumptions
about the IP address based on a local sense.  Any gateway may translate this
IP:Port into a different IP:Port in both directions in a transparent
fashion.  This is not a problem, it is a feature that allows connections
without regard to this function provided the source IP and port are not used
to authenticate.
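
A minimal sketch of the point above, assuming a toy NAPT gateway.  The
`Napt` class, the addresses, and the port numbers are all made up for
illustration; nothing here comes from the iSCSI draft.  It shows two clients
arriving at the server from the same public IP, which is why the source
IP and port are a poor basis for authentication.

```python
from itertools import count


class Napt:
    """Toy NAPT gateway: maps private (ip, port) pairs to ports on one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = count(first_port)
        self._out = {}  # (private_ip, private_port) -> public_port
        self._in = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src):
        """Translate a private source endpoint to the endpoint the server sees."""
        if src not in self._out:
            port = next(self._next_port)
            self._out[src] = port
            self._in[port] = src
        return (self.public_ip, self._out[src])

    def inbound(self, dst):
        """Translate a public destination endpoint back to the private endpoint."""
        return self._in[dst[1]]


nat = Napt("203.0.113.1")                        # hypothetical public address
seen_a = nat.outbound(("10.0.0.5", 5000))        # client A, as seen by the server
seen_b = nat.outbound(("10.0.0.6", 5000))        # client B, as seen by the server
```

Both `seen_a` and `seen_b` carry the gateway's IP, and the reverse mapping
still delivers replies to the right private host, so the connection works
without either end doing anything special.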

> [4] Discovery at Internet scale is a hard problem.  A
> flexible naming mechanism keeps options open. [Daniel e)].

That discovery is solved in a very simple fashion.  It is called LDAP.

> [5] SCSI 3rd party commands need to name LUNs, including
> iSCSI LUNs [Daniel i)]

The mapping to those LUNs would be pre-defined via the LDAP server.  The
obvious advantage would be both security and speed.

> *[6] Source, destination, and contents of any packet
> on the Internet are public information [Daniel j)].

Because these packets are public, make an effort not to label the content.

> *[7] Security matters for SSPs [Daniel k)].

Through the use of LDAP, which is hierarchical, both the client and the
provider are allowed to control their realm via the hierarchy.

> [8] SSPs can be expected to connect a lot of clients
> to a single DNS address [Daniel l)].

Because DNS cannot provide the control desired, using an LDAP server to
supply the required information allows the requisite controls.

> *[9] Identification of storage and authentication are
> separate problems and use separate names. [Josh (2)]

The association of the user:target would be done via LDAP.  The client could
control users and the SSP could control general access to their services.
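
A minimal sketch of the user:target association, using a plain dictionary as
a stand-in for the LDAP directory.  The entry names, the attribute names, and
the `targets_for` helper are hypothetical; no real LDAP schema or client
library is implied.  The client's subtree holds the users and the SSP's
subtree holds the targets.

```python
# Stand-in for an LDAP tree: keys are DN-like strings, values are attributes.
# Every name, attribute, address, and port here is illustrative only.
DIRECTORY = {
    "uid=alice,ou=users,o=client": {"allowed": ["cn=vol1,ou=targets,o=ssp"]},
    "uid=bob,ou=users,o=client": {"allowed": []},
    "cn=vol1,ou=targets,o=ssp": {"portal": ("192.0.2.10", 5000), "lun": 0},
}


def targets_for(user_dn):
    """Return the (portal, lun) pairs the directory associates with a user."""
    entry = DIRECTORY.get(user_dn, {})
    result = []
    for target_dn in entry.get("allowed", []):
        target = DIRECTORY[target_dn]
        result.append((target["portal"], target["lun"]))
    return result
```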

> [10] Naming/routing information is necessary for proxies
> to identify which of the entities they are proxying for
> is involved in traffic

This statement assumes that it is better to have a dynamic routing system
for SCSI.  It is not better, nor does it improve the scalability, performance
or security.  At this point in time, there is no SCSI Name Server (SNS) to
provide a means of making routes dynamic.  As the SCSI mapping provided to
the client at the time of authentication can be leased, routes could only
change in a deterministic fashion, but they could change.
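
A minimal sketch of a leased mapping, assuming the lease is a simple
lifetime in seconds attached at authentication time.  The `LeasedMapping`
class and its fields are hypothetical, as are the address and port.

```python
class LeasedMapping:
    """A SCSI mapping handed out at authentication time with a fixed lifetime."""

    def __init__(self, portal, lun, issued_at, lease_seconds):
        self.portal = portal
        self.lun = lun
        self.expires_at = issued_at + lease_seconds

    def is_valid(self, now):
        """The mapping holds until the lease runs out; only then may the route change."""
        return now < self.expires_at


mapping = LeasedMapping(("192.0.2.10", 5000), lun=0, issued_at=1000, lease_seconds=3600)
```

A client holding `mapping` would re-authenticate once `is_valid` turns
false, which is the only point at which a new route could be handed out.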

> [1] and [2] are general descriptions of the Internet.
>
> [6], [7], and [9] are (mostly) about security, and aside
> from noting that [6] is incorrect (all of that info can
> be hidden by a security gateway using IPsec tunnels and
> both IPsec and SSL/TLS hide payloads), security discussion
> might be better deferred as I understand that there will
> be a serious security proposal in the next version of the
> iSCSI draft.

The only security that should be addressed beyond obtaining access to LDAP
would be a means of authenticating the connection in an opaque manner
between both the client and the server.  This would be based on the secure
connection made to the LDAP server and the shared secrets obtained from this
database.

> The biggest underlying problem seems to be how to identify
> an Initiator or Target - it's part of [3], [5], [8] and
> [10].  [4] is about Discovery, which compounds the naming
> issues.

The system would start up using DHCP and obtain information about the
location of the LDAP server.  From that point, the LDAP server would send
the information about the SCSI services.  No naming would be required but a
convention of symbols for access or class objects should be devised for LDAP
retrieval.
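
The startup sequence above can be sketched as follows, with canned replies
standing in for the network.  The field names (`ldap_server`,
`scsi_services`) and all addresses are assumptions made for illustration;
they are not DHCP options or LDAP attributes defined anywhere.

```python
# Hypothetical canned replies; a real client would obtain these over the network.
DHCP_REPLY = {"client_ip": "10.0.0.5", "ldap_server": "192.0.2.53"}
LDAP_SERVERS = {
    "192.0.2.53": {"scsi_services": [{"portal": ("192.0.2.10", 5000), "lun": 0}]},
}


def discover(dhcp_reply, ldap_servers):
    """Follow DHCP -> LDAP -> SCSI services without resolving any names."""
    ldap_addr = dhcp_reply["ldap_server"]
    return ldap_servers[ldap_addr]["scsi_services"]


services = discover(DHCP_REPLY, LDAP_SERVERS)
```

No name is resolved at any step: the only keys are the LDAP location handed
out by DHCP and the agreed symbols used to retrieve the service entries.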

> At a high level there are three basic ways to identify
> initiators and targets:
> - Transport address (e.g., IP address, FC port WWN).

You would not use the IP address from the client.  The IP of the service
provider MUST be routable and no name is required.  To obtain access to the
provider's authentication server, that server should be published on DNS.

As for initiators, these would be identified via LDAP to ascertain user
information.  This is not an IP or URL matter.  The transport MUST provide a
point of access that is routable at the gateway.  The client needs nothing
more with respect to accessing the portal.  Once the portal is accessed, the
LUN addresses should have been assigned during the authentication.  The
portal will be required to verify permission based on these LUNs as
determined during the authentication process.  As such, there would be a
notification process sending the results privately to the portal: who has
been allowed in and which secret and LUN are valid.
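
A minimal sketch of the portal side of this, assuming the client proves
knowledge of the shared secret with an HMAC over a challenge.  The message
does not specify a mechanism, so the HMAC exchange, the `Portal` class, and
the session identifiers are all assumptions made for illustration.

```python
import hashlib
import hmac


class Portal:
    """Toy portal that admits requests based on grants from the auth server."""

    def __init__(self):
        self._grants = {}  # session id -> (shared secret, set of permitted LUNs)

    def notify(self, session_id, secret, luns):
        """Called privately by the authentication server after a successful login."""
        self._grants[session_id] = (secret, frozenset(luns))

    def verify(self, session_id, lun, challenge, response):
        """Admit the request only if the secret matches and the LUN was granted."""
        grant = self._grants.get(session_id)
        if grant is None:
            return False
        secret, luns = grant
        expected = hmac.new(secret, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response) and lun in luns


def client_response(secret, challenge):
    """What a client holding the shared secret would send back to the portal."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()
```

The source IP and port of the connection play no part in `verify`, which
is what lets the same check work unchanged across a NAT.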

> - Identification information provided as part of session

Do not burden the transport layer with exchanging this information.  You
will spend far too much time redefining what already exists.

>   establishment (e.g., username/password, certificate).

Again, a function for LDAP.  You should require a secure connection.

> - Some combination of the above two.
> The third alternative may be problematic if it leads to
> needing both the transport and identification information
> to determine identity.  The discussions I've seen seem to
> be using transport as a hint that may make the identification
> easier to verify, which seems like a reasonable optimization
> to relying on the identification information alone.  Moving
> beyond this (e.g., the *.xyz.com servers may only connect
> from addresses in the a.b.c/24 netblock) increases the
> amount of information that has to be configured (e.g.,
> that example is better left to firewalls to enforce).

Again, we are re-inventing.  There is not a good reason for re-inventing a
SCSI name server used in conjunction with a SCSI protocol proxy.  You could
just as easily obtain this information from an LDAP server in either a
secure or insecure fashion that provides the needed functions and
flexibility.  Unless your desire is to get into the SCSI dynamic router
business, I urge, beg, and plead: no, don't even think about it.

> NAT mechanisms contribute to the problem by producing
> networks in which transport addresses aren't useful for
> identifying anything on the other side of the NAT.  Some
> NATs can be configured to make this problem somewhat simpler
> via static assignment of IP addresses in one domain to IP
> addresses in another.

Yes, these are nailed-down addresses that allow access, such as PRIVATE
IP:Port 80, so that you can reach an internal web server, for example.

> NATs are a thorny subject in IETF - while they are widely
> deployed, not all the important protocols work through them;
> IPsec AH is the most notable example, and FTP requires a kludge
> (er, ah, ALG).  IMHO, restrictions on the use of NATs with iSCSI
> may be ok, provided that the consequences are clearly understood.

FTP must use the passive mode to get past the NAT.  It should not be a
problem for the SCSI transport.  I only hope that the NIH does not take over
to solve some perceived problem.

> NATs seem to be a large piece of the forcing function that is
> leading us away from the transport-based identification
> information used by other SCSI transports.  There are security
> consequences here (e.g., cryptography for session establishment
> may become mandatory to implement and use), and it will likely
> complicate discovery.

Not at all.  You can not be sure who is using what, unless you know the
user.  You can not determine the user from the IP.  Should you wish to boot
the machine, then this drive would only want to know how DHCP identified the
machine.  You have plenty of flexibility within LDAP to solve all of these
issues.

> For example, I noted that booting is an issue - if iSCSI always
> uses URLs to name storage, the result could be a situation in
> which a DNS server has to be operational and reachable in order
> to boot.  This seems wrong, and the obvious answer of using an
> IP address in a URL does not work through NATs, which
> was the original motivation for using URLs.

BOOTP/DHCP is handled by routers.  You simply broadcast 'what am I' and then
you receive your answer as well as where to go.  I know what the response
would be in my case.  Yes, you could have secondary LDAP servers to keep
things reliable.  Chances are you already do.

> Third-party naming is a tarpit.  Putting my WG co-chair hat
> back on for this paragraph only, I observe that global context
> for third party names is an unsolved problem in T10; in
> general, the Initiator of a 3rd party command must use
> names that resolve to the desired LUNs from the 3rd party
> command Target's naming perspective.  How to do this when
> an Initiator and Target don't share a naming context is
> unspecified :-).  While it would be a plus for iSCSI to solve
> this one via having global names for 3rd party commands,
> I don't think this is a requirement (and whatever we do
> will have to be worked through T10, as they have the final
> say on name formats).  WG co-chair hat now comes off ...

This is not a problem.  Simply define the LDAP structures and you're done.

> IMHO, discovery is not getting enough attention.  The proposed
> naming scheme complicates discovery without a compelling
> solution; I'm concerned that the benefits may not justify the
> costs.  My quasi-random walk through this goes something like:
>
> - Not being able to find the boot volume because DNS is down
> 	or not responding seems to be a wrong answer.  The infamous
> 	World Wide Wait is bad enough for a browser; it's unacceptable
> 	for a reboot.

You would not depend on DNS, especially a DNS outside of the facility, to
obtain boot information in most cases.  Your machine already can do a
BOOTP/DHCP.  From there, a TFTP may get you to the next step or you may
elect to specify a scheme that extracts the remaining information from an
LDAP server so that once the BIOS knows, you have something to talk about.

> - Hence the boot volume has to be locatable via IP, a TCP port
> 	(which could be implicit, e.g. the default iSCSI port)
> 	and a LUN (which could be implicit, e.g., LUN 0).  Unlike
> 	DNS, I could see putting this into an iSCSI HBA card BIOS.

You should investigate BOOTP/DHCP.

> - Consistency suggests the approach of using IP addresses as
> 	the primary means of identifying possible storage locations,
> 	with the possible addition of non-default TCP port numbers).

Big problem.  You have not booted yet so you don't have any clue what IP or
name should be used.  You don't even know about TCP yet.

> 	A nice consequence of this is that one can use ranges
> 	(e.g., all the storage is in netblock a.b.c/24 at the
> 	default iSCSI port, scan those 256 addresses as part of
> 	discovery on boot).  The corresponding range wildcarding
> 	mechanisms for URLs will be more complex.

That would not help you.  You still need to find the boot drive and you have
yet to.  Again, because most would wish to key the drive that attaches
after the initial boot drives to a user, LDAP is your solution.  Make a
standard structure that can be accessed by a driver that can talk SCSI
transport to determine all the required settings.

> Even if a centralized configuration repository is used (like Fibre
> Channel), this sort of address wildcarding still looks useful in
> managing the repository.  The netblock example may be too coarse
> a wildcard.

Think of LDAP as your hierarchical repository.

> Returning to the issues at the top of this message:
>
> [3] NAT becomes an issue for network designers/admins.  Storage
> becomes something else that they have to get the IP addressing
> correct for :-(.  There are precedents for this, as the
> default gateway and DNS resolver are already configured via
> IP addresses, and if those things move to different IP
> addresses, stuff breaks (a browser can be very unhappy
> if its host thinks 0.0.0.0 is the only DNS resolver).  The
> downside is that a centralized config repository containing
> IP addresses becomes a NAT issue - the ALG required to
> access that across a NAT is ugly enough that it may be
> necessary to configure the network so that this never
> happens (which is not the best answer, but may be workable).

I fail to understand the concern.  NAT is really not that difficult.

> [4] Discovery based on IP addresses looks like it works for
> boot volumes in a way that URLs don't and scales via wildcarding
> in a fashion superior to URLs.  An underlying assumption
> I'm making is that storage discovery doesn't need to match
> the scale of DNS, and hence centralizing config info isn't
> hobbled by NAT issues.

You could allow the driver an 'if all else fails, use a default IP and LUN
via BOOTP to obtain a boot' fallback.  I don't like it much as it is not
very secure.

> [5] Use of IP addresses would better match the other 3rd
> party addressing modes, and removes a dependency of the
> third party Target on DNS.  The problems created by NATs
> are similar to problems that already exist in 3rd party
> addressing, and hence this doesn't make things worse.

The problem that already exists is that these addresses are not IPs. Why
pretend?

> [8] An IP connection is identified by 2 IP addresses and
> two ports, the fact that several thousand of them go through
> a common DNS, or even a common IP address is not a problem.

Whether the port is well-known or assigned does not matter; the problem you
must still deal with is that there are 4G IP addresses from which to pick,
unless you expect a default name and target address will get you in.  That
would be very dangerous.  You would not even be sure what system you would
be running at that point.  Please Enter Password.... Ha Ha fooled ya.
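
For scale, a small sketch of the arithmetic behind "4G", set against the /24
netblock from the quoted discovery example.  The network 192.0.2.0/24 is a
placeholder for the a.b.c/24 block in that example.

```python
import ipaddress

full_space = 2 ** 32                                           # every possible IPv4 address
netblock = ipaddress.ip_network("192.0.2.0/24").num_addresses  # one /24 netblock
ratio = full_space // netblock                                 # number of /24 blocks in the full space
```

Scanning one /24 covers 256 addresses; without being told which block to
scan, the search space is 4,294,967,296 addresses, or 16,777,216 such blocks.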

> With the exception of the comment on T10 and 3rd party
> naming, this is all IMHO.  Fire away ...


Doug

>
> --David
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>