
    RE: iSCSI/iWARP drafts and flow control

    • To: <>, <>
    • Subject: RE: iSCSI/iWARP drafts and flow control
    • From: <>
    • Date: Thu, 31 Jul 2003 13:42:00 -0600
    • Cc: <>, <>, <>

    All, sorry for the empty reply - I'm not sure how that happened.
    You asked some questions about how the other messages are flow controlled in iSCSI over TCP. The answer is that they aren't flow controlled. If iSCSI gets a PDU it cannot handle, it drops it, and there are provisions to trigger a resend depending on the recovery level supported. The only control on PDUs to the target is on non-immediate commands (both SCSI Command and Task Management Function Request PDUs). Note that when unsolicited non-immediate data is permitted, iSCSI allows a command to generate a command PDU plus an unknown number of SCSI Data-out PDUs to carry the unsolicited data. For iSER, we require that the unsolicited SCSI Data-out PDUs be full when there is enough unsolicited data to fill them (and we created a key to negotiate that size). Therefore, when operating over iSER, the target does know the maximum number of PDUs that the initiator might send per SCSI command.
    There is no deadlock in existing iSCSI because there is no flow control on NOP-In and the target can always send a NOP-In to advance MaxCmdSN.
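    As an illustration of why a NOP-In is enough to reopen the window, here is a minimal sketch of the CmdSN window check. It assumes the usual iSCSI convention that the window runs from ExpCmdSN to MaxCmdSN inclusive, that MaxCmdSN = ExpCmdSN - 1 means the window is closed, and that the counters wrap at 32 bits. The function names are mine, not from any draft.

```python
MOD = 1 << 32          # CmdSN counters are 32-bit and wrap around
HALF = 1 << 31         # serial-number comparison threshold


def serial_lte(a: int, b: int) -> bool:
    """True if a <= b under 32-bit serial number arithmetic."""
    return ((b - a) % MOD) < HALF


def window_size(exp_cmd_sn: int, max_cmd_sn: int) -> int:
    """Number of non-immediate commands the initiator may still send."""
    size = (max_cmd_sn - exp_cmd_sn + 1) % MOD
    # MaxCmdSN == ExpCmdSN - 1 gives size 0 (closed window); anything
    # that looks "negative" in serial arithmetic is also treated as closed.
    return size if size < HALF else 0


def may_send(cmd_sn: int, exp_cmd_sn: int, max_cmd_sn: int) -> bool:
    """True if a non-immediate command with this CmdSN fits the window."""
    return (window_size(exp_cmd_sn, max_cmd_sn) > 0
            and serial_lte(exp_cmd_sn, cmd_sn)
            and serial_lte(cmd_sn, max_cmd_sn))


if __name__ == "__main__":
    # Window closed: ExpCmdSN = 10, MaxCmdSN = 9.
    print(window_size(10, 9))
    # A NOP-In arrives carrying MaxCmdSN = 10; one command may now go.
    print(window_size(10, 10), may_send(10, 10, 10))
```

    Since the NOP-In itself is not subject to any window, the target can always deliver the new MaxCmdSN, which is the point made above.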
    To summarize: in current iSCSI, each opening in the CmdSN window allows from 1 to an unknown number of PDUs, while in iSCSI over iSER each opening in the CmdSN window allows from 1 to n PDUs, where n is the amount of unsolicited data divided by the data per PDU (rounded up, of course).
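    A small worked version of that bound, as a sketch only. The parameter names are hypothetical stand-ins for the negotiated amount of unsolicited data and the negotiated per-PDU data size. The formula is taken literally from the sentence above; whether the command PDU should be counted on top of n is not settled here.

```python
def max_pdus_per_command(unsolicited_bytes: int, data_per_pdu: int) -> int:
    """Upper bound n on PDUs per CmdSN window opening over iSER.

    n = ceil(unsolicited_bytes / data_per_pdu), with a floor of 1
    because a command with no unsolicited data is still one PDU.
    """
    if unsolicited_bytes < 0:
        raise ValueError("unsolicited_bytes must be non-negative")
    if data_per_pdu <= 0:
        raise ValueError("data_per_pdu must be positive")
    # Integer ceiling division avoids floating-point rounding.
    n = (unsolicited_bytes + data_per_pdu - 1) // data_per_pdu
    return max(1, n)


if __name__ == "__main__":
    # 64 KiB of unsolicited data in 8 KiB PDUs.
    print(max_pdus_per_command(65536, 8192))
    # One extra byte forces one more (partly filled) PDU.
    print(max_pdus_per_command(65537, 8192))
```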
    Note also that the CmdSN window is across a session. If you have connections in a session that are running over separate RNICs and are using CmdSN for flow control, each RNIC will have to have access to enough buffers for the whole window to land on it. 
    Between these two factors, CmdSN flow control will require overprovisioning buffers much of the time. Perhaps memory is cheap enough that, for an RNIC with a small number of connections, this is acceptable in exchange for using an existing mechanism. On the other hand, we will have to create a mechanism to handle immediate commands and other PDUs that aren't covered by CmdSN, so it isn't clear to me whether this is the right answer. The downside is overprovisioning buffers, both because sessions can span adapters and because each command might be a write with unsolicited data while many commands are in fact reads. The upside is that the CmdSN window can be managed to respond to changes in load, while a simpler, less responsive mechanism deals with the rest of the traffic.
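    To put rough numbers on the overprovisioning, here is a back-of-the-envelope sketch. All inputs (window size, PDUs per command, number of RNICs, fraction of commands that are writes with unsolicited data) are made-up illustrative values, and the "typical" figure assumes a simple average that is my own modelling assumption, not anything from the drafts.

```python
def worst_case_buffers_per_rnic(window_commands: int, pdus_per_command: int) -> int:
    """Every RNIC must be able to absorb the whole session window,
    with every command being a write carrying maximal unsolicited data."""
    return window_commands * pdus_per_command


def worst_case_buffers_total(window_commands: int, pdus_per_command: int,
                             rnics: int) -> int:
    """Session spanning several RNICs: each one provisions the full window."""
    return rnics * worst_case_buffers_per_rnic(window_commands, pdus_per_command)


def typical_buffers_used(window_commands: int, pdus_per_command: int,
                         write_fraction: float) -> float:
    """Average PDUs in flight if only write_fraction of commands carry
    maximal unsolicited data and the rest (e.g. reads) are one PDU each."""
    per_command = write_fraction * pdus_per_command + (1.0 - write_fraction)
    return window_commands * per_command


if __name__ == "__main__":
    window, n, rnics = 32, 8, 2
    print(worst_case_buffers_per_rnic(window, n))   # provisioned on each RNIC
    print(worst_case_buffers_total(window, n, rnics))
    print(typical_buffers_used(window, n, 0.25))    # what is actually used
```

    With these example inputs, the buffers provisioned across the session are several times what a read-heavy workload would use, which is the trade-off described above.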
    From target to initiator, the initiator knows that each command will generate a response PDU so it can provision that before it sends a command. 
    Login and text negotiation PDUs both ways (other than perhaps the first PDU to open text negotiation) also have a form of iSCSI flow control. For these, having sent a PDU, one can't send another until one has gotten the response.
    R2T PDUs are replaced by RDMA Reads in iSER, so they are covered by the RDMA Read flow control. SCSI Data-out PDUs for solicited data and SCSI Data-in PDUs are replaced by RDMA operations, so we don't have to cover them.
    What isn't flow controlled by iSCSI:

    Initiator to target:
      - Immediate command PDUs: existing iSCSI allows for the target to drop these if it gets more than it can handle, and the initiator can only count on buffering for two, but the initiator can send more than that and hope the target has buffering. One can't count on how many of these there might be.
      - The first Text Request PDU: there can only be one.
      - SNACK Request: in theory, one shouldn't need to send this when operating over iSER, but an initiator might send one if a timeout occurs.
      - NOP-Out: one doesn't expect a lot of them, but there is no limit placed on them.

    Target to initiator:
      - Asynchronous Message
      - NOP-In: one doesn't expect a lot of them, but there is no limit placed on them.

    While there usually won't be a lot of these PDUs, one has no way to put a maximum number on how many there will be.
    There may be times when these PDUs are being sent and there is little or no command traffic. Also the replenishment of buffers for these has little relationship to the command processing. So I don't think one should link the replenishment to command related activities, e.g. advances in MaxCmdSN or reception of a command response.
    That is the problem space.
    Does one put a flow control into iSER that looks at opcode and, for commands, at the I bit to control just the PDUs that aren't limited by MaxCmdSN or does one put in a mechanism that covers all PDUs going through iSER?
    If one does the former, what happens when the transmitting iSER gets a PDU it has to stop because of the flow control and there are MaxCmdSN controlled PDUs after it?
    Is iSCSI supposed to be flow control aware and manage the PDUs so that doesn't happen?
    Does the whole connection transmit stop until more credit arrives?
    Are the MaxCmdSN controlled PDUs allowed to pass the flow controlled PDU and be transmitted?
    Is there a mechanism to disable flow control when the receiver doesn't require it, e.g. large shared buffer pool with statistical provisioning?
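    To make the first option and its follow-on questions concrete, here is a rough sketch of a sender that looks at the opcode and the I bit. It assumes the iSCSI header layout where the opcode is the low six bits of the first byte and the I bit is 0x40, and it follows the classification in this message (only non-immediate SCSI Command and Task Management Function Request PDUs count as MaxCmdSN controlled). The credit counter and the allow_passing switch are hypothetical; they stand in for whatever mechanism iSER might adopt.

```python
OPCODE_MASK = 0x3F        # low six bits of the first header byte
IMMEDIATE_BIT = 0x40      # the "I" bit
SCSI_COMMAND = 0x01
TASK_MGMT_REQUEST = 0x02


def cmdsn_controlled(first_byte: int) -> bool:
    """True if the PDU is already limited by MaxCmdSN (per this message)."""
    opcode = first_byte & OPCODE_MASK
    immediate = bool(first_byte & IMMEDIATE_BIT)
    return opcode in (SCSI_COMMAND, TASK_MGMT_REQUEST) and not immediate


def transmit(queue, credits: int, allow_passing: bool):
    """One pass over the send queue; returns (sent, held).

    PDUs not covered by MaxCmdSN each consume one (hypothetical) credit.
    When such a PDU runs out of credit it blocks. With allow_passing=False
    everything behind it stalls; with allow_passing=True the MaxCmdSN
    controlled PDUs behind it may still go out.
    """
    sent, held = [], []
    blocked = False
    for first_byte in queue:
        if cmdsn_controlled(first_byte):
            if blocked and not allow_passing:
                held.append(first_byte)
            else:
                sent.append(first_byte)
        elif credits > 0 and not blocked:
            credits -= 1
            sent.append(first_byte)
        else:
            blocked = True
            held.append(first_byte)
    return sent, held


if __name__ == "__main__":
    # NOP-Out (immediate), immediate SCSI Command, ordinary SCSI Command, NOP-Out.
    queue = [0x40, 0x41, 0x01, 0x40]
    print(transmit(queue, credits=1, allow_passing=False))
    print(transmit(queue, credits=1, allow_passing=True))
```

    The two calls correspond to "the whole connection stops until more credit arrives" and "MaxCmdSN controlled PDUs are allowed to pass". Note that the second changes the order in which PDUs reach the wire, which is part of what the questions above are asking about.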

