    Re: iSCSI: error recovery



    
    
    
    
    My responses are embedded. Julo
    
    Matt Wakeley <matt_wakeley@agilent.com> on 25/10/2000 23:43:28
    
    Please respond to Matt Wakeley <matt_wakeley@agilent.com>
    
    To:   ips@ece.cmu.edu
    cc:
    Subject:  Re: iSCSI: error recovery
    
    
    
    
    Julian,
    
    My comments to your comments:
    
    julian_satran@il.ibm.com wrote:
    
    > Now for some issues I have with the (current) iSCSI draft:
    >
    > In section 2.2.2 it states "As the only cause for long delays in
    > responses can be failed connections and received responses free-up
    > resources, we felt that score boarding responses at the initiator
    > could be accomplished by simple bitmaps and there is no need to
    > flow-control responses."
    >
    > Score boarding, especially with bit maps, is an operation that can be
    > somewhat CPU heavy in the normal "performance path" of the iSCSI layer.
    > If the ExpStatRN was local to each TCP connection, rather than global
    > across the iSCSI session, then there would be no requirement for score
    > boarding.  The initiator would simply increment the StatRN received on
    > each connection for use in the ExpStatRN for that connection.
    > /<JS>
    > I formulated it wrong.  As commands are associated with a connection
    > at both initiator and target, we always know what commands to reissue
    > and there is no need to flow control responses.  There is no
    > scoreboarding involved.
    > <JS>/
    
    Julian, re-read the above text.  It is referring to Response messages, not
    Command messages.  If StatRN is global across the session, then the
    initiator must perform scoreboarding of the StatRN values received in
    order to figure out what value to set the ExpStatRN to.  Consider the
    example of an iSCSI session consisting of 3 TCP connections (A, B and C).
    The target queues up response messages numbered (StatRN) 1 thru 9 on the
    3 TCP connections thusly: 1->A, 2->B, 3->C, 4->A, 5->B, 6->C, 7->A, 8->B,
    9->C.  Now let's assume that TCP connections A and C are relatively idle,
    but B is blocked (perhaps performing recovery due to a dropped frame).
    The initiator will receive responses 1, 3, 4, 6, 7, 9.  The next ExpStatRN
    value the initiator can supply is 2.  If/when response #2 is received, the
    initiator must supply ExpStatRN of 5, and when that's received, 8.  Thus,
    with a global StatRN, score boarding must be performed by the initiator on
    StatRNs received in order to supply the appropriate ExpStatRN.
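
    The three-connection example above can be sketched in code.  This is
    only an illustration of the bookkeeping, not text from the draft: the
    class names GlobalStatRNScoreboard and PerConnectionStatRN are
    hypothetical, a Python set stands in for the bitmap, and numbering is
    assumed to start at 1 as in the example.

```python
class GlobalStatRNScoreboard:
    """Initiator-side tracking when StatRN is numbered across the session.

    Hypothetical helper; the set plays the role of the bitmap.
    """

    def __init__(self, first_stat_rn=1):
        self.exp_stat_rn = first_stat_rn  # lowest StatRN not yet received
        self.pending = set()              # StatRNs received ahead of the gap

    def receive(self, stat_rn):
        self.pending.add(stat_rn)
        # Advance past every contiguous StatRN that is now in hand.
        while self.exp_stat_rn in self.pending:
            self.pending.remove(self.exp_stat_rn)
            self.exp_stat_rn += 1
        return self.exp_stat_rn


class PerConnectionStatRN:
    """Tracking when each TCP connection numbers its own responses.

    Assumes TCP delivers responses in order on a single connection,
    so a plain counter is enough.
    """

    def __init__(self, first_stat_rn=1):
        self.exp_stat_rn = first_stat_rn

    def receive(self, stat_rn):
        self.exp_stat_rn = stat_rn + 1
        return self.exp_stat_rn


# Connection B is blocked, so only the responses sent on A and C arrive.
session = GlobalStatRNScoreboard()
for rn in (1, 3, 4, 6, 7, 9):
    session.receive(rn)
print(session.exp_stat_rn)   # 2: still waiting for the first response on B
print(session.receive(2))    # 5
print(session.receive(5))    # 8
```

    With per-connection numbering the initiator keeps one
    PerConnectionStatRN per connection and never has to search for gaps.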
    /<JS>
    
    I agree that it would be cheaper to number statuses per connection, as we
    number them only when in transit.  I will state it clearly in the new draft.
    
    <JS>/
    >
    > From an earlier email: "1.1.1.3   Data PDU numbering
    > Incoming Data PDUs MAY be numbered by a target to enable fast recovery of
    > long running READ commands. Data PDUs are numbered with DataRN.  NOP
    > command PDUs carrying the same Initiator Tag as the Data PDUs are used to
    > acknowledge the incoming Data PDUs."
    >
    > Since the only "error" we are trying to recover from is the very rare
    > event that a physical link fails, I fail to see what the benefit is to be
    > able to "recover" at the PDU level.  Plus, you'll have to build into the
    > protocol a mechanism to request retransmission of particular data PDUs.
    > Let's simplify and just send the command with the "retry" bit set.
    > /<JS>
    > You don't have to build a mechanism to request retransmission.
    
    I don't?  Then how is the target supposed to know that it must retransmit
    data that was lost due to connection failure?  How is the target to know
    what connection to transmit that data on?  Of course there needs to be a
    mechanism.  A simple mechanism is to resend the command with the retry bit
    set.
    
    /<JS>
    
    The whole point about data numbering is to allow a target to discard
    read-data when read-data is hard to recover (as in tapes).  Once data is
    acked it is discarded. A target getting a command in restart mode will
    resend whatever data it has still buffered (not acked) and continue from
    there. The initiator would not be any wiser because it is not supposed to
    scoreboard.
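
    The behaviour described here can be sketched as a target-side buffer.
    This is a rough illustration only: the ReadDataBuffer class is
    hypothetical, and it assumes an ack for a given DataRN also covers
    every lower DataRN, which the draft text quoted in this thread does
    not spell out.

```python
class ReadDataBuffer:
    """Target-side buffer of numbered Data PDUs for one READ command.

    Hypothetical helper. Assumes acks are cumulative (an assumption,
    not something stated in the thread).
    """

    def __init__(self):
        self.unacked = {}      # DataRN -> payload, kept in send order
        self.next_data_rn = 1

    def send(self, payload):
        """Number a Data PDU and keep it until it is acknowledged."""
        data_rn = self.next_data_rn
        self.unacked[data_rn] = payload
        self.next_data_rn += 1
        return data_rn

    def ack(self, data_rn):
        """Discard every buffered PDU up to and including data_rn."""
        for rn in [rn for rn in self.unacked if rn <= data_rn]:
            del self.unacked[rn]

    def restart(self):
        """PDUs to resend when the command arrives in restart mode."""
        return list(self.unacked.items())


buf = ReadDataBuffer()
for chunk in (b"a", b"b", b"c", b"d", b"e"):
    buf.send(chunk)
buf.ack(3)             # the initiator acknowledged DataRN 1 through 3
print(buf.restart())   # [(4, b'd'), (5, b'e')]
```

    A target that can reread its media could simply ignore the acks and
    keep nothing buffered.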
    
    <JS>/
    
    >
    > I assume that a clever target will keep only unacked data (the whole
    > point of data PDU numbering is to lower the amount of data a target has
    > to keep for recovery).
    
    I don't think there is any advantage to retransmitting only data that was
    not "acked".  Let's say the data was being sent over a TCP connection to
    initiator HBA #1 of a multi TCP connection iSCSI session.  I think for
    simplicity most iSCSI HBAs will inform the host driver of completed I/Os,
    not "partial" I/Os (especially indicating what was received and what
    wasn't).  When the connection is "failed over" to another connection,
    perhaps running on a different HBA, it will be much simpler to retry the
    whole I/O over the new connection, rather than piece together two partial
    I/Os.  Simpler is better, isn't it?
    
    > At command restart it will resend what it has.
    
    No, command retry (restart) was meant to retry the whole I/O.  It was not
    meant to be "send what you think I didn't get".
    
    > Obviously a
    > target may decide to ignore data acks (especially if it can reread the
    > media) and I assume disk targets will do just that and tapes will use the
    > acks.
    
    I assume by the time iSCSI is implemented, tapes will behave just like
    disks.  Tapes are already moving to that model and there is a question of
    whether the "fc-tape" error recovery in FCP2 will be needed when it is
    completed.
    
    >
    > <JS>/
    > Also from an earlier email:
    >
    > > >Mallikarjun,
    > > >
    > > >Thanks for your comments.
    > > >
    > > >Initiator scoreboarding is not considered. I will try to emphasize
    > > >this even more in the new draft.
    > > >The party responsible for reporting length is the target.  As
    > > >overlapping ranges are not explicitly forbidden this would be a harder
    > > >task than apparent. Reporting counts becomes entirely a question of
    > > >faith!
    > >
    > > I didn't realize that (what FC calls as) data overlay is allowed, FCP
    > > requires this initiator capability to be explicitly stated in session
    > > establishment (process login).  Is there a particular reason why this
    > > is chosen to be allowed by default in iSCSI?
    > >
    >
    > Again, in the interests of simplicity, I request that data overlay be
    > forbidden.  Period.  Otherwise, the initiator would have to perform score
    > boarding at the byte level to be positively sure that each byte was
    > really received.
    > /<JS>
    > That is an interesting point.  I would argue that in the interest of
    > simplicity we will stay neutral.  If we explicitly forbid it then every
    > initiator is bound to check (enforce) it and that is a lot of work.  I
    > assume we will want to use SHOULD.  My point about scoreboarding is that
    > initiators are not required to check (enforce) the overlap.
    
    I disagree Julian.  Forbidding a target from performing a function does not
    mandate that the initiator play policeman and verify that the target is not
    doing what it's forbidden to do.
    
    If we do not forbid it, then the initiator will have to support it, and
    that would be a lot of work.
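
    To illustrate why data overlay forces byte-level scoreboarding, here
    is a small sketch.  The function name covered_bytes and the
    (offset, length) tuples are hypothetical and only stand in for the
    data ranges an initiator would see; offsets are assumed to be
    non-negative.

```python
def covered_bytes(ranges):
    """Count the distinct bytes covered by (offset, length) ranges.

    Overlapping ranges are merged, so a byte delivered twice counts once.
    """
    total = 0
    end_of_last = 0
    for offset, length in sorted(ranges):
        start = max(offset, end_of_last)  # skip bytes already counted
        end = offset + length
        if end > start:
            total += end - start
            end_of_last = end
    return total


expected = 1024

no_overlay = [(0, 512), (512, 512)]
with_overlay = [(0, 512), (256, 512)]

# Both transfers deliver 1024 bytes in total...
print(sum(length for _, length in no_overlay))     # 1024
print(sum(length for _, length in with_overlay))   # 1024

# ...but only the first one actually covers the whole buffer.
print(covered_bytes(no_overlay) == expected)       # True
print(covered_bytes(with_overlay) == expected)     # False (only 768 covered)
```

    If overlay is forbidden, a simple running byte count is sufficient;
    if it is allowed, the initiator needs something like the interval
    tracking above to be sure nothing is missing.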
    
    >
    >
    > <JS>/
    
    -Matt Wakeley
    
    
    
    
    
    
    
    
    

