Disclaimer
This implementation of BGP-4 is constantly improving and evolving, and
examples are not always immediately updated to reflect the most
current version. It is usually (though not always) safe to assume
that an example still runs properly, but the DML files and the output
files may look different from the way they are described in some of
the accompanying documentation.
Compatibility
This example is up-to-date with SSF.OS.BGP4 version 1.1.0.
Conventions
For an explanation of the debugging output used in this example, refer
to BGP Debugging Output Conventions.
Overview
This example shows the behavior of a BGP speaker when one of its peers
is non-responsive.
Files
primary source: drop-peer.dml
library source: dictionary.dml
schema source: net.dml
raw output: drop-peer.out
Running
To run this example, SSFNet must be installed and the three source
files, as listed in the previous subsection, must be accessible. As
an example, if the three source files were together in the same
directory, the following command could be used to run the
simulation:
java SSF.Net.Net 500 drop-peer.dml dictionary.dml net.dml
Discussion
When a BGP speaker has a peer that is not responding, maintaining a
session with it wastes bandwidth, because messages that are no longer
of any use continue to be sent to it. Worse, if the peer is actually
down, data packets forwarded to it will be lost. For these reasons, a
BGP speaker should cease communication with such a peer.
The exact behavior, according to RFC 1771 [1], is that a BGP speaker keeps a timer, the Hold Timer, for each peer. Peers periodically send KeepAlive messages to each other to indicate their continued operation and connectivity. Each time a KeepAlive (or Update) message is received from a peer, the associated timer is reset. If the timer ever expires, the peer is assumed to be inaccessible and communication with it ceases, terminating the session. Before all communication stops, one last Notification message (used to indicate errors) is sent.
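The per-peer timer described above can be sketched in a few lines of Python. This is only an illustration of the reset-and-expire logic, not the SSF.OS.BGP4 implementation; the class name HoldTimer and its methods are hypothetical, and the 90-second interval is simply the value used later in this example.

```python
class HoldTimer:
    """Hypothetical per-peer hold timer: it expires after `interval`
    seconds without any message from the peer."""

    def __init__(self, interval: float, now: float = 0.0) -> None:
        self.interval = interval
        self.deadline = now + interval

    def reset(self, now: float) -> None:
        # Called whenever a message arrives from the peer.
        self.deadline = now + self.interval

    def expired(self, now: float) -> bool:
        # True once the peer has been silent for a full interval.
        return now >= self.deadline


timer = HoldTimer(interval=90.0, now=0.0)
timer.reset(now=10.0)                    # a KeepAlive arrives at t=10
still_up = not timer.expired(now=99.0)   # deadline is t=100, so still alive
dropped = timer.expired(now=100.0)       # silence lasted a full interval
```

In a real speaker the expiry would trigger the Notification message and session teardown; here it is reduced to a boolean check.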
drop-peer.dml is the primary DML source file used for this example. (It may be useful at this point to open that file in a separate window.)
The following debugging output from the simulation (abridged and
formatted for readability) shows how BGP behaves from the perspective
of the speaker doing the dropping. Notice that in drop-peer.dml, the
BGP speaker in Net 0 has a much shorter (and far more reasonable)
KeepAlive Timer Interval value than the BGP speaker in Net 1. The
speaker with the shorter interval will not receive KeepAlive messages
often enough from the other, which will cause it to terminate the
session. A description of the output follows. The BGP speaker in AS 1
is bgp@1:0 and the BGP speaker in AS 2 is bgp@0:0.
001> 0.0 bgp@0:0 ID=0.0.0.17 AS#=2 NHI-AS=0 ASprefix=0.0.0.8/29
002> 0.0 bgp@1:0 ID=0.0.0.18 AS#=1 NHI-AS=1 ASprefix=0.0.0.0/29
003> 4.21E-4 bgp@0:0 snd keepalive to bgp@1:0
004> 4.21E-4 bgp@0:0 rst keepalive timer for bgp@1:0
005> 4.21E-4 bgp@0:0 rst hold timer for bgp@1:0
006> 4.21E-4 bgp@1:0 snd keepalive to bgp@0:0
007> 4.21E-4 bgp@1:0 rst keepalive timer for bgp@0:0
008> 4.21E-4 bgp@1:0 rst hold timer for bgp@0:0
009> 5.52E-4 bgp@0:0 rcv keepalive from bgp@1:0
010> 5.52E-4 bgp@0:0 rst hold timer for bgp@1:0
011> 5.52E-4 bgp@0:0 snd update advertising my AS to bgp@1:0 nlri=0.0.0.8/29
012> 5.52E-4 bgp@0:0 rst keepalive timer for bgp@1:0
013> 5.52E-4 bgp@1:0 rcv keepalive from bgp@0:0
014> 5.52E-4 bgp@1:0 rst hold timer for bgp@0:0
015> 5.52E-4 bgp@1:0 snd update advertising my AS to bgp@0:0 nlri=0.0.0.0/29
016> 5.52E-4 bgp@1:0 rst keepalive timer for bgp@0:0
017> 6.83E-4 bgp@0:0 rcv update frm bgp@1:0 nlri=0.0.0.0/29
018> 6.83E-4 bgp@0:0 rst hold timer for bgp@1:0
019> 6.83E-4 bgp@1:0 rcv update frm bgp@0:0 nlri=0.0.0.8/29
020> 6.83E-4 bgp@1:0 rst hold timer for bgp@0:0
021> 30.000552 bgp@0:0 exp keepalive timer for bgp@1:0
022> 30.000552 bgp@0:0 rst keepalive timer for bgp@1:0
023> 30.000552 bgp@0:0 snd keepalive to bgp@1:0
024> 30.000683 bgp@1:0 rcv keepalive from bgp@0:0
025> 30.000683 bgp@1:0 rst hold timer for bgp@0:0
026> 60.000552 bgp@0:0 exp keepalive timer for bgp@1:0
027> 60.000552 bgp@0:0 rst keepalive timer for bgp@1:0
028> 60.000552 bgp@0:0 snd keepalive to bgp@1:0
029> 60.000683 bgp@1:0 rcv keepalive from bgp@0:0
030> 60.000683 bgp@1:0 rst hold timer for bgp@0:0
031> 90.000552 bgp@0:0 exp keepalive timer for bgp@1:0
032> 90.000552 bgp@0:0 rst keepalive timer for bgp@1:0
033> 90.000552 bgp@0:0 snd keepalive to bgp@1:0
034> 90.000683 bgp@1:0 rcv keepalive from bgp@0:0
035> 90.000683 bgp@1:0 rst hold timer for bgp@0:0
036> 90.000683 bgp@0:0 exp hold timer for bgp@1:0
037> 90.000683 bgp@0:0 snd notification to bgp@1:0
038> 90.000814 bgp@1:0 rcv notification from bgp@0:0
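Each entry in the output above has the same shape: a line number, a simulation time, the reporting speaker, and an event description. For readers who want to process the raw output, the following Python sketch splits one entry into those four fields. The field layout is inferred from the abridged output shown here rather than from a format specification, and the names LINE_RE and parse_entry are hypothetical.

```python
import re

# Assumed layout: "<number>> <time> <speaker> <event text>"
LINE_RE = re.compile(r"^(\d+)>\s+(\S+)\s+(bgp@\S+)\s+(.*)$")


def parse_entry(line):
    """Split one debugging-output entry into (number, time, speaker, event)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized entry: " + repr(line))
    number, time, speaker, event = match.groups()
    # float() also accepts the exponent notation used for small times.
    return int(number), float(time), speaker, event


entry = parse_entry("036> 90.000683 bgp@0:0 exp hold timer for bgp@1:0")
early = parse_entry("003> 4.21E-4 bgp@0:0 snd keepalive to bgp@1:0")
```

Note that times such as 4.21E-4 parse to 0.000421, matching the way they are quoted in the description.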
We examine the simulation from the perspective of the BGP speaker in AS 2. The first non-zero time indicated in the output is 0.000421 (4.21E-4). Between time 0.0 and this time, the two BGP speakers were establishing a connection, or session, between themselves using TCP. By time 0.000421, this session was fully established. In line 3, we see that the BGP speaking router in AS 2 sent a KeepAlive message to the BGP speaker in AS 1. This type of message is sent periodically to let the peer know that the BGP speaker is still up and running. Lines 4 and 5 show that this BGP speaker also starts (resets) the KeepAlive and Hold Timers for its peer in AS 1. The KeepAlive Timer reminds the speaker to send a KeepAlive message at regular intervals. The Hold Timer, if it ever expires, indicates that a neighboring speaker has not been heard from for a long time and should be assumed to be down. Line 15 indicates that at time 0.000552, the BGP speaking router at AS 1 sent an Update message advertising the routing information about AS 1 to the BGP speaker in AS 2. In a similar fashion, the speaker at AS 2 advertises routing information to AS 1 (line 11). On lines 17 and 19 we see that the speakers receive each other's advertisements. Notice that they immediately reset their Hold Timers (lines 18 and 20), which they do every time they receive an indication that their peer is still up and running.
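The periodic behavior of the KeepAlive Timer can be checked against the output with a small Python sketch. It assumes the timer is restarted immediately on each expiry, that the speaker at AS 2 last reset it at time 0.000552 (line 12), and that its interval is 30 seconds, as the spacing of the later expirations in the output suggests. The function name keepalive_expiries is hypothetical, and Decimal is used only to avoid floating-point rounding in the comparison.

```python
from decimal import Decimal


def keepalive_expiries(last_reset, interval, until):
    """Return the expiry times of a timer that is restarted on each expiry."""
    times = []
    t = last_reset + interval
    while t <= until:
        times.append(t)
        t += interval
    return times


# Speaker in AS 2: last reset at 0.000552, assumed 30-second interval.
expiries = keepalive_expiries(Decimal("0.000552"), Decimal("30"), Decimal("95"))
# expiries holds 30.000552, 60.000552 and 90.000552
```

These are the times at which the speaker in AS 2 reports an expired KeepAlive Timer in lines 21, 26, and 31.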
This is a good point to note that, as specified in the DML file, the KeepAlive Timer Interval for one speaker is greater than the Hold Timer Interval for its peer. This is cause for concern, since the Hold Timer could expire before the KeepAlive Timer at the other end even has a chance to force a KeepAlive message to be sent to maintain the session. Normally, the KeepAlive Timer Interval is calculated to be 1/3 of the Hold Timer Interval, which itself is negotiated upon session establishment to be the minimum of the two proposed values (in this example, both proposed values are 90). The KeepAlive Timer Interval values differ here because explicitly configured values override the calculated ones, which also serves the purpose of the example.
If we look down through the output, we see that the KeepAlive Timer at the speaker at AS 2 expires and is reset every 30 seconds, and a KeepAlive message is sent each time (lines 21/22/23, 26/27/28, and 31/32/33). Meanwhile, the speaker at AS 1 is receiving these messages but never sends any itself. Its KeepAlive Timer will not expire until roughly time 100.0. But before that can happen, the Hold Timer at AS 2 expires, at time 90.000683 (line 36). It was last set at time 0.000683, when the original advertisement from AS 1 was received (line 18). With this expiration, a Notification message is sent to AS 1, indicating the error (line 37). When the Notification message is received at AS 1 (line 38), the session is terminated from that end as well, and there is no further communication.
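The timing argument above reduces to a few lines of arithmetic, sketched here in Python. The 100-second KeepAlive Timer Interval for the speaker in AS 1 is an assumption inferred from the expiry time quoted in the text (drop-peer.dml is not reproduced here). The reset times are taken from lines 16 and 18 of the output, and all variable names are hypothetical.

```python
from decimal import Decimal

# Hold Timer Interval: the minimum of the two proposed values.
proposed_hold = (Decimal("90"), Decimal("90"))
hold_interval = min(proposed_hold)

# Default KeepAlive Timer Interval: one third of the Hold Timer Interval.
default_keepalive = hold_interval / 3

# Assumed explicit override for the speaker in AS 1 (not read from the DML).
configured_keepalive_as1 = Decimal("100")

last_heard_from_as1 = Decimal("0.000683")  # line 18: AS 2 resets its Hold Timer
as1_keepalive_reset = Decimal("0.000552")  # line 16: AS 1 resets its KeepAlive Timer

hold_expiry_at_as2 = last_heard_from_as1 + hold_interval
next_keepalive_from_as1 = as1_keepalive_reset + configured_keepalive_as1

# The session is dropped if the Hold Timer fires before the next KeepAlive is sent.
session_dropped = hold_expiry_at_as2 < next_keepalive_from_as1
```

Under these assumptions the Hold Timer at AS 2 fires at 90.000683, about ten seconds before AS 1 would send its next KeepAlive message, so the session is dropped.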