PIM-SM

Multicast VPN

PIM-SM, PIM-SSM, and PIM-BIDIR are all supported inside the provider core for MVPN.

PIM-SM or PIM-SSM is the recommended PIM option in the provider core, because PIM-BIDIR is not yet supported by all platforms. PIM-SM, PIM-SSM, PIM-BIDIR, and PIM-DENSE-MODE are all supported inside the MVPN.

MVPN introduces the concept of Multicast Distribution Trees (MDTs). An MDT is sourced by a PE router and has a multicast destination address. PE routers that have sites for the same MVPN all source traffic to a Default-MDT and also join it to receive traffic.

There is a distinction between Default-MDTs and Data-MDTs. A Default-MDT is a tree that is 'always-on' and transports PIM control traffic, dense-mode traffic, and RP-tree (*,G) traffic. All PE routers configured with the same Default-MDT will receive this traffic.

Data-MDTs are trees that are created on demand and are joined only by the PE routers that have interested receivers for the traffic. Their creation can be triggered by a traffic-rate threshold, a source-group pair, or both.

Default-MDTs must have the same group address for all VRFs that comprise an MVPN. Data-MDTs may have the same group address if PIM-SSM is used. If PIM-SM is used, they should have a different group address, as providing the same one could result in the PE router receiving unwanted traffic. This is a PIM-SM protocol issue, not an implementation issue.

 

show ip pim mdt bgp

show ip bgp vpnv4 all

show ip mroute

show ip pim neighbors

show ip pim vrf catalunya neighbors

show ip pim vrf catalunya rp mapping

 

PE_BARCELONA#show ip pim mdt bgp

Peer (Route Distinguisher + IPv4)	Next Hop

 MDT group 239.232.0.0

 2:1:1:10.0.0.2	10.0.0.2

 2:1:1:10.0.0.3	10.0.0.3

 2:1:1:10.0.0.4	10.0.0.4

2:1:1 indicates the RD-type (2) and RD (1:1) associated with this update.

The remaining part is the address used to source the BGP session.
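As a rough illustration of how a peer entry breaks down, the Python sketch below splits a string such as 2:1:1:10.0.0.2 into the parts just described. The names parse_mdt_peer and MdtPeer are hypothetical, and the sketch assumes the RD always has the two-part number:number form seen in this output.

```python
import ipaddress
from typing import NamedTuple


class MdtPeer(NamedTuple):
    rd_type: int                    # leading RD-type field (2 in the output above)
    rd: str                         # route distinguisher, e.g. "1:1"
    source: ipaddress.IPv4Address   # address used to source the BGP session


def parse_mdt_peer(entry: str) -> MdtPeer:
    """Split '<rd-type>:<rd-part1>:<rd-part2>:<ipv4>' into its parts.

    Assumption: exactly four colon-separated fields, as in the sample output.
    """
    rd_type, rd_part1, rd_part2, address = entry.strip().split(":")
    return MdtPeer(int(rd_type), f"{rd_part1}:{rd_part2}", ipaddress.IPv4Address(address))


if __name__ == "__main__":
    for line in ("2:1:1:10.0.0.2", "2:1:1:10.0.0.3", "2:1:1:10.0.0.4"):
        print(parse_mdt_peer(line))
```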

Alternatively, 'show ip bgp vpnv4 all' can be used.

PE_BARCELONA#show ip bgp vpnv4 all

BGP table version is 24, local router ID is 10.0.0.1

Status codes:	s suppressed, d damped, h history, * valid, > best, i - internal,

	r RIB-failure, S Stale

Origin codes:	i - IGP, e - EGP, ? - incomplete

	Network	Next Hop	Metric	LocPrf	Weight Path

Route Distinguisher: 1:1 (default for vrf catalunya)

*> 20.0.1.0/24	0.0.0.0	0		32768 ?

*>i20.0.2.0/24	10.0.0.2	0	100	0 ?

*>i20.0.3.0/24	10.0.0.3	0	100	0 ?

*> 20.1.0.0/22	20.0.1.2	0		32768 ?

*> 20.1.0.0/16	20.0.1.2	0		32768 ?

*>i20.2.0.0/24	10.0.0.2	1	100	0 ?

*>i20.3.0.0/24	10.0.0.3	1	100	0 ?

*>i20.4.0.0/24	10.0.0.4	0	100	0 ?

*> 20.5.5.5/32	20.0.1.2	1		32768 ?

Route Distinguisher: 2:1:1

*> 10.0.0.1/32	0.0.0.0			0 ?

*>i10.0.0.2/32	10.0.0.2	0	100	0 ?

*>i10.0.0.3/32	10.0.0.3	0	100	0 ?

*>i10.0.0.4/32	10.0.0.4	0	100	0 ?

Step 2. Verify the global mroute table

Use 'show ip mroute <mdt-group-address>' to verify that there is a (Source,Group) entry for each PE router. Because PIM-SSM is used, the source is the loopback address used to source the BGP session and the group is the configured MDT address. Without traffic, only Default-MDT entries will be visible.

PE_BARCELONA#show ip mroute 239.232.0.0

IP Multicast Routing Table

Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,

 L - Local, P - Pruned, R - RP-bit set, F - Register flag,

 T - SPT-bit set, J - Join SPT, M - MSDP created entry,

 X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,

 U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel,

 Y - Joined MDT-data group, y - Sending to MDT-data group

Outgoing interface flags: H - Hardware switched

 Timers: Uptime/Expires

 Interface state: Interface, Next-Hop or VCD, State/Mode

(10.0.0.1, 239.232.0.0), 00:18:40/00:03:14, flags: sTZ

 Incoming interface: Loopback0, RPF nbr 0.0.0.0

 Outgoing interface list:

 Serial 1/0, Forward/Sparse, 00:17:53/00:02:31

(10.0.0.2, 239.232.0.0), 00:17:52/00:02:44, flags: sTIZ

 Incoming interface: Ethernet0/0, RPF nbr 10.1.0.2

 Outgoing interface list:

 MVRF catalunya, Forward/Sparse, 00:17:52/00:00:00

(10.0.0.3, 239.232.0.0), 00:17:47/00:02:44, flags: sTIZ

 Incoming interface: Ethernet0/0, RPF nbr 10.1.0.2

 Outgoing interface list:

 MVRF catalunya, Forward/Sparse, 00:17:47/00:00:00

(10.0.0.4, 239.232.0.0), 00:17:46/00:02:44, flags: sTIZ

 Incoming interface: Ethernet0/0, RPF nbr 10.1.0.2

 Outgoing interface list:

 MVRF catalunya, Forward/Sparse, 00:17:46/00:00:00

Verify that the 's' flag is set on each (S,G) entry, which indicates that this group is used in SSM mode. Verify that the 'Z' flag is set, indicating that this PE router is a leaf of the multicast tunnel. When the router is a leaf of a multicast tunnel, it has to do additional lookups to determine which MVRF to forward the traffic to, because it is effectively a receiver for this traffic.

Verify that the 'I' flag is set on the (S,G) entries for the remote PE routers. This flag indicates that the router understands it is joining an SSM group. It is as though an IGMPv3 host had requested to join that particular channel.
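The flag checks in this step can be expressed as a small Python sketch. The flag legend is copied from the 'show ip mroute' output above; the function names decode_flags and check_default_mdt_entry are hypothetical helpers, not part of any Cisco tooling.

```python
# Flag legend copied from the "show ip mroute" output above.
MROUTE_FLAGS = {
    "D": "Dense", "S": "Sparse", "B": "Bidir Group", "s": "SSM Group",
    "C": "Connected", "L": "Local", "P": "Pruned", "R": "RP-bit set",
    "F": "Register flag", "T": "SPT-bit set", "J": "Join SPT",
    "M": "MSDP created entry", "X": "Proxy Join Timer Running",
    "A": "Candidate for MSDP Advertisement", "U": "URD",
    "I": "Received Source Specific Host Report", "Z": "Multicast Tunnel",
    "Y": "Joined MDT-data group", "y": "Sending to MDT-data group",
}


def decode_flags(flags: str) -> list[str]:
    """Translate a flag string such as 'sTIZ' into its legend descriptions."""
    return [MROUTE_FLAGS[flag] for flag in flags]


def check_default_mdt_entry(flags: str, is_remote_pe: bool) -> list[str]:
    """Return the flags this step expects that are missing from the entry."""
    required = {"s", "Z"}
    if is_remote_pe:
        required.add("I")   # only checked on entries sourced by remote PEs
    return sorted(required - set(flags))


if __name__ == "__main__":
    print(decode_flags("sTIZ"))
    print(check_default_mdt_entry("sTZ", is_remote_pe=False))
```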

Step 3. Verify PIM neighbors in the global table

Use the 'show ip pim neighbors' command on all PE and P routers to verify that the PIM neighbors are set up properly in the global table.

PE_BARCELONA#show ip pim neighbor

PIM Neighbor Table

Neighbor	Interface	Uptime/Expires	Ver	DR

Address				Priority/Mode

10.1.0.2	Serial 1/0	00:18:36/00:01:21	v2	1 / S

The example above shows that PE_BARCELONA has correctly set up a PIM neighborship in the global table with the P router.

Step 4. Verify PIM neighbors inside the VPN

Use 'show ip pim vrf catalunya neighbors' on all PE routers to verify that the CE router is seen as a PIM neighbor and that the remote PE routers are seen as PIM neighbors over the tunnel.

PE_BARCELONA#show ip pim vrf catalunya neighbor

PIM Neighbor Table

Neighbor	Interface	Uptime/Expires	Ver	DR

Address				Priority/Mode

20.0.1.2	Ethernet0/0	00:18:30/00:01:27	v2	1 / DR

10.0.0.3	Tunnel0	00:17:40/00:01:18	v2	1 /

10.0.0.4	Tunnel0	00:17:40/00:01:19	v2	1 / DR

10.0.0.2	Tunnel0	00:17:40/00:01:19	v2	1 /

The PE router correctly has a PIM neighbor relationship with the CE router on interface Ethernet0/0, and all remote PE routers are also seen as PIM neighbors over the tunnel.

Step 5. Verify the VPN Group to RP mapping

Use 'show ip pim vrf catalunya rp mapping' to verify that the PE router has correctly learned the Group-to-RP mapping information from the VPN.

PE_BARCELONA#show ip pim vrf catalunya rp map

PIM Group-to-RP Mappings

Group(s) 224.0.0.0/4

	RP 20.5.5.5 (?), v2v1

	Info source: 20.5.5.5 (?), elected via Auto-RP

	Uptime: 00:10:34, expires: 00:02:24

The PE router has correctly learned the Group to RP mapping, which is used inside the VPN. Auto-RP is used here inside the VPN.

Once all of the above has been verified, the default-MDT reaches all PE routers, with multicast replication done in the core of the provider network. Figure 2 presents a conceptual overview of this.

With only a default-MDT configured, traffic will go to all PE routers, regardless of whether they want to receive the traffic.

Electing an RP (in brief)

The PIMv2 RFC states:

“This specification does not mandate the use of a single mechanism to provide routers with the information to perform the group-to-RP mapping. Currently four mechanisms are possible, and all four have associated problems:

  1. Static Configuration: A PIM router MUST support the static configuration of group-to-RP mappings. Such a mechanism is not robust to failures, but does at least provide a basic interoperability mechanism.
  2. Embedded-RP: Embedded-RP defines an address allocation policy in which the address of the Rendezvous Point (RP) is encoded in an IPv6 multicast group address.
  3. Cisco’s Auto-RP: Auto-RP uses a PIM Dense-Mode multicast group to announce group-to-RP mappings from a central location.
  4. BootStrap Router (BSR): RFC 2362 specifies a bootstrap mechanism based on the automatic election of a bootstrap router (BSR). Any router in the domain that is configured to be a possible RP reports its candidacy to the BSR, and then a domain-wide flooding mechanism distributes the BSR’s chosen set of RPs throughout the domain.”

In this post we will look briefly at 1, 3, and 4.

Static Configuration

Configuring a static RP is fairly straightforward. The same RP is configured on each router in the multicast domain. There are some variations where you can assign groups to different RPs to distribute the multicast load in the network.

Auto-RP

Auto-RP makes use of two multicast groups, 224.0.1.39 and 224.0.1.40. The first address, 224.0.1.39, is used by the candidate RPs to announce their availability to be the RP for all or some multicast groups. The mapping agents listen for the candidate RP announcements and then send out group-to-RP mappings to the second group, 224.0.1.40.

BSR

Within each multicast domain at least one of the routers must be configured as a candidate BSR. Each candidate BSR sends Bootstrap messages (BSMs) out of all its interfaces to the ALL-PIM-ROUTERS (224.0.0.13) address. When a router receives a Bootstrap message sent to ALL-PIM-ROUTERS, it does the following:

  1. If the message was not sent by the RPF neighbor towards the BSR address included in it, the message is dropped. (That is, if a PIM router receives a BSM on an interface that is not on its route back to the BSR, the BSM is dropped.)
  2. If the BSM is received on the interface that leads back towards the BSR, it is forwarded out of all PIM interfaces, excluding the one over which the message arrived, to the ALL-PIM-ROUTERS group with a TTL of 1.
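The two rules above can be sketched in Python roughly as follows. This is only a model of the accept-or-drop decision: the Bsm class, the process_bsm function, and the idea of passing in the RPF neighbor and RPF interface towards the BSR as plain strings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bsm:
    bsr_address: str        # BSR address carried inside the message
    sender: str             # neighbor that sent the message on this link
    arrival_interface: str  # interface the message was received on


def process_bsm(bsm: Bsm, rpf_neighbor: str, rpf_interface: str,
                pim_interfaces: List[str]) -> List[str]:
    """Return the interfaces to re-flood the BSM on; an empty list means dropped.

    rpf_neighbor and rpf_interface are what a unicast lookup towards
    bsm.bsr_address would return (the lookup itself is not modelled here).
    """
    if bsm.sender != rpf_neighbor or bsm.arrival_interface != rpf_interface:
        return []  # rule 1: not from the RPF neighbor towards the BSR
    # rule 2: forward out of every PIM interface except the arrival interface
    return [ifc for ifc in pim_interfaces if ifc != bsm.arrival_interface]


if __name__ == "__main__":
    interfaces = ["Serial1/0", "FastEthernet2/0", "FastEthernet1/0"]
    good = Bsm("4.4.4.4", "10.0.0.1", "Serial1/0")
    print(process_bsm(good, "10.0.0.1", "Serial1/0", interfaces))
```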

The original messages sent out by the BSR do not contain any RP information as the candidate-RP has yet to be configured. Once a candidate-RP has been configured it sends a unicast candidate-RP advertisement to the BSR which in turn sends out a BSM containing the RP information to all the routers in the multicast domain.

The BSR collects the information from all candidate RPs. It places the information for all candidate RPs into subsequent Bootstrap messages. The BSR performs the election of the active RP of each group range only for its own use. Each router in the domain is responsible for running the RP-selection hash algorithm on the candidate RP information contained in the Bootstrap messages.
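For reference, here is a hedged Python sketch of the RP-selection hash as I read it in RFC 4601, section 4.7.2. It assumes IPv4 and assumes the candidate list has already been narrowed to RPs with the same matching group range and the same priority; the function names and the candidate addresses in the example are made up for illustration.

```python
import ipaddress
from typing import Iterable


def rp_hash_value(group: str, hash_mask_len: int, candidate_rp: str) -> int:
    """Value(G,M,C) = (1103515245 * ((1103515245 * (G&M) + 12345) XOR C) + 12345) mod 2^31."""
    g = int(ipaddress.IPv4Address(group))
    c = int(ipaddress.IPv4Address(candidate_rp))
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF
    return (1103515245 * ((1103515245 * (g & mask) + 12345) ^ c) + 12345) % (2 ** 31)


def select_rp(group: str, hash_mask_len: int, candidates: Iterable[str]) -> str:
    """Pick the candidate with the highest hash value; ties go to the highest address."""
    return max(
        candidates,
        key=lambda rp: (rp_hash_value(group, hash_mask_len, rp),
                        int(ipaddress.IPv4Address(rp))),
    )


if __name__ == "__main__":
    # Hypothetical candidate RPs; the hash mask length would come from the BSM.
    print(select_rp("224.1.1.1", 30, ["10.0.0.2", "10.0.0.3", "10.0.0.4"]))
```

Because every router runs the same function over the same candidate set, they all arrive at the same RP for a given group without the BSR having to announce a winner.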

Note:

BSR differs from Auto-RP in that BSR does not create or require that any state is maintained in the mroute table. When using Auto-RP a method must be implemented to dense-flood the RP information throughout the multicast domain. This can be done in one of two ways, either implement PIM-Sparse-Dense mode or use the Auto-RP listener feature.

PIM Assert Messages

Where multiple PIM routers peer over a shared LAN, it is possible for more than one upstream router to have valid forwarding state for a packet, which can lead to packet duplication. PIM does not attempt to prevent this from occurring. Instead, it detects when this has happened and elects a single forwarder amongst the upstream routers to prevent further duplication. This election is performed using PIM Assert messages. Assert messages are also received by downstream routers on the LAN, and these cause subsequent Join/Prune messages to be sent to the upstream router that won the Assert.

The above content was taken from RFC 4601

Building the shared and shortest path Tree

The Shared tree is rooted at the RP and the shortest path tree is rooted at the Source of the multicast traffic.

The following processes will be discussed.

  1. Building the shared tree between the receivers and the RP
  2. Forwarding unicast PIM Register packets from the source to the RP
  3. Building the shortest Path tree between the RP and the source
  4. Building the Shortest-Path Tree between the Source and Receiver

Building the shared tree between the receivers and the RP

  1. A multicast receiver expresses its interest in receiving traffic destined for a multicast group. Typically, it does this using IGMP or MLD.
  2. One of the receiver’s local routers is elected as the Designated Router (DR) for that subnet.
  3. On receiving the receiver’s expression of interest, the DR then sends a PIM Join message towards the RP for that multicast group.
  4. This Join message is known as a (*,G) Join because it joins group G for all sources to that group.
  5. The (*,G) Join travels hop-by-hop towards the RP for the group, and in each router it passes through, a multicast tree state for group G is instantiated.
  6. Eventually, the (*,G) Join either reaches the RP or reaches a router that already has (*,G) Join state for that group.
  7. When many receivers join the group, their Join messages converge on the RP and form a distribution tree for group G that is rooted at the RP.
  8. This is known as the RP Tree (RPT), and is also known as the shared tree because it is shared by all sources sending to that group.
  9. Join messages are resent periodically so long as the receiver remains in the group.
  10. When all receivers on a leaf-network leave the group, the DR will send a PIM (*,G) Prune message towards the RP for that multicast group.
  11. However, if the Prune message is not sent for any reason, the state will eventually time out.
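The hop-by-hop behaviour in steps 5 to 7 can be modelled with a toy Python sketch. It is not a PIM implementation: the router names, the send_star_g_join function, and the dictionary used to hold per-router state are all hypothetical, and timers, Prunes, and periodic refreshes are left out.

```python
from typing import Dict, List, Set


def send_star_g_join(path_to_rp: List[str], group: str,
                     state: Dict[str, Set[str]]) -> List[str]:
    """Propagate a (*,G) Join along path_to_rp (DR first, RP last).

    Returns the routers that instantiated new (*,G) state. The Join stops
    at the first router that already has state for the group.
    """
    created = []
    for router in path_to_rp:
        groups = state.setdefault(router, set())
        if group in groups:
            break  # already on the shared tree, nothing more to do upstream
        groups.add(group)
        created.append(router)
    return created


if __name__ == "__main__":
    tree_state: Dict[str, Set[str]] = {}
    print(send_star_g_join(["DR1", "R1", "RP"], "224.1.1.1", tree_state))
    # A second receiver behind DR2 merges into the existing tree at R1.
    print(send_star_g_join(["DR2", "R1", "RP"], "224.1.1.1", tree_state))
```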

Forwarding unicast PIM Register packets to the RP

  1. A multicast data sender just starts sending data destined for a multicast group.
  2. The sender’s local router (DR) takes those data packets, unicast-encapsulates them, and sends them directly to the RP.
  3. The RP receives these encapsulated data packets, decapsulates them, and forwards them onto the shared tree.
  4. The packets then follow the (*,G) multicast tree state in the routers on the RP Tree being replicated wherever the RP Tree branches, and eventually reaching all the receivers for that multicast group.
  5. The process of encapsulating data packets to the RP is called registering, and the encapsulation packets are known as PIM Register packets.
  6. At this stage, multicast traffic is flowing encapsulated to the RP, and then natively over the RP tree to the multicast receivers.

Building the shortest Path tree between the RP and the source

  1. Register-encapsulation of data packets is inefficient for two reasons:
    1. Encapsulation and decapsulation may be relatively expensive operations for a router to perform, depending on whether or not the router has appropriate hardware for these tasks.
    2. Traveling all the way to the RP, and then back down the shared tree may result in the packets traveling a relatively long distance to reach receivers that are close to the sender. For some applications, this increased latency or bandwidth consumption is undesirable.
  2. Therefore when the RP receives a register-encapsulated data packet from source S on group G, it will normally initiate an (S,G) source-specific Join towards S.
  3. This Join message travels hop-by-hop towards S, instantiating (S,G) multicast tree state in the routers along the path.
  4. (S,G) multicast tree state is used only to forward packets for group G if those packets come from source S.
  5. Eventually the Join message reaches S’s subnet or a router that already has (S,G) multicast tree state.
  6. Packets from S start to flow following the (S,G) tree state towards the RP.
  7. While the RP is in the process of joining the source-specific tree for S, the data packets will continue being encapsulated to the RP.
  8. When packets from S also start to arrive natively at the RP, the RP will be receiving two copies of each of these packets.
  9. At this point, the RP starts to discard the encapsulated copy of these packets.
  10. The RP then sends a Register-Stop message back to S’s DR to prevent the DR from unnecessarily encapsulating the packets.
  11. At this stage, traffic will be flowing natively from S along a source-specific tree to the RP, and from there along the shared tree to the receivers.
  12. Where the two trees intersect, traffic may transfer from the source-specific tree to the RP tree and thus avoid taking a long detour via the RP.

Building the Shortest-Path Tree between the Source and Receiver

  1. Although having the RP join back towards the source removes the encapsulation overhead, it does not completely optimize the forwarding paths.
  2. For many receivers, the route via the RP may involve a significant detour when compared with the shortest path from the source to the receiver.
  3. To obtain lower latencies or more efficient bandwidth utilization, a router on the receiver’s LAN, typically the DR, may initiate a transfer from the shared tree to a source-specific shortest-path tree (SPT).
  4. To do this, it issues an (S,G) Join towards S.
  5. This instantiates state in the routers along the path to S.
  6. Eventually, this join either reaches S’s subnet or reaches a router that already has (S,G) state.
  7. When this happens, data packets from S start to flow following the (S,G) state until they reach the receiver.
  8. At this point, the receiver (or a router upstream of the receiver) will be receiving two copies of the data: one from the SPT and one from the RPT.
  9. When the first traffic starts to arrive from the SPT, the DR or upstream router starts to drop the packets for G from S that arrive via the RP tree.
  10. In addition, it sends an (S,G) Prune message towards the RP. This is known as an (S,G,rpt) Prune.
  11. The Prune message travels hop-by-hop, instantiating state along the path towards the RP indicating that traffic from S for G should NOT be forwarded in this direction.
  12. The prune is propagated until it reaches the RP or a router that still needs the traffic from S for other receivers.
  13. By now, the receiver will be receiving traffic from S along the shortest-path tree between the receiver and S.
  14. In addition, the RP is receiving the traffic from S, but this traffic is no longer reaching the receiver along the RP tree.
  15. As far as the receiver is concerned, this is the final distribution tree.

The above content was taken from RFC 4601

PIM Packet Formats

The following information was taken from the RFCs below.

  • RFC 4601(Obsoletes RFC 2362) — Protocol Independent Multicast – Sparse Mode (PIM-SM)
  • RFC 5059 — Bootstrap Router (BSR) Mechanism for Protocol Independent Multicast (PIM)

To read the RFCs please visit the links below.

ftp://ftp.rfc-editor.org/in-notes/rfc4601.txt

ftp://ftp.rfc-editor.org/in-notes/rfc5059.txt

The following PIM packet formats are described in RFC 4601 and RFC 5059.

RFC 4601 PIM Packet Formats

  • Hello Message Format
  • Register Message Format
  • Register-Stop Message Format
  • Join/Prune Message Format
  • Assert Message Format

RFC 5059 PIM Packet Formats

  • Bootstrap Message Format
  • Candidate-RP-Advertisement Message Format

PIM Packet Header

All PIM control messages have IP protocol number 103. PIM messages are either unicast (e.g., Registers and Register-Stop) or multicast with TTL 1 to the ‘ALL-PIM-ROUTERS’ group (e.g., Join/Prune, Asserts, etc.). Candidate-RP-Advertisement messages are unicast to a BSR. Usually, Bootstrap messages are multicast with TTL 1 to the ALL-PIM-ROUTERS group, but in some circumstances (described in section 3.5.2 RFC 5059) Bootstrap messages may be unicast to a specific PIM neighbor.

The source address used for unicast messages is a domain-wide reachable address; the source address used for multicast messages is the link-local address of the interface on which the message is being sent.

The IPv4 ‘ALL-PIM-ROUTERS’ group is ‘224.0.0.13’. The IPv6 ‘ALL-PIM-ROUTERS’ group is ‘ff02::d’.

All the PIM Packets have the common header below.

PIM-Common-header

The PIM Version is 2.

The Message type is one of those listed in the table below.

Type	Message	Destination
0	Hello	Multicast to ALL-PIM-ROUTERS
1	Register	Unicast to RP
2	Register-Stop	Unicast to Source of Register Packet
3	Join/Prune	Multicast to ALL-PIM-ROUTERS
4	Bootstrap	Multicast to ALL-PIM-ROUTERS
5	Assert	Multicast to ALL-PIM-ROUTERS
6	Graft (used in PIM-DM only)	Unicast to RPF'(S)
7	Graft-Ack (used in PIM-DM only)	Unicast to Source of Graft Packet
8	Candidate-RP-Advertisement	Unicast to Domain's BSR

The “Reserved” field is set to zero on transmission and ignored upon receipt.

The “Checksum” is a standard IP checksum.
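To make the layout concrete, here is a minimal Python sketch that builds and parses the 4-byte common header (4-bit version, 4-bit type, 8-bit reserved, 16-bit checksum). The function names are my own, and the sketch assumes IPv4 with the checksum computed over the whole PIM message; as I understand RFC 4601, Register messages and IPv6 are handled differently, and neither is covered here.

```python
import struct
from typing import Tuple

PIM_VERSION = 2


def internet_checksum(data: bytes) -> int:
    """Standard IP checksum: one's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_pim_message(msg_type: int, payload: bytes = b"") -> bytes:
    """Prepend the common header; the checksum field is zero while summing."""
    first_byte = (PIM_VERSION << 4) | (msg_type & 0x0F)
    unchecked = struct.pack("!BBH", first_byte, 0, 0) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH", first_byte, 0, checksum) + payload


def parse_pim_header(message: bytes) -> Tuple[int, int, int]:
    """Return (version, message type, checksum) from the first four bytes."""
    first_byte, _reserved, checksum = struct.unpack("!BBH", message[:4])
    return first_byte >> 4, first_byte & 0x0F, checksum


if __name__ == "__main__":
    hello = build_pim_message(0, struct.pack("!HHH", 1, 2, 105))
    print(parse_pim_header(hello))
```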

PIM Hello Packet

The PIM Hello packet contains the PIM common header described above, followed by a series of options, each encoded as an OptionType, OptionLength, and OptionValue triplet. Multiple option triplets can be carried in a single Hello packet.

PIM-hello

The OptionType is one of those listed in the table below.

OptionType	Description
1	Holdtime
2	LAN Prune Delay
3 to 16	Reserved, to be defined in future versions of the RFC
18	Deprecated and should not be used
19	DR Priority
20	Generation ID
24	Address List

Whether all of these options are implemented in IOS will be examined in subsequent posts.
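As a rough sketch of how the option triplets could be walked, the Python below parses the body of a Hello (everything after the common header). It assumes 16-bit OptionType and OptionLength fields, which is how I read RFC 4601; the function name and the sample values are hypothetical.

```python
import struct
from typing import List, Tuple

# Names taken from the option table above.
OPTION_NAMES = {
    1: "Holdtime",
    2: "LAN Prune Delay",
    19: "DR Priority",
    20: "Generation ID",
    24: "Address List",
}


def parse_hello_options(body: bytes) -> List[Tuple[int, str, bytes]]:
    """Return (OptionType, name, raw OptionValue) for each triplet in body."""
    options = []
    offset = 0
    while offset + 4 <= len(body):
        opt_type, opt_len = struct.unpack_from("!HH", body, offset)
        offset += 4
        value = body[offset:offset + opt_len]
        if len(value) != opt_len:
            raise ValueError("truncated Hello option")
        options.append((opt_type, OPTION_NAMES.get(opt_type, "unknown"), value))
        offset += opt_len
    return options


if __name__ == "__main__":
    # Sample body: Holdtime of 105 seconds followed by a DR Priority of 1.
    sample = struct.pack("!HHH", 1, 2, 105) + struct.pack("!HHI", 19, 4, 1)
    for option in parse_hello_options(sample):
        print(option)
```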

Register Message Format

A Register message is sent by the DR to the RP when a multicast packet needs to be transmitted on the RP-tree. The IP source address is set to the address of the DR, the destination address to the RP’s address. The IP TTL of the PIM packet is the system’s normal unicast TTL.

PIM-register

Field	Description
B	The border bit – If the router is a DR for a source that it is directly connected to, it sets the B bit to 0.
N	The Null-Register bit – Set to 1 by a DR that is probing the RP before expiring its local Register-Suppression Timer. Set to 0 otherwise.
Reserved2	Transmitted as zero, ignored on receipt.
Multicast Data Packet	The original packet sent by the source.
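The B and N bits sit at the top of the 32-bit word that follows the common header, with Reserved2 filling the remaining 30 bits. A small Python sketch of packing and unpacking that word, with hypothetical function names, might look like this:

```python
import struct
from typing import Tuple

B_BIT = 0x80000000  # Border bit, most significant bit of the word
N_BIT = 0x40000000  # Null-Register bit, next bit down


def build_register_flags(border: bool, null_register: bool) -> bytes:
    """Pack the B and N bits; Reserved2 is transmitted as zero."""
    word = (B_BIT if border else 0) | (N_BIT if null_register else 0)
    return struct.pack("!I", word)


def parse_register_flags(data: bytes) -> Tuple[bool, bool]:
    """Return (B, N) from the first four bytes; Reserved2 is ignored."""
    (word,) = struct.unpack("!I", data[:4])
    return bool(word & B_BIT), bool(word & N_BIT)


if __name__ == "__main__":
    # A DR probing the RP with a Null-Register for a directly connected source.
    print(build_register_flags(border=False, null_register=True).hex())
```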

The Register-Stop packet format

A Register-Stop is unicast from the RP to the sender of the Register message. The IP source address is the address to which the register was addressed. The IP destination address is the source address of the register message.

PIM-register-stop

Field	Description
Group Address	The group address from the multicast data packet in the original Register message sent to the RP.
Source Address	The host address of the source from the multicast data packet in the original Register message sent to the RP.

Join/Prune Message Format

A Join/Prune message is sent by routers towards upstream sources and RPs. Joins are sent to build shared trees (RP trees) or source trees (SPT). Prunes are sent to prune source trees when members leave groups as well as sources that do not use the shared tree.

PIM-join-prune

Field	Description
Upstream Neighbor Address	The address of the upstream neighbor that is the target of the message. For IPv6 the source address used for multicast messages is the link-local address of the interface on which the message is being sent. For IPv4, the source address is the primary address associated with that interface.
Reserved	Transmitted as zero, ignored on receipt.
Holdtime	The amount of time a receiver must keep the Join/Prune state alive, in seconds.
Number of Groups	The number of multicast group sets contained in the message.
Multicast group address	For format description, see Section 4.9.1 in RFC 4601.
Number of Joined Sources	Number of joined source addresses listed for a given group.
Joined Source Address 1 .. n	This list contains the sources for a given group that the sending router will forward multicast datagrams from if received on the interface on which the Join/Prune message is sent.
Number of Pruned Sources	Number of pruned source addresses listed for a group.
Pruned Source Address 1 .. n	This list contains the sources for a given group that the sending router does not want to forward multicast datagrams from when received on the interface on which the Join/Prune message is sent.

Assert Message Format

The Assert message is used to resolve forwarder conflicts between routers on a link. It is sent when a router receives a multicast data packet on an interface on which the router would normally have forwarded that packet. Assert messages may also be sent in response to an Assert message from another router.

PIM-Assert

Field	Description
Group Address	The group address for which the router wishes to resolve the forwarding conflict.
Source Address	Source address for which the router wishes to resolve the forwarding conflict. The source address MAY be set to zero for (*,G) asserts (see below).
R	RPT-bit is a 1-bit value. The RPT-bit is set to 1 for Assert(*,G) messages and 0 for Assert(S,G) messages.
Metric Preference	Preference value assigned to the unicast routing protocol that provided the route to the multicast source or Rendezvous-Point.
Metric	The unicast routing table metric associated with the route used to reach the multicast source or Rendezvous-Point. The metric is in units applicable to the unicast routing protocol used.
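These fields are what the Assert election compares. As I understand RFC 4601, the comparison runs in order over the RPT-bit, the Metric Preference, and the Metric (lower wins each time), with the higher IP address breaking a tie. The Python below is a sketch of that ordering only; the AssertMetric type and assert_winner function are hypothetical names.

```python
import ipaddress
from typing import NamedTuple


class AssertMetric(NamedTuple):
    rpt_bit: int            # 0 for Assert(S,G), 1 for Assert(*,G)
    metric_preference: int  # preference of the unicast protocol providing the route
    metric: int             # unicast metric towards the source or RP
    address: str            # address of the router sending the Assert


def assert_winner(a: AssertMetric, b: AssertMetric) -> AssertMetric:
    """Lower rpt_bit, then lower preference, then lower metric, then higher address."""
    def sort_key(m: AssertMetric):
        # Negating the address makes min() prefer the numerically higher address.
        return (m.rpt_bit, m.metric_preference, m.metric,
                -int(ipaddress.ip_address(m.address)))
    return min(a, b, key=sort_key)


if __name__ == "__main__":
    r1 = AssertMetric(0, 110, 20, "10.0.0.5")
    r2 = AssertMetric(0, 110, 20, "10.0.0.6")
    print(assert_winner(r1, r2))
```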

IPv4 BGP multicast

IPv4 BGP multicast announcements can be somewhat confusing.  It took a while to get my head around it, so now that I have, I will try to explain.

I have set up the topology below.

Everything has been configured in the above scenario EXCEPT the IPv4 BGP multicast session between RP and ASBR-RP.  I want to show how you can change the multicast path using IPv4 BGP multicast.

Let's start by understanding BGP AS2.

The receiver sends a join for group 224.1.1.1. Net-Edge2 receives the IGMP join and forwards a PIM Join to ASBR-RP.  We can now see a (*,G) entry in the multicast routing table, as shown below.

ASBR-RP#show ip mroute
(*, 224.1.1.1), 00:24:06/00:02:42, RP 4.4.4.4, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:24:06/00:02:42

Pretty straightforward so far.

Now let's look at BGP AS1 from an IPv4 routing perspective.

AS1 runs ISIS as the IGP. IPv4 192.168.1.0/24 and 2.2.2.2/32 prefixes are advertised into ISIS. ASBR1 is the only device which runs BGP in AS1.  ASBR1 then advertises 192.168.1.0/24 and 2.2.2.2/32 to AS2 as IPv4 prefixes.  If we examine the IPv4 routing table on ASBR-RP we can see it has an entry for 192.168.1.0/24 and 2.2.2.2/32 as below.

ASBR-RP#sh ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "bgp 2", distance 20, metric 10
Tag 1, type external
Last update from 10.0.0.1 00:14:09 ago
Routing Descriptor Blocks:
* 10.0.0.1, from 10.0.0.1, 00:14:09 ago
Route metric is 10, traffic share count is 1
AS Hops 1
Route tag 1

ASBR-RP#sh ip route 2.2.2.2
Routing entry for 2.2.2.2/32
Known via "bgp 2", distance 20, metric 20
Tag 1, type external
Last update from 10.0.0.1 00:05:20 ago
Routing Descriptor Blocks:
* 10.0.0.1, from 10.0.0.1, 00:05:20 ago
Route metric is 20, traffic share count is 1
AS Hops 1
Route tag 1

Nothing majorly complicated here.

We now set up an MSDP session between RP and ASBR-RP, using the basic config below on RP and ASBR-RP respectively.

RP

RP#sh run | in msdp
ip msdp peer 10.0.0.6 connect-source FastEthernet1/1 remote-as 2
ip msdp cache-sa-state

ASBR-RP

ip msdp peer 10.0.0.5 connect-source FastEthernet1/0 remote-as 1
ip msdp cache-sa-state

No rocket science here.

Now let's kick off some multicast traffic from the source, 192.168.1.1, destined to 224.1.1.1.

We should see an entry in the multicast routing table of RP in AS1.

RP#show ip mroute 224.1.1.1
(*, 224.1.1.1), 00:00:06/stopped, RP 2.2.2.2, flags: SP
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list: Null

(192.168.1.1, 224.1.1.1), 00:00:06/00:02:53, flags: PA
Incoming interface: FastEthernet1/0, RPF nbr 11.0.0.1
Outgoing interface list: Null

Everything good so far.

MSDP should now advertise this source to its peer ASBR-RP in AS 2.  Let's check to see if this has happened.

ASBR-RP#show ip msdp sa-cache
MSDP Source-Active Cache - 1 entries
(192.168.1.1, 224.1.1.1), RP 2.2.2.2, BGP/AS 0, 00:01:50/00:05:29, Peer 10.0.0.5

Yep, we can see it.

Now let's check whether we can see the entry in the mcast table on ASBR-RP.

ASBR-RP#sh ip mroute 224.1.1.1
(*, 224.1.1.1), 00:12:56/00:03:11, RP 4.4.4.4, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:12:07/00:03:11

(192.168.1.1, 224.1.1.1), 00:12:56/00:03:28, flags: T
Incoming interface: FastEthernet1/1, RPF nbr 10.0.0.1
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:12:07/00:03:11

Yep, we can see it, but notice that the incoming interface is shown as FastEthernet1/1 and the RPF neighbor is 10.0.0.1.  This is because the PIM Join message was sent using the IPv4 unicast routing table and as such was routed via the lower link in the diagram.  This is not what we were trying to achieve: we wanted the SPT to flow over the upper link, the one on which we configured MSDP.  To fix this we simply set up an IPv4 BGP multicast session on RP as follows.

router bgp 1
no bgp default ipv4-unicast
bgp log-neighbor-changes
neighbor 10.0.0.6 remote-as 2
!
address-family ipv4 multicast
neighbor 10.0.0.6 activate
no auto-summary
network 192.168.1.0
exit-address-family

Check on ASBR-RP that the multicast IPv4 prefix is being learnt.

ASBR-RP#sh ip bgp ipv4 multicast
Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.1.0      10.0.0.5                10             0 1 i
ASBR-RP#

It's also worth noting that this prefix does not enter the unicast routing table, as you can see below.

ASBR-RP#sh ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "bgp 2", distance 20, metric 20
Tag 1, type external
Last update from 10.0.0.1 00:04:22 ago
Routing Descriptor Blocks:
* 10.0.0.1, from 10.0.0.1, 00:04:22 ago
Route metric is 20, traffic share count is 1
AS Hops 1
Route tag 1

Now if we check the multicast routing table on ASBR-RP we will see the following.

ASBR-RP#sh ip mroute 224.1.1.1
(*, 224.1.1.1), 00:25:48/00:03:06, RP 4.4.4.4, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:24:59/00:03:06

(192.168.1.1, 224.1.1.1), 00:25:48/00:03:27, flags: T
Incoming interface: FastEthernet1/0, RPF nbr 10.0.0.5, Mbgp
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:24:59/00:03:06

As you can see the incoming interface is now the upper link in the diagram and the RPF neighbor has changed to 10.0.0.5.

Therefore we can conclude the following: the IPv4 BGP multicast entries are used to send PIM Joins as well as to conduct RPF checks on incoming multicast data traffic.  To further prove that the IPv4 BGP multicast entries are used to forward PIM Join messages, we could remove the IPv4 prefixes which are being advertised over the lower link and see if we still have an SPT to ASBR-RP.

ASBR-RP#sh ip mroute 224.1.1.1
(*, 224.1.1.1), 00:01:19/00:03:26, RP 4.4.4.4, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:01:03/00:03:26

(192.168.1.1, 224.1.1.1), 00:01:19/00:03:22, flags: T
Incoming interface: FastEthernet1/0, RPF nbr 10.0.0.5, Mbgp
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:01:03/00:03:25

Yep, it's still there.
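If it helps, the RPF behaviour seen in the last few outputs can be modelled with a short Python sketch. It is a deliberate simplification: it assumes a route in the multicast (MBGP) table is always preferred when one exists and falls back to the unicast table otherwise, whereas a real router also weighs things like administrative distance. The function names and the dictionary-based tables are made up; the prefixes, neighbors, and interfaces come from the outputs above.

```python
import ipaddress
from typing import Dict, Optional, Tuple

Route = Tuple[str, str]  # (RPF neighbor, incoming interface)


def longest_match(rib: Dict[str, Route], address: str) -> Optional[Route]:
    """Return the route for the most specific prefix covering address, if any."""
    addr = ipaddress.ip_address(address)
    best, best_len = None, -1
    for prefix, route in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best, best_len = route, net.prefixlen
    return best


def rpf_lookup(source: str, multicast_rib: Dict[str, Route],
               unicast_rib: Dict[str, Route]) -> Optional[Tuple[str, str, str]]:
    """Assumption: the multicast table wins whenever it has a matching route."""
    route = longest_match(multicast_rib, source)
    if route is not None:
        return route + ("Mbgp",)
    route = longest_match(unicast_rib, source)
    if route is not None:
        return route + ("unicast",)
    return None  # no route at all, so no RPF neighbor and no SPT


if __name__ == "__main__":
    unicast = {"192.168.1.0/24": ("10.0.0.1", "FastEthernet1/1")}
    multicast = {"192.168.1.0/24": ("10.0.0.5", "FastEthernet1/0")}
    print(rpf_lookup("192.168.1.1", {}, unicast))         # before MBGP
    print(rpf_lookup("192.168.1.1", multicast, unicast))  # after MBGP
    print(rpf_lookup("192.168.1.1", multicast, {}))       # unicast withdrawn
```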

However, now that we have withdrawn the IPv4 prefixes from AS2, Net-Edge2 will no longer be able to build an SPT to the source as it has no route to 192.168.1.1; instead it will have to use the shared tree, as you can see below.

net-edge2#sh ip mroute 224.1.1.1
(*, 224.1.1.1), 02:42:16/00:02:58, RP 4.4.4.4, flags: SJCL
Incoming interface: FastEthernet1/0, RPF nbr 12.0.0.1
Outgoing interface list:
Loopback0, Forward/Sparse, 02:42:16/00:02:17

(192.168.1.1, 224.1.1.1), 00:02:51/00:00:08, flags: LJ
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Loopback0, Forward/Sparse, 00:02:51/00:02:17

If we reinject the IPv4 prefix 192.168.1.0/24 using BGP on ASBR1 and then reexamine the multicast entry for 224.1.1.1 on net-edge2 we see the following.

net-edge2#sh ip mroute 224.1.1.1
(*, 224.1.1.1), 00:00:09/stopped, RP 4.4.4.4, flags: SJCL
Incoming interface: FastEthernet1/0, RPF nbr 12.0.0.1
Outgoing interface list:
Loopback0, Forward/Sparse, 00:00:09/00:02:50

(192.168.1.1, 224.1.1.1), 00:00:00/00:02:59, flags: LJT
Incoming interface: FastEthernet1/0, RPF nbr 12.0.0.1
Outgoing interface list:
Loopback0, Forward/Sparse, 00:00:00/00:02:59

As you can see, net-edge2 is now using the (S,G) tree. MAGIC :-)

Anyway, I hope that makes sense.

Multicast – PIM Sparse Mode

Consider the topology below.

Once your IGP and infrastructure links are all configured, use the following steps to enable multicast sparse mode.

Step 1 – Enable multicast globally on all routers

ip multicast-routing

Step 2 – Enable PIM sparse mode on all interfaces

interface Serial1/0.1 point-to-point
ip pim sparse-mode

Step 3 – Configure the RP address on all devices

ip pim rp-address 2.2.2.2

When you enable PIM sparse mode on an interface, you will see the following.

R2#sh ip pim interface

Address          Interface                Ver/   Nbr    Query  DR     DR
                                          Mode   Count  Intvl  Prior
10.0.0.2         Serial1/0.1              v2/S   1      30     1      0.0.0.0
10.0.0.5         FastEthernet2/0          v2/S   0      30     1      10.0.0.5
R2#

Notice that FastEthernet2/0 has 0 neighbors. This is because we haven't yet enabled PIM sparse mode on R3.

Let's enable PIM sparse mode on R3 and see what happens.

R3(config)#interface FastEthernet1/0
R3(config-if)#ip pim sparse-mode
R3(config-if)#
00:34:46: %PIM-5-DRCHG: DR change from neighbor 0.0.0.0 to 10.0.0.6 on interface FastEthernet1/0 (vrf default)
R3(config-if)#end

Now if we go back to R2 and run the sh ip pim interface command, we should be able to see the neighbor.

R2#sh ip pim interface

Address          Interface                Ver/   Nbr    Query  DR     DR
                                          Mode   Count  Intvl  Prior
10.0.0.2         Serial1/0.1              v2/S   1      30     1      0.0.0.0
10.0.0.5         FastEthernet2/0          v2/S   1      30     1      10.0.0.6
R2#
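
The DR column changing from 10.0.0.5 to 10.0.0.6 is consistent with the usual PIM DR election rule: the highest DR priority wins and the highest IP address breaks a tie. Here is a minimal Python sketch of that rule. The function name and the (address, priority) tuples are illustrative assumptions, not anything taken from the routers.

```python
import ipaddress

def elect_dr(candidates):
    """Pick the PIM DR from (address, priority) pairs on one segment.

    Assumes every router advertises a DR priority: the highest
    priority wins and the highest IP address breaks ties.
    """
    winner = max(
        candidates,
        key=lambda c: (c[1], int(ipaddress.ip_address(c[0]))),
    )
    return winner[0]

# R2 alone on the segment, then R2 and R3 together (both priority 1).
before = elect_dr([("10.0.0.5", 1)])
after = elect_dr([("10.0.0.5", 1), ("10.0.0.6", 1)])
print(before, after)
```

With equal priorities the higher address wins, which is why R3 (10.0.0.6) takes over as DR once PIM is enabled on it.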

If we then simulate an IGMP join on R3 using the commands below and enable debug ip igmp, we can see R3 send a Report for 224.1.1.1.

interface FastEthernet0/0
ip address 20.20.20.20 255.255.255.0
ip pim sparse-mode
ip igmp join-group 224.1.1.1

00:46:08: IGMP(0): Send v2 Report for 224.1.1.1 on FastEthernet0/0
00:46:08: IGMP(0): Received v2 Report on FastEthernet0/0 from 20.20.20.20 for 224.1.1.1
00:46:08: IGMP(0): Received Group record for group 224.1.1.1, mode 2 from 20.20.20.20 for 0 sources
00:46:08: IGMP(0): Updating EXCLUDE group timer for 224.1.1.1
00:46:08: IGMP(0): MRT Add/Update FastEthernet0/0 for (*,224.1.1.1) by 0
R3(config-if)#end

Now let's check the mroute table on R3.

R3#sh ip mroute
IP Multicast Routing Table
(*, 224.1.1.1), 00:02:41/00:02:29, RP 2.2.2.2, flags: SJCL
Incoming interface: FastEthernet1/0, RPF nbr 10.0.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:02:41/00:02:29

The key things worth noting here are: the (*,G) entry is using the RP 2.2.2.2; R3 expects to receive the multicast traffic on interface FastEthernet1/0; the RPF neighbor is 10.0.0.5; and the outgoing interface, i.e. where the IGMP join was seen, is FastEthernet0/0.

Now let us simulate a source for this multicast group on R1.
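
If you want to pull those same fields out of the output programmatically, something like the following Python sketch could work. It is only an illustration: the regular expressions were written against the sample output shown in this post, and the function and field names are made up for the example.

```python
import re

# Patterns fitted to the "sh ip mroute" samples in this post only.
ENTRY_RE = re.compile(
    r"\((?P<source>[^,]+),\s*(?P<group>[^)]+)\),\s*"
    r"(?P<uptime>[^/]+)/(?P<expires>[^,]+),"
    r"(?:\s*RP (?P<rp>[\d.]+),)?\s*flags:\s*(?P<flags>\w*)"
)
IIF_RE = re.compile(
    r"Incoming interface:\s*(?P<iif>[^,]+),\s*RPF nbr (?P<rpf_nbr>[\d.]+)"
)
OIF_RE = re.compile(r"(?P<oif>[A-Za-z][\w/.]*), Forward/")

def parse_mroute_entry(text):
    """Parse one mroute entry into a dict (hypothetical helper)."""
    entry = {"oil": []}
    for raw in text.splitlines():
        line = raw.strip()
        for regex in (ENTRY_RE, IIF_RE):
            m = regex.match(line)
            if m:
                entry.update(m.groupdict())
        m = OIF_RE.match(line)
        if m:
            entry["oil"].append(m.group("oif"))
    return entry

sample = """\
(*, 224.1.1.1), 00:02:41/00:02:29, RP 2.2.2.2, flags: SJCL
Incoming interface: FastEthernet1/0, RPF nbr 10.0.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:02:41/00:02:29
"""
entry = parse_mroute_entry(sample)
print(entry["rp"], entry["iif"], entry["rpf_nbr"], entry["oil"])
```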

R1#ping
Protocol [ip]:
Target IP address: 224.1.1.1
Repeat count [1]: 10000
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Interface [All]: FastEthernet2/1
Time to live [255]:
Source address:
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 10000, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:

If we turn on debugging on R2 using debug ip pim, we see the following.

00:57:45: PIM(0): Received v2 Register on Serial1/0.1 from 10.0.0.1
00:57:45:      for 10.10.10.10, group 224.1.1.1
00:57:45: PIM(0): RPF lookup failed to source 10.10.10.10
00:57:45: PIM(0): Send v2 Register-Stop to 10.0.0.1 for 10.10.10.10, group 224.1.1.1

As you can see, the RPF lookup failed. This is because I forgot to advertise 10.10.10.10 into IS-IS.

Let's see what happens once we advertise 10.10.10.10 into IS-IS.
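
To see why a missing route breaks things, here is a rough Python sketch of an RPF lookup as a longest-prefix match on the source address. The route tuples and prefix lengths (/30 for the links, /24 for the source network) are assumptions made for the example, since the post does not show R2's routing table.

```python
import ipaddress

def rpf_lookup(routes, source):
    """Return (interface, rpf_neighbor) for the longest matching prefix.

    routes is a list of (prefix, interface, neighbor) tuples.
    Returns None when no route covers the source, i.e. RPF fails.
    """
    src = ipaddress.ip_address(source)
    best = None
    for prefix, iface, nbr in routes:
        net = ipaddress.ip_network(prefix)
        if src in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface, nbr)
    return None if best is None else (best[1], best[2])

# Assumed view of R2 before the source network is in the IGP.
routes = [
    ("10.0.0.0/30", "Serial1/0.1", "0.0.0.0"),
    ("10.0.0.4/30", "FastEthernet2/0", "0.0.0.0"),
]
failed = rpf_lookup(routes, "10.10.10.10")

# Assumed view once the source network is advertised via R1.
routes.append(("10.10.10.0/24", "Serial1/0.1", "10.0.0.1"))
ok = rpf_lookup(routes, "10.10.10.10")
print(failed, ok)
```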

00:58:50: PIM(0): Received v2 Register on Serial1/0.1 from 10.0.0.1
00:58:50: PIM(0): Send v2 Register-Stop to 10.0.0.1 for 0.0.0.0, group 0.0.0.0

The source has registered successfully with the RP. However, there are no receivers at the moment, so a Register-Stop is generated.
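
The two Register outcomes seen so far can be summarised in a small Python sketch of the RP's decision. This is a simplified model for illustration only. The function and action names are invented, and the third branch (receivers present) reflects general PIM-SM behaviour rather than anything shown in the debugs above.

```python
def rp_handle_register(rpf_ok, has_receivers):
    """Return the actions a simplified RP takes for a PIM Register."""
    if not rpf_ok:
        # No route back to the source: reject straight away.
        return ["register-stop"]
    if not has_receivers:
        # Source is known, but nobody has joined the group yet.
        return ["create-sg-state", "register-stop"]
    # Receivers exist: forward the data and pull traffic natively.
    return ["create-sg-state", "forward-on-shared-tree", "join-toward-source"]

print(rp_handle_register(rpf_ok=False, has_receivers=False))
print(rp_handle_register(rpf_ok=True, has_receivers=False))
print(rp_handle_register(rpf_ok=True, has_receivers=True))
```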

If we examine the multicast routing table on R2 we see the following.

R2#sh ip mroute
(*, 224.1.1.1), 00:01:37/stopped, RP 2.2.2.2, flags: SP
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list: Null

(10.10.10.10, 224.1.1.1), 00:01:37/00:01:23, flags: P
Incoming interface: Serial1/0.1, RPF nbr 10.0.0.1
Outgoing interface list: Null

As you can see, there is an (S,G) entry for the group, but the outgoing interface list (OIL) is Null.

Now let's re-simulate the IGMP join on R3 and see the change in the multicast routing table.

R2#sh ip mroute
(*, 224.1.1.1), 00:05:20/00:02:50, RP 2.2.2.2, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:00:40/00:02:50

(10.10.10.10, 224.1.1.1), 00:02:18/00:00:41, flags:
Incoming interface: Serial1/0.1, RPF nbr 10.0.0.1
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:00:40/00:02:49

As you can see, the OIL now contains FastEthernet2/0, which is the interface connected to R3.

Also, if we check the mroute table on R3, we see the following.

R3#sh ip mroute
(*, 224.1.1.1), 00:03:41/00:00:44, RP 2.2.2.2, flags: SJCL
Incoming interface: FastEthernet1/0, RPF nbr 10.0.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:03:39/00:02:08

(10.10.10.10, 224.1.1.1), 00:02:16/00:00:44, flags: LJ
Incoming interface: Null, RPF nbr 10.0.0.10
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:02:16/00:02:08

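As you can see, the multicast traffic arrives on FastEthernet1/0 via the shared tree. The (S,G) entry has a Null incoming interface because the RPF neighbor toward the source, 10.0.0.10, is not yet a PIM neighbor.

Now let's bring up a PIM neighborship between R1 and R3.

If we check the mroute table on R3, we can see that R3 has joined the source tree.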

R3#sh ip mroute
(*, 224.1.1.1), 00:09:49/00:00:39, RP 2.2.2.2, flags: SJCL
Incoming interface: FastEthernet1/0, RPF nbr 10.0.0.5
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:09:48/00:02:58

(10.10.10.10, 224.1.1.1), 00:02:21/00:00:39, flags: LJ
Incoming interface: FastEthernet1/1, RPF nbr 10.0.0.10
Outgoing interface list:
FastEthernet0/0, Forward/Sparse, 00:02:21/00:02:58

If we look at R1, we can see the source of the tree.

R1#sh ip mroute
(*, 224.1.1.1), 00:18:01/stopped, RP 2.2.2.2, flags: SPF
Incoming interface: Serial1/0.1, RPF nbr 10.0.0.2
Outgoing interface list: Null

(10.10.10.10, 224.1.1.1), 00:18:01/00:03:25, flags: FT
Incoming interface: FastEthernet2/1, RPF nbr 0.0.0.0
Outgoing interface list:
FastEthernet2/0, Forward/Sparse, 00:04:36/00:02:38
Serial1/0.1, Forward/Sparse, 00:10:18/00:03:26