Anycast Segments in MPLS based Segment Routing
Individual Contributor, Bangalore, KA 560103, India, pushpasis.ietf@gmail.com
Individual Contributor, hannes@gredler.at
Cisco Systems, Inc., Brussels, BE, cfilsfil@cisco.com
Cisco Systems, Inc., Via Del Serafico, 200, Rome 00142, Italy, sprevidi@cisco.com
Orange, bruno.decraene@orange.com
Deutsche Telekom, Hammer Str. 216-226, Muenster 48153, DE, Martin.Horneffer@telekom.de

SPRING Working Group
SPRING, Segment Routing, Anycast Segments

Instead of forwarding to a specific device or to all devices in a group,
anycast addresses let network devices forward a packet to (or steer it
through) one or more topologically nearest devices in a specific group of
network devices. The use of anycast addresses has been extended to
Segment Routing (SR) networks, wherein a group of SR-capable devices can
represent an anycast address by having the same Segment Routing Global Block
(SRGB) provisioned on all the devices, with each of them advertising the
same anycast prefix segment (or Anycast SID).

This document describes a proposal for implementing anycast prefix
segments in an MPLS-based SR network without the need to have the same SRGB
(label ranges) provisioned across all the member devices in the group.
Each node can be provisioned with a separate SRGB from the label range
supported by the specific hardware platform.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

Anycast is a network addressing scheme and routing methodology in which
packets from a single source device are forwarded to the topologically nearest
node in a group of potential receiving devices, all identified by the same
anycast address. There are various useful use cases for anycast addresses, and
discussion of them is outside the scope of this document.

The use of anycast addresses has been extended to SR networks. An operator may
combine a group of SR-enabled nodes to form an anycast group by picking an
anycast address and a segment identifier (hereafter referred to as SID) to
represent the group, and then provisioning all the nodes with the same address
and SID. Once provisioned, each device in the group advertises the corresponding
anycast address in its IGP link-state advertisements along with the provisioned
SID. Source devices, on receiving such anycast prefix segment advertisements,
find the topologically nearest device that originated the anycast segment
and forward packets destined to it on the shortest path to that nearest
device.

This scheme requires all devices
in a given anycast group to implement exactly the same SRGB block(s). This
requirement will always be met in SR networks deployed over the IPv6 forwarding
plane. For SR over the MPLS dataplane,
while this is a simple (and hence more desirable) solution, it may not
be possible in multi-vendor networks deploying devices with varying hardware
capabilities.

In MPLS-based SR deployments, the segments on a given source router
are actually mapped to MPLS labels allocated from the local label pool
carved out by the device for accommodating the SRGB. In multi-vendor
deployments with various types of devices deployed in the same network
topology, an anycast group may contain a mix of devices
from different vendors with different internal hardware capabilities.
In such environments it cannot be assumed that all the devices in
an anycast group will be able to allocate exactly the same range of labels
for implementing the SRGB. In practice, getting a common range of labels
across all the various vendors may not be feasible.

This document provides mechanisms to implement anycast segments with
any kind of device in a multi-vendor network deployment, without requiring
the same exact range of labels to be provisioned for the SRGB on all the devices.

To better illustrate the problem let us consider an example topology
using anycast segments, as shown in the figure
below.

In the figure above, there are two groups of transit
devices. Group A consists of devices {A1, A2, A3 and A4}. They are all provisioned
with the anycast address 192.1.1.1/32 and the anycast SID 100. Similarly, group B
consists of devices {B1, B2, B3 and B4}, all provisioned with the anycast
address 192.1.1.2/32 and the anycast SID 200. In the above network topology, each PE
device is connected to two routers in each of the groups A and B.

The following are all the possible ECMP paths between the various pairs of PE
devices.

   P1: via {R1, A1, A3, R3}
   P2: via {R1, A1, A4, R3}
   P3: via {R1, A2, A3, R3}
   P4: via {R1, A2, A4, R3}
   P5: via {R2, B1, B3, R4}
   P6: via {R2, B1, B4, R4}
   P7: via {R2, B2, B3, R4}
   P8: via {R2, B2, B4, R4}

As seen above, there are always eight ECMP paths between each pair of
PE devices. The network operator may not wish to utilize all possible ECMP paths
for all types of traffic flowing between a given pair of PE devices. It
may be more useful to use paths P1, P2, P3 and P4 for certain types of traffic
and paths P5, P6, P7 and P8 for all other types of traffic between the same
PE devices. If so desired, operators may use the anycast groups A and B and the
corresponding anycast segments to impose a segment list that forwards the
respective traffic flows over the desired specific paths, as shown below.

The figure below depicts an expanded view of the paths
via group A. The label ranges allocated for the SRGB on each of the devices in
group A are also shown in this diagram.

In the above topology, if device PE1 (or PE2) needs to send a packet
to device PE3 (or PE4), it needs to encapsulate the packet in an MPLS
payload with the following stack of labels.

   Label allocated by R1 for anycast SID 100 (outer label)
   Label allocated by the nearest router in group A for SID 30 (for destination PE3)

While the first label is easy to compute, the second is not. In this case there
is more than one topologically nearest device (A1 and A2), and unless A1 and A2
implement exactly the same SRGB, determining the second label is impossible. In
all likelihood, A1 and A2 may be devices from different hardware vendors and may
not implement the same SRGB label ranges. In such cases, separate labels are
allocated by A1 and A2 (1030 and 2030 respectively, in the above example).
Hence, PE1 (or PE2) cannot compute an appropriate label stack to steer the
packet exclusively through the group A devices. The same holds true for devices
PE3 and PE4 when trying to send a packet to PE1 or PE2.

This document introduces the term 'Common-Anycast SRGB' (hereafter referred
to as the CA-SRGB) to define the SRGB implemented by the majority of the devices
in the network that participate in one or more anycast segments. Each
device MUST implement provisions to let operators assign the CA-SRGB on the
device. Each vendor implementation MUST allow the CA-SRGB to be configured at
every configuration level (per-routing-instance, per-protocol, per-topology,
etc.) at which the local SRGB label ranges can be configured. Essentially, for
each SRGB configured on the device, vendor implementations MUST allow
configuring a corresponding CA-SRGB value.

For each configuration level (per-routing-instance, per-protocol, per-topology,
etc.) supported, the operator MUST set exactly the same CA-SRGB on all devices
across the entire IGP domain (including different IS-IS levels and OSPF areas).
This ensures that the proposal specified in this document works consistently
across all devices in any multi-vendor network deployment.

However, assigning the CA-SRGB (for a given routing-instance/protocol/topology
etc.) on the device does not mean that the label ranges allocated by the device
for the corresponding SRGB have to belong to the CA-SRGB defined. The device may
dynamically allocate the corresponding SRGB label ranges, or allocate the range
provisioned by the operator through an appropriate separate configuration.

For devices whose local SRGB is exactly the same as the CA-SRGB applicable
for the entire network, operators need not explicitly set the corresponding
CA-SRGB values. In such cases, vendor implementations MUST assume the local SRGB
values to be the corresponding CA-SRGB values on the specific device.

If the CA-SRGB defined on a device does not exactly match the corresponding
SRGB label ranges allocated (or provisioned) on the same device (i.e. the CA-SRGB
is not an exact copy of the corresponding SRGB label ranges), and the device is
provisioned with one or more anycast prefix segments, the device MUST implement
all the additional functionalities specified in this document for the Virtual
LFIB, for advertising anycast prefix segments and for forwarding on anycast
segment labels. On devices where the SRGB label ranges are an exact copy of the
corresponding CA-SRGB defined, the device need not implement these additional
functionalities.

For each anycast prefix segment, this document also defines a 'Common Anycast
Prefix Segment Label' (hereafter referred to as CAPSL). The value of this label is
derived by applying the SID index associated with the prefix segment as an offset
to the CA-SRGB configured on the specific device. Since the operator MUST configure
the same CA-SRGB values on all devices in the IGP domain, all devices associate
the same CAPSL value with a given anycast prefix segment.
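
As a hedged illustration of the derivation above, the following Python sketch
computes a CAPSL by adding the SID index to the first label of the CA-SRGB. The
function name `capsl_label`, the representation of the CA-SRGB as a single
(first, last) label pair, the bounds check and the range 2000-3000 are
assumptions made for this example only; the document does not prescribe them.

```python
def capsl_label(ca_srgb, sid_index):
    """Return the CAPSL for sid_index.

    ca_srgb is assumed to be one contiguous (first, last) label range.
    """
    first, last = ca_srgb
    label = first + sid_index
    if sid_index < 0 or label > last:
        raise ValueError(
            f"SID index {sid_index} does not fit in CA-SRGB {first}-{last}"
        )
    return label


# Assumed domain-wide CA-SRGB, configured identically on every device.
CA_SRGB = (2000, 3000)

# Because the CA-SRGB is the same everywhere, every device derives the
# same CAPSL for a given SID index, whatever its local SRGB is.
for sid in (10, 20, 30, 40, 100):
    print(sid, capsl_label(CA_SRGB, sid))
```

Since the result depends only on the CA-SRGB and the SID index, an ingress
device can derive it without knowing which member of an anycast group will
receive the packet.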
The table below shows the CAPSL labels derived by any
device for the various prefix segments found in the example topology, with
CA-SRGB set to 2000-3000 on all devices.

   SID    CA-SRGB Range    CAPSL Value
   10     2000-3000        2010
   20     2000-3000        2020
   30     2000-3000        2030
   40     2000-3000        2040
   100    2000-3000        2100

This document also introduces the term 'Anycast Prefix Segment Label' (hereafter
referred to as APSL) to define the label allocated by a device to advertise
reachability for a specific anycast prefix segment. The value of this label is
derived by applying the SID index associated with the anycast prefix segment as
an offset to the SRGB of the specific device. The table below shows the labels
allocated by the various devices in the example topology for the anycast prefix
segment with SID 100.

   Anycast-SID    Device    SRGB         APSL-Label
   100            R1        7000-8000    7100
   100            A1        1000-2000    1100
   100            A2        2000-3000    2100
   100            A3        3000-4000    3100
   100            A4        4000-5000    4100
   100            R3        6000-7000    6100

An MPLS device that tries to encapsulate any kind of traffic into an
SR-based MPLS payload (hereafter referred to as the ingress device)
and steer it through a series of SR adjacency and/or unicast/anycast
prefix segments needs to compute an appropriate stack of MPLS labels and
put it in the outgoing packet. Alternatively, in an SDN environment, the
SDN controller may need to compute the label stack and install it on the
ingress device. However, in both cases, as illustrated in the example topology,
for a given ingress device (e.g. PE1 or PE2) there may be multiple topologically
nearest devices in a specific anycast group (e.g. A1 and A2), even though
there is only one outgoing link from the source device (e.g. PE1->R1 or PE2->R1).
In such a case, when the ingress device (or the SDN controller) wants to steer
a packet through the anycast group A, it can use the anycast segment label
advertised by the downstream neighbor of the ingress device for the specific
anycast prefix segment. Since the packet may reach any one of the multiple
devices in the group and each of them may have a separate SRGB label range,
choosing the MPLS label for the next segment providing reachability to the
final destination is not straightforward. Also, since a packet steered through
an anycast segment can reach any of the member devices in the anycast group,
the ingress (or the controller) cannot place an adjacency
segment immediately after an anycast segment in the outgoing packet.

This document proposes that the ingress device (or the SDN controller)
derive the label for a prefix segment that immediately follows a given anycast
segment as the CAPSL label associated with the corresponding SID index.
Note that the prefix segment
immediately following the given anycast segment may itself be another anycast
segment.

The ingress (or the SDN controller) MUST follow the algorithm below to compute
the label stack that it must use to steer a packet through a list of SR segments.
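
As a non-normative aid to reading the steps of this algorithm, the rule can be
sketched in Python as follows. The `Segment` class, its field names, the string
values used for the segment type and the `ca_srgb_base` parameter are all
hypothetical names introduced for this sketch. The label 7030 is likewise an
assumption, obtained by applying SID 30 to R1's SRGB (7000-8000).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    kind: str                        # "adjacency", "prefix" or "anycast"
    label: int                       # label the ingress would normally push
    sid_index: Optional[int] = None  # SID index; None for adjacency segments


def compute_label_stack(segments: List[Segment], ca_srgb_base: int) -> List[int]:
    """Build the label stack, outermost label first."""
    label_stack: List[int] = []
    previous: Optional[Segment] = None
    for segment in segments:
        if segment.kind == "adjacency":
            label = segment.label
        elif previous is not None and previous.kind == "anycast":
            # A prefix segment right after an anycast segment uses the CAPSL,
            # since the receiving group member is not known in advance.
            label = ca_srgb_base + segment.sid_index
        else:
            label = segment.label
        label_stack.append(label)
        previous = segment
    return label_stack


# PE1 steering a packet to PE3 (SID 30) through anycast group A (SID 100).
# 7100 is R1's label for SID 100; 7030 is assumed to be R1's label for SID 30.
segments = [Segment("anycast", 7100, 100), Segment("prefix", 7030, 30)]
print(compute_label_stack(segments, ca_srgb_base=2000))  # [7100, 2030]
```

In this example the second label is taken from the CA-SRGB (2030) rather than
from any single device's SRGB, which is what lets either A1 or A2 process the
packet.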

   Set previous_segment ==> NONE.
   Set label_stack ==> {EMPTY}.
   For [all segment in Segment_List]
      If {segment.type == Adjacency_Segment}
         Set label ==> segment.Adjacency_Segment_Label.
      Else
         If {previous_segment.type == Anycast_Prefix_Segment}
            Set label ==> CAPSL_Label(segment.SID_index).
         Else
            Set label ==> segment.Prefix_Segment_Label.
      Add label to label_stack.
      Set previous_segment ==> segment.

When an MPLS packet on the wire first hits a device, the forwarding hardware
reads the topmost label in the MPLS header and looks it up in the default label
lookup table associated with the interface on which the packet has been
received. This table is generally called the LFIB. The range of labels found in
the LFIB constitutes the default label space.

This document introduces a separate virtual label lookup table (hereafter
referred to as Virtual LFIB or V-LFIB) that represents a label space
separate from the actual label space represented by the default LFIB. The same
label value may be present in both the default and Virtual LFIB. However, the
forwarding semantics associated with the label under the default and Virtual
LFIB may not be the same. The following are the fields of a typical entry of
this table.

   CAPSL-Label: The CAPSL label value derived from the SID index associated
   with a given prefix segment originated by another device in the same network.
   This is also the key field for this table.

   Forwarding Semantics: One or more tuples of the following items.

      Outgoing-Label: The label(s) allocated by the neighbor device(s) on the
      shortest path to the topologically nearest originator(s) of the prefix
      segment.

      Outgoing-link: The link(s) connecting the device to the neighbor device(s)
      on the shortest path to the topologically nearest originator(s) of the
      prefix segment.

This document proposes that any device, when provisioned with one or more
anycast prefix segments (address and SID), MUST create a Virtual LFIB table if
the CA-SRGB defined by the operator is not an exact copy of the corresponding
SRGB label ranges allocated by the device.

Such a device MUST add an entry in the Virtual LFIB for each unicast and anycast
prefix segment learnt from a remote device, if and only if the same prefix has
not been provisioned on the device. The device SHOULD NOT add an entry for any
of the Anycast or Node prefix segments that it has advertised itself. However, if
the device has learnt an anycast prefix segment from a remote device and the
same is not provisioned on this device, the device MUST include it in the
Virtual LFIB table.

In cases where a prefix segment is reachable via multiple shortest paths on a
given device, the corresponding entry for the prefix SID MUST have as many
forwarding tuples in the Virtual LFIB table as the number of shortest paths
found for the corresponding prefix on the device.

The example below shows how the Virtual LFIB table on each of the
devices in group A should look. Please note that some of the prefix segments
have multiple forwarding tuples associated with them. For example, on device A1,
the prefix SID 30 (originated by PE3) is reachable through its neighbors A3 and A4.
As per the SRGBs advertised by A3 and A4, the labels allocated by A3 and A4 are
3030 and 4030 respectively. Hence A1 has added two forwarding entries for the
prefix SID 30 in its Virtual LFIB table.

Please note that node A2 has not created a Virtual LFIB table, since the
CA-SRGB (2000-3000) is identical to the SRGB provisioned on it.

Also please note that none of the devices in the anycast group have included
the anycast SID 100 in the Virtual LFIB table, since it has already been
provisioned on these devices.

When a device receives an MPLS packet with the anycast segment label associated
with one of the anycast prefix segments provisioned on the same device, and the
CA-SRGB defined by the operator is not an exact copy of the corresponding SRGB label
ranges allocated by it, it MUST use the Virtual LFIB table to look up the next label
that follows the anycast segment label in the stack of labels found in the MPLS header.

The following forwarding instructions MUST be installed in the MPLS data plane for
each entry in the Virtual LFIB table.

   If the label at the top of the stack matches any of the prefix SIDs in the
   Virtual LFIB table,
      If there are multiple forwarding tuples associated with the matching table
      entry,
         Select one forwarding tuple. (Criteria to select one are outside the
         scope of this document.)
      Else,
         Select the single forwarding tuple available.
      Replace the next label (should be a CAPSL label) found at the top of the
      MPLS label stack in the incoming packet with the 'Outgoing-label' from the
      selected forwarding tuple.
      Forward the modified packet onto the 'Outgoing-link' as specified in the
      selected forwarding tuple.
      If the prefix SID is another anycast segment,
         Ensure the next label lookup is launched again on the Virtual LFIB table.
      Else,
         Ensure the next label lookup is launched on the default LFIB table.

Like unicast prefix segments, anycast prefix segments SHOULD be advertised
in IGP link-state advertisements using the IGP protocol extensions for SR
specified for IS-IS, OSPFv2 and OSPFv3. This document
does not propose any protocol extension for advertising anycast prefix segments.

However, when advertising an anycast segment for which the CA-SRGB defined by the
operator is not an exact copy of the corresponding SRGB label ranges allocated by
the originating device, the device MUST set the corresponding P-Flag (No-PHP) in
the IS-IS Prefix-SID SubTLV and/or the NP-Flag (No-PHP) in the OSPFv2 and OSPFv3
Prefix-SID SubTLVs to 1, and the E-Flag in the same SubTLVs to 0. Please refer to
the following for more details on usage of these flags.

   ISIS Prefix-SID SubTLV
   OSPFv2 Prefix-SID SubTLV
   OSPFv3 Prefix-SID SubTLV

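
The flag rule above can be sketched as a small decision function. This is only
an illustration under stated assumptions: the function name
`anycast_prefix_sid_flags`, the dictionary keys and the modelling of an SRGB as
a list of (first, last) ranges are hypothetical, and `no_php` stands for the
P-Flag in IS-IS or the NP-Flag in OSPFv2/OSPFv3. The value `None` for the
E-Flag marks the case this document does not constrain.

```python
def anycast_prefix_sid_flags(ca_srgb, local_srgb):
    """Choose Prefix-SID flags for an anycast prefix segment advertisement.

    Both arguments are lists of (first, last) label ranges.
    """
    if list(ca_srgb) != list(local_srgb):
        # CA-SRGB is not an exact copy of the local SRGB: the packet must
        # arrive carrying the anycast segment label, so PHP is disabled.
        return {"no_php": 1, "e_flag": 0}
    # CA-SRGB equals the local SRGB: No-PHP is left unset.
    return {"no_php": 0, "e_flag": None}


# Assumed domain-wide CA-SRGB and the group A SRGBs from the example topology.
CA_SRGB = [(2000, 3000)]
LOCAL_SRGB = {
    "A1": [(1000, 2000)],
    "A2": [(2000, 3000)],
    "A3": [(3000, 4000)],
    "A4": [(4000, 5000)],
}

for device, srgb in LOCAL_SRGB.items():
    print(device, anycast_prefix_sid_flags(CA_SRGB, srgb))
```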
The proposal above ensures that an MPLS packet sent to (or transiting through)
a given anycast group, when reaching a topologically nearest device in the
group whose CA-SRGB does not match the SRGB provisioned on it, always arrives
with the APSL label that is derived from the device's SRGB and the SID
associated with the corresponding anycast prefix segment. Note that in the above
topology, assuming the domain-wide CA-SRGB is set to (2000-3000) on all nodes,
nodes A1, A3 and A4 will advertise the SID 100 with the P-Flag (No-PHP) set to 1,
while node A2 will advertise the same anycast prefix SID with the P-Flag unset.
This is because on node A2 the domain-wide CA-SRGB is identical to the local
SRGB provisioned on it.

In the example topology, when PE1 or PE2 intends to steer a packet
destined for PE3 or PE4 through the anycast group A (SID 100), it needs to forward the
packet to R1 (SRGB: 7000-8000) after putting the label 7100 (derived from R1's SRGB)
at the top of the label stack in the MPLS header. However, when the same packet is forwarded
to A1 (one of the topologically nearest devices in group A), R1 shall not POP (or remove) the
label 7100. Instead, R1 shall replace it with the label 1100 while forwarding to A1.
While forwarding to A2, since A2 would have advertised the anycast SID 100 with the P-Flag
(No-PHP) unset, R1 shall POP the incoming label 7100 before forwarding the packet to A2.

The proposal specified above ensures that an MPLS
packet destined to (or steered via) an anycast prefix segment always arrives at the
nearest device in the anycast group with a label derived from the device's SRGB and
the SID associated with the corresponding anycast prefix segment as the topmost label
of the label stack in its MPLS header. If this label is also the bottom-most label (S=1),
the packet is destined to the anycast segment and should be consumed by
the local device. If the label is not the bottom-most label (S=0), the packet must
be forwarded to the next segment, for which the next label in the stack should be
consulted. However, this document specifies that the next label in
such a case shall be a label belonging to the CA-SRGB defined by the operator, derived
from the SID associated with the next segment. Since the CAPSL label for the SID
index associated with a prefix segment may directly collide with another label in the
default LFIB table, this document also proposes a Virtual
LFIB table to provide a separate label space for looking up the next label.

This document specifies that a device provisioned with a given prefix segment
index MUST implement the following forwarding semantics for the anycast segment
label associated with the anycast prefix segment,
if the CA-SRGB label ranges defined are not an exact copy of the corresponding SRGB
label range(s) locally allocated/provisioned on the device.

   If the label at the top of the stack is an anycast segment label, and the CA-SRGB
   defined is not an exact copy of the corresponding SRGB label range(s),
      Pop the label.
      If it was the bottom-most label in the stack (S=1),
         Send the packet to the host stack for local consumption, as usual.
      Else, if it was not the bottom-most label in the stack (S=0),
         Set the Virtual LFIB table as the lookup table for the next label lookup.
         Launch a lookup for the next label in the stack (should be a CAPSL label).
   Else
      Look up the label in the default LFIB table as usual.

The example below illustrates how SR-based MPLS packets destined for
PE3 and sourced by PE1 are expected to flow when PE1 encapsulates the packet with
an appropriate label stack to steer it through group A devices only.

Many thanks to Shraddha Hegde, Eric Rosen, Chris Bowers and
Stephane Litkowski for their valuable inputs.

N/A. No protocol changes are proposed in this document.

This document does not introduce any change in any of the
protocol specifications.