Summary of Routing Architectures discussed in the RRG
SECOND DRAFT

This document attempts to look at abstract architectural issues associated with the BGP routing table growth problem while avoiding engineering details as requested by the Routing Research Group chair.

 

Item 1: Routing table size problem, root cause:

The root cause of the routing table growth problem is that we're using the same element (IP address) to express three distinct concepts:

  1. the globally unique identity (GUID) needed by the originating client system to find the destination server,
  2. a component of the session identity (SID) used by the layer 4 and higher protocols to keep track of communications with other hosts, and
  3. the node's present location or locations (LOC) within the network topography.

Changing a host's GUID is often hard. Client systems which provide no services to the network can change because nothing needs to find them but changing a server's identity requires changes in many other hosts' references which are unwieldy or impractical to make.

Changing a SID terminates the communications session. This makes address changes unsuitable for use in multihoming, even where the multihoming is trivial, such as a home user with two DSLs.

As a result, the GUID strongly resists all change while the SID resists change that must complete shorter than a typical communication session. Because the LOC is bound to the GUID and SID, the LOC strongly resists change even when you desire to move the affected host to a new location in the network.

Also, hosts find it very difficult to present two LOCs from separate parts of the topography graph. In addition to the problems above, packets from such hosts would need to be "policy routed" by the source address until they reach a transit-free network level in order to preserve security-required source filtering, a function not inherent to modern dynamic routing protocols. As a result, multihomed systems generally connect distant portions of the network topography instead of presenting a small number of aggretatable stubs.

Thus we end up with a great deal of duplication where more than one LOC holds the same spot in the topography (disaggregation and fragmentation) and trivial multihoming forces us into a general graph topography with very limited hierarchical aggregation of the LOCations that routers route by.

 

Item 2: Solution strategies

Strategy A.

Do local routing by the address (which functions as GUID, SID component and local LOC) but have each packet flow through an encoder which attaches a second Remote LOC (RLOC) before the packet enters the internetwork core. Do not route by the address in the core; route by RLOC. Limit RLOC routing in the core so that only service providers (ISPs) with significant interconnection have their own RLOCs. Fewer than 10,000 such "core ISPs" exist today and the number is growing much more slowly than the routing table overall. Once the packet reaches the network identified by the RLOC, local routing by address takes over for final delivery. Distribute RLOCs through the core via a typical distance-vector or link-state routing protocol.

Variants include:

A1a. Each core ISP has one RLOC. The RLOC's existence and reachability is flood-propagated to the rest of the core.

A1b. Each core ISP has a small number of RLOCs for traffic engineering (TE). The RLOCs' existence and reachability is flood-propagated to the rest of the core.

A1c. Each core ISP has one aggregated set of RLOCs which it may hierarchically assign to customers downstream and/or disaggregate for TE. The aggregated RLOC's existence and reachability is flood-propagated to the rest of the core.

Methods for mapping the address to one or more RLOCs include:

A2a. GUIDs statically mapped to each RLOC are periodically pushed towards a central or distributed registry. The full list is periodically downloaded to the encoders which add RLOCs to the packets.

A2b. GUIDs dynamically mapped to each RLOC are pushed towards a central or distributed registry as they change. The registry  pushes all incremental changes in near-real time to all encoders which add RLOCs to the packets.

A2c. GUIDs dynamically mapped to each RLOC are pushed towards a central or distributed registry as they change. Encoders request and briefly cache individual mappings from the registry as needed.

Failure handling approaches include:

Link failures in the Internet core cause the RLOCs to be rerouted with no change to the address-RLOC map.

A3a. RLOC encoders detect when particular RLOCs are no longer reachable at all and fall back on secondary RLOCs for a particular address. Encoders rely on active failure messages from some system in the RLOC-specified network to indicate that a host is no longer available via that RLOC, causing them to fall back on secondary RLOCs for that host.

A3b. Link failures which prevent parts of the RLOC's network from reaching a destination host it serves cause a dynamic change to the address-RLOC map, depreferencing the affected RLOC.

Compatibility approaches include:

A4a. New IP protocol. Not compatible with IPv4 and IPv6.

A4b. Modified IP protocol. Not compatible with deployed IPv4 and IPv6.

A4c. Standard IPv4 and IPv6 packets are encapsulated in a tunnel packet while they transit the Internet core. Path-MTU issues are addressed by setting an Internet-wide maximum packet size enforced by the encoders and assuring that all core links support that size.

A4d. Standard IPv4 and IPv6 packets are encapsulated in a tunnel packet while they transit the Internet core. Path-MTU issues are addressed by returning packets which breach the MTU while in the core back to the encoder who must act as a proxy by returning a sensible packet-too-big message to the originating host.

A4e. The IPv6 address space is partitioned into end-user address space and Internet core address space. The address to RLOC map is symmetric. Part of the IPv6 end-user address is swapped for the RLOC when the packet enters the Internet core and then restored when it leaves the Internet core. IPv4 is abandoned.

 

Strategy B.

Assign heirarchically aggregatable LOCs to every host. Assign multiple LOCs to each host such that in the network topography hosts appear as stubs in multiple locations instead of forming distant connections in the graph. Assign one aggregated set of LOCs to each core ISP where a core ISP is one which has at least half a dozen major transit or peering links. Flood-propagate the aggregated LOC's existence and reachability to the rest of the core.

Having reduced the network topology to something relatively close to a hierarchy, perform plain old hierarchical aggregation on the LOCs. Add and remove LOCs to each host dynamically during operation as needed to reflect changes in the nearby network hierarchies.

Attach source and destination LOCs when the packet leaves the host. Route by first source then destination LOC: move up the source network hierarchy until you can move laterally toward the destination LOC in a permissioned manner.

LOCs dynamically mapped to each host are pushed towards a distributed registry as they change. Hosts request and temporarily cache individual mappings from the registry as needed.

During a link failure, the host abandons the impacted LOC since it is no longer reachable that way. If the link failure is distant enough that the host still has a viable path within the same LOC hierarchy, the host may acquire a new LOC representative of its new path through the nearby network.

Variants include:

B1a. Each host has a single numeric GUID to which the LOCs are attached. This GUID is used by the layer-4 and higher protocols to compose the SID..

B1b. Each service provided by a host has a globally unique, hierarchic character-string GUID to which the LOCs are attached. Clients initiating communication with that service negotiate a numeric SID which is unique only within the scope of that service.  

 

Strategy C.

Suppress distant routes by aggregating them into sets expected to be available in a given direction. Because LOC reachability info is not flooded, the routing tables each router must deal with are relatively small.

Appears intractable. Approach largely abandoned.

Variants include:

C1. Geographic aggregation. All nodes within some geographic boundary are assigned the same LOC. Routers move packets to any adjacent router deemed to be "closer" to the LOC in question.

Geoag has been shown to have uncorrectable theft-of-service anomalies in networks as small as 8 autonomous systems and two geographic areas.

 

Addendum:

Item 1. Additional causative  factors worth mentioning:

Because even a distant route may present a better priority in one coreward direction than another, it's not possible to suppress distant routes. As a result, every router in the core must carry all routes.