The BGP distribution system for routes in the DFZ is insufficiently scalable. Small operators, individuals and mobile users are prevented from using provider-independent IP addresses because issues associated with memory consumption, processing and convergence time in the DFZ routers drive the cost above $6200 per year per announced prefix.
General statement of the solution:
Each Autonomous System (AS) deploys one or more TRRP Ingress Tunnel Routers (ITRs). An AS-interior default route leads from the BGP routers to the ITR. The ITR finds an Egress Tunnel Router (ETR) for the destination IP address via a DNS lookup. The ITR then tunnels the packet via GRE to the "best" ETR. The ETR, which must have an IP address within space announced via BGP, knows a local-scope route to the destination IP address and delivers the packet. Longer route prefixes are then withdrawn from BGP until a comfortable BGP table size is attained.
DFZ - Default-free Zone. Colloquially known as the Internet Backbone. That part of the Internet which does not have a default route leading to "everything else". A router in the DFZ knows a destination route for every IP address on the Internet.
AS - Autonomous System. Any entity which has received an AS number and participates in the DFZ via at least two Internet Service Providers.
GR Space - Globally Routable Space. The single CIDR block of IP addresses each AS is permitted to announce into the DFZ. All ETRs and DNS route servers must use IP addresses assigned from GR Space.
ETR - Egress Tunnel Router. This is a router capable of receiving a tunneled packet, extracting the original encapsulated packet and delivering the packet to its destination IP address. The ETR must know a regular route to the destination. Its IP address must be assigned from GR Space so that it is reachable without tunneling.
ITR - Ingress Tunnel Router. This is a device capable of looking up the ETRs associated with an IP address and encapsulating packets into GRE tunnel packets addressed to a reasonable choice of ETR. The ITR's IP address need not be assigned from GR Space.
DNS for Egress Route Maps:
Protocols like HTTP and DNS have proven themselves to be trivially scalable. They communicate knowledge directly from the authoritative origin to the clients who want it. Protocols which rely on distributing information across large numbers of servers, like NNTP and BGP, scale only with difficulty. It's so hard to maintain an NNTP Usenet server that few organizations do it any more, and those that do spend hundreds of thousands of dollars on each server complex.
TRRP distributes routing records via DNS for each IP address. The records consist of one or more TXT records each of which contains one or more routing entries in the following format:
"pp,ii,route pp,ii,route ..."
"pp" is a hexadecimal encoded priority from 00 to ff. Smaller numbers have higher priority. Any number between 00 and ff may be used. The values chosen should be similar to the following so that the ITR may reorder them to take advantage of local routing knowledge.
00 = always prefer this route
40 = prefer this route
80 = normal
b0 = avoid this route
ff = use this route only as a last resort
"ii" is a protocol identifier which specifies how the packet will be encapsulated for transmission. The defined values for ii are:
DR: Send this packet without encapsulation. If no route is known, drop to a less preferred encapsulation. If there is no less preferred encapsulation, send a host unreachable. dr should generally be used for the route records for IP addresses inside GR Space so that the ITRs can cache the result rather than continuing to look it up. For the DR method, the route value is the digit "0" which has no meaning.
G4: Encapsulate this packet in an IPv4 GRE tunnel packet and transmit it to the IPv4 address offered in standard dotted-quad notation in the "route" value. e.g. "80,g4,192.168.100.1"
R4: Just like G4 but the "route" value is the 32-bit IP address in network byte order and base-64 encoded. The trailing '==' is dropped so that the address takes exactly 6 bytes instead of consuming up to 15 bytes the way a dotted quad address can. e.g. "80,r4,YWJjZA" R4 is the preferred way to specify the ETR as it is trivially decoded by the ITR software.
G6: Encapsulate this packet in an IPv6 GRE tunnel packet and transmit it to the IPv6 address offered in normal hexadecimal notation in the "route" value. e.g. "80,g6,2002:c0a8:8401::1"
R6: Just like G6 but the "route" value is the 128-bit IP address in network byte order and base-64 encoded. The trailing '==' is dropped so that the address takes exactly 22 bytes instead of taking up to 39 the way a normal hexadecimal encoded address can. e.g. "80,r6,YWJjZGVmZ2hpamtsbW5vcA" R6 is the preferred way to specify the ETR as it is trivially decoded by the ITR software.
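The R4 and R6 encodings can be sketched in a few lines of Python. This is only an illustration, not part of the protocol definition; the function names encode_route and decode_route are made up for the example.

```python
import base64
import ipaddress

def encode_route(addr: str) -> str:
    """Return the unpadded base-64 form of an IPv4 or IPv6 address."""
    packed = ipaddress.ip_address(addr).packed  # 4 or 16 bytes, network byte order
    return base64.b64encode(packed).decode("ascii").rstrip("=")

def decode_route(value: str) -> str:
    """Restore the dropped '=' padding and decode back to a printable address."""
    padded = value + "=" * (-len(value) % 4)
    return str(ipaddress.ip_address(base64.b64decode(padded)))

if __name__ == "__main__":
    print(encode_route("192.168.100.1"))
    print(decode_route("YWJjZA"))
```

An IPv4 address always encodes to 6 characters and an IPv6 address to 22, which matches the sizes given above.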
In addition to these protocol identifiers, any two-character identifier beginning with "x" is experimental. An ITR which is aware of experimental protocols may only use them if it has external knowledge that a particular ETR understands that x-protocol to mean the same thing as the local ITR does.
If the ITR does not recognize the protocol identifier, it must ignore it and move on to the next best one. This will allow new tunneling protocols to be implemented.
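A minimal sketch of how an ITR might parse the TXT records follows. It assumes the ITR understands only the five identifiers defined above (experimental x-protocols are treated as unknown), and the names RouteEntry and parse_txt_records are hypothetical.

```python
import string
from typing import List, NamedTuple

# Assumption: this ITR knows only the identifiers defined in this document.
KNOWN_PROTOCOLS = {"dr", "g4", "r4", "g6", "r6"}

class RouteEntry(NamedTuple):
    priority: int   # 0x00 (most preferred) through 0xff (last resort)
    protocol: str   # lower-cased protocol identifier
    route: str      # protocol-specific route value

def parse_txt_records(records: List[str]) -> List[RouteEntry]:
    """Collect usable routing entries from TXT strings, best priority first."""
    entries = []
    for record in records:
        for item in record.split():
            parts = item.split(",")
            if len(parts) != 3:
                continue  # malformed entry: ignore it
            pp, ii, route = parts
            if len(pp) != 2 or any(c not in string.hexdigits for c in pp):
                continue  # priority must be exactly two hex digits
            if ii.lower() not in KNOWN_PROTOCOLS:
                continue  # unrecognized protocol: move on to the next entry
            entries.append(RouteEntry(int(pp, 16), ii.lower(), route))
    # sorted() is stable, so entries of equal priority keep their DNS order.
    return sorted(entries, key=lambda e: e.priority)
```

Malformed strings and unknown identifiers are silently skipped rather than treated as errors, so a record set that mixes old and new tunneling protocols still yields whatever this ITR can use.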
The root DNS servers and every authoritative DNS server in the delegation path to the server which holds route information must listen and respond on only GR addresses in order to avoid a recursive loop.
A DNS resolver looking up Routes in the DNS must be able to follow NS records and CNAME records until it finds a TXT record containing the Route information. It must be able to make requests and understand replies via UDP.
An ITR making a recursive DNS request will do so via UDP. It should not ask again via TCP if the response is too large to fit in a UDP packet but it should attempt to extract as much usable routing from the UDP response packet as possible. An effort should be made to keep the routing entry small enough to fit in a UDP response. Those who fail to do so may find that their routes don't work as expected.
This offers a practical limit of not less than 15 ETRs for each IP address. If you have enough transit providers to need more, get an AS.
The DNS egress map entries are stored in two domains:
v4.trrp.arpa = route entries for IPv4 addresses, organized the same as the in-addr.arpa RDNS domain
v6.trrp.arpa = route entries for IPv6 addresses, organized the same as the ip6.arpa RDNS domain
Example use of a DNS Egress Map entry:
I send a packet to 220.127.116.11. The packet enters an ITR.
The ITR makes a DNS TXT request for 18.104.22.168.v4.trrp.arpa. It receives three TXT records
The ITR ignores the first TXT entry whose format does not make sense. It ignores the highest priority route (40) because it does not recognize type "bb." The ITR has both an IPv4 address and an IPv6 address and gives them equal weight, so it selects the next best priority route (80), encapsulates the packet into IPv4 GRE, and transmits it to the ETR.
The Ingress Tunnel Router (ITR)
The ITR is a GRE encapsulator. It may be placed anywhere on the network and need not have a GR Space IP address. It operates in one of two modes:
In passthrough mode, an ITR will encapsulate any packet for which the ITR does not know a specific direct route but for which the highest preference DNS Egress Map entry directs encapsulation. If there is no DNS Egress Map entry or the highest preference map entry is "dr", the ITR should send the unmodified packet towards the default route.
End user hosts or routers who do not have a full BGP feed should operate in passthrough mode.
In end-of-the-line mode, any packet which reaches the ITR does so because no traditional route is available. In this mode, the ITR must either encapsulate the packet or send a host unreachable message to the originator. An ITR operating in end-of-the-line mode must know all valid local routes and must filter and discard any packet selected for encapsulation which claims to be from a source for which it does not have a direct route. It must also filter and discard any such packet which claims to be from itself.
Where the ITR has a full BGP feed, or where it is fed via a default route from a router which has a full BGP feed, the ITR should operate in end-of-the-line mode.
In end-of-the-line mode, the ITR must reject encapsulation attempts for UDP DNS requests inside trrp.arpa. If it sees such a request, a routing loop or configuration error has occurred. It should drop the packet and respond to the sender with an ICMP destination unreachable: communication administratively prohibited message. UDP DNS replies must not be blocked in this manner.
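The two modes can be summarized as a small decision function. This is a simplified sketch: source-address filtering is left out, the returned action strings are hypothetical labels, and the input is assumed to be the list of recognized protocol identifiers from the map entry, best priority first.

```python
from enum import Enum
from typing import List

class Mode(Enum):
    PASSTHROUGH = "passthrough"
    END_OF_THE_LINE = "end-of-the-line"

def decide(mode: Mode, protocols: List[str], is_trrp_dns_request: bool = False) -> str:
    """Pick an action for a packet that has no specific direct route."""
    if mode is Mode.PASSTHROUGH:
        # No map entry, or the best entry says "dr": send toward the default route.
        if not protocols or protocols[0] == "dr":
            return "forward-to-default"
        return "encapsulate"
    # End-of-the-line mode: no traditional route exists for this packet.
    if is_trrp_dns_request:
        return "drop-admin-prohibited"  # loop or configuration error
    if any(p != "dr" for p in protocols):
        return "encapsulate"  # fall through "dr" to a tunnel entry
    return "host-unreachable"
```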
When to look up the DNS Egress Map entry
The ITR should look up the map entry when it first sees a packet for the destination. If possible, it should hold packets for the destination in a buffer until a map is found.
An ITR operating in passthrough mode should retransmit the packets in direct mode rather than discard them if the packet buffer fills.
An ITR operating in end-of-the-line mode should send a source quench if a packet must be dropped from the buffer.
The ITR is generally expected to honor the TTL provided by the DNS server along with the TXT record containing the Route. It may, however, retain and continue using a cached entry for up to 60 seconds after the TTL expires while it attempts to refresh the record.
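One way to picture the TTL rule is a cache which reports both the stored routes and whether a refresh is due. The class below is only a sketch; the name MapCache and its interface are invented for illustration, and the clock is injectable so the behavior can be exercised without waiting.

```python
import time

GRACE_SECONDS = 60  # how long a stale entry may still be used while refreshing

class MapCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (routes, expiry time)

    def store(self, name, routes, ttl):
        """Remember routes for name until the DNS TTL runs out."""
        self._entries[name] = (routes, self._clock() + ttl)

    def lookup(self, name):
        """Return (routes, needs_refresh); routes is None when nothing is usable."""
        item = self._entries.get(name)
        if item is None:
            return None, True
        routes, expires = item
        now = self._clock()
        if now < expires:
            return routes, False       # fresh
        if now < expires + GRACE_SECONDS:
            return routes, True        # stale but usable during the refresh
        del self._entries[name]
        return None, True              # too old: must look up again
```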
The ITR may at times receive a Destination Host/Network Unreachable message for the ETR to which a GRE-encapsulated packet was sent. Because the ITR no longer has a copy of the original packet, it can not propagate this information back to the originating host. Instead, it should do the following:
1. Verify the host unreachable condition. It may be spurious or a hacker trying to trick the ITR into refusing packets for the destination. Originate a single echo-request to the destination. Upon receiving a host unreachable which is positively correlated with the echo-request or waiting for a 5 second timeout, the ITR should accept that the ETR is unreachable.
2. Cache the unreachable condition for 5 minutes. If additional traffic is received during the final minute of the cache for which the ETR is a candidate, re-verify the unreachable condition with another echo-request and extend the cache for an additional 5 minutes.
3. Immediately expire and re-lookup the destination ETR via the DNS Egress Map servers. Select a new best path, discarding any ETRs which are cached as unreachable. To the extent possible, hold data packets for the destination while the lookup is performed. Send source-quenches if packets must be dropped.
4. If all possible ETRs are cached as unreachable and the ITR is operating in passthrough mode or it has a DR route available, it should retransmit the packet directly without encapsulation.
5. If all possible ETRs are cached as unreachable and the ITR is operating in end-of-the-line mode, it should send a host unreachable message to the source and discard the packet.
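The bookkeeping behind these steps might look like the following sketch. The echo-request verification and the DNS re-lookup are left out; the names UnreachableCache and pick_etr are hypothetical, and the candidate list is assumed to be sorted best priority first.

```python
import time

UNREACHABLE_SECONDS = 300  # step 2: cache the condition for 5 minutes
REVERIFY_WINDOW = 60       # step 2: re-verify during the final minute

class UnreachableCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._until = {}  # ETR address -> time the cache entry expires

    def mark(self, etr):
        """Record (or extend) a verified unreachable condition."""
        self._until[etr] = self._clock() + UNREACHABLE_SECONDS

    def is_unreachable(self, etr):
        until = self._until.get(etr)
        if until is None:
            return False
        if self._clock() >= until:
            del self._until[etr]
            return False
        return True

    def needs_reverify(self, etr):
        """True when traffic arriving now falls in the final minute of the cache."""
        until = self._until.get(etr)
        return until is not None and 0 < until - self._clock() <= REVERIFY_WINDOW

def pick_etr(candidates, cache):
    """Return the best ETR not cached as unreachable, or None if all are."""
    for etr in candidates:
        if not cache.is_unreachable(etr):
            return etr
    return None
```

When pick_etr returns None, the caller applies step 4 or step 5 depending on the ITR's mode.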
The ITR must set the Don't Fragment (DF) bit for outgoing GRE packets and must cache any Fragmentation Needed messages it receives from routers on the way to the ETRs.
If a packet is too large to fit in a GRE packet intended for a particular ETR and the DF bit is not set, the ITR should fragment the original packet and then encapsulate it in two independent and complete GRE packets.
If a packet is too large to fit in a GRE packet intended for a particular ETR and the DF bit IS set, the ITR should discard the packet and return a Fragmentation Needed message to the source.
The ITR should proactively attempt to adjust the TCP MSS in any encapsulated SYN packets such that the subsequent TCP packets will be small enough to fit into a GRE packet which can reach the ETR without fragmentation. While this is not supposed to be necessary, the reality on the ground is that ignorant firewall administrators have broadly broken path MTU discovery.
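The arithmetic behind the MSS adjustment is short enough to show. The sketch below assumes an outer IP header without options, a GRE header carrying only the key field (4 bytes of base header plus a 4-byte key), and a 20-byte TCP header; the function names are illustrative.

```python
IPV4_HEADER = 20          # bytes, no options
IPV6_HEADER = 40          # bytes, no extension headers
GRE_HEADER_WITH_KEY = 8   # 4-byte base header + 4-byte key
TCP_HEADER = 20           # bytes, no options

def inner_mtu(path_mtu: int, outer_ipv6: bool = False) -> int:
    """Largest original packet that fits in one GRE packet on this path."""
    outer = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return path_mtu - outer - GRE_HEADER_WITH_KEY

def clamped_mss(path_mtu: int, advertised_mss: int,
                outer_ipv6: bool = False, inner_ipv6: bool = False) -> int:
    """MSS to write into an encapsulated SYN; never raises the sender's value."""
    inner_ip = IPV6_HEADER if inner_ipv6 else IPV4_HEADER
    limit = inner_mtu(path_mtu, outer_ipv6) - inner_ip - TCP_HEADER
    return min(advertised_mss, limit)
```

With a 1500-byte path to the ETR over IPv4, the original packet may be at most 1472 bytes, so an advertised MSS of 1460 would be lowered to 1432.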
The ITR must copy the source packet's TTL field into the GRE packet's TTL field and decrement the counter.
If the ITR is not explicitly configured with the location of a DNS resolver and does not have one of its own built in, it may make an initial UDP anycast request to [fill in IP address here] delivered to the default router requesting the NS record for ".". The unicast address from which the response is received should be accepted and cached as the location of the DNS resolver. If no response is received, it may send a multicast packet to [fill in IP address here] delivered to the local LAN with the same rules. If it still can not find a DNS server, the ITR can not function.
The ITR may alter the priority of protocol entries offered in the DNS Egress Map entry based on local knowledge before selecting the optimal way to reach the remote IP address. It may, for example, use the AS path length of the ETR to alter the priorities of the offered ETRs. While specifics for such alteration are not offered in this document, implementors are advised to make only mild changes, generally limiting them to priority shifts of 0x20 or less. The destination network knows better than you do how good or lousy their low priority reachability methods are.
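A mild adjustment of this kind could be bounded as in the sketch below. How the local shift is computed (AS path length or otherwise) is deliberately not shown, and the name adjust_priority is an assumption.

```python
MAX_SHIFT = 0x20  # advised upper bound on any local priority change

def adjust_priority(priority: int, local_shift: int) -> int:
    """Apply a locally computed shift, bounded to +/-0x20 and to 0x00..0xff."""
    shift = max(-MAX_SHIFT, min(MAX_SHIFT, local_shift))
    return max(0x00, min(0xFF, priority + shift))
```

A positive shift makes a route less preferred and a negative shift makes it more preferred, since smaller numbers have higher priority.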
A host implementing an ITR may also implement an ETR but it is not required to do so. An ITR must not assume that an ITR which encapsulated a packet is a valid ETR for that destination. Only ETRs looked up via DNS or shared from a trusted ITR which has performed the requisite DNS lookup may be used. Such sharing is not defined in this document.
Multicast routing is outside the scope of TRRP. The ITR should not encapsulate packets with a multicast destination address. The ITR should encapsulate packets normally where the packet has a multicast source address but a unicast destination address.
An ITR will not work when placed behind a firewall which can not pass GRE packets. This means that an ITR can not operate on an individual host behind a NAT firewall. Folks implementing originating host level ITR should take care to avoid enabling the ITR when configured solely with RFC1918 addresses.
The ITR must use GRE tunnel key "1", aka 0.0.0.1. Some ETRs require the tunnel key to successfully differentiate which tunnel to deliver the GRE packet to.
The Egress Tunnel Router (ETR)
The ETR is a dumb old multipoint GRE endpoint. The GRE tunnel "source" address is the ETR's regular IP address. The GRE interface address is also inherited from the regular IP address. And that's it: you now have an ETR capable of receiving and decapsulating packets from the ITR intended for your local network. The ETR performs no special lookups and needs only know routes to locally valid IP addresses.
The ETR's IP address must be assigned from an address in GR Space. That's because the encapsulated GRE packets must be able to reach it directly. While any address directly routed in the DFZ is usable, it's expected that in TRRP's endgame only an ISP's GR Space will meet that requirement.
The ETR should filter and discard packets intended for destinations off your network. This will prevent your peers from pirating your bandwidth by designating your ETR as a destination for their IP addresses.
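The filter amounts to a prefix-membership test on the decapsulated packet's destination. In this sketch the prefixes are documentation ranges standing in for your own network, and is_local_destination is a made-up name.

```python
import ipaddress

# Assumption: placeholder prefixes; substitute the networks this ETR serves.
LOCAL_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_local_destination(dst: str, prefixes=LOCAL_PREFIXES) -> bool:
    """True if the inner packet's destination is on a network we deliver to."""
    addr = ipaddress.ip_address(dst)
    return any(addr.version == net.version and addr in net for net in prefixes)
```

Packets for which this returns False would be discarded rather than forwarded.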
The ETR should copy the TTL from the GRE packet back in to the original packet. Nothing will blow up if it doesn't, but it will cause traceroute and other diagnostic tools to misbehave.
Cisco IOS configuration example:
interface Tunnel0
 description TRRP Egress Tunnel Router
 no ip address
 tunnel source FastEthernet0/0
 tunnel mode gre multipoint
 tunnel key 1
Linux configuration example:
ip tunnel add trrpetr mode gre local 192.168.100.1 key 1
ip link set trrpetr up
ip addr add 10.0.0.1/32 dev trrpetr
Additional Notes and Optional Features