Thursday, November 6, 2008

Internet Indirection

The theme for today's discussion is Internet indirection! Middleboxes like firewalls and network address translation (NAT) boxes have proliferated in the Internet but are often frowned upon for breaking the end-to-end design argument. Specifically, a firewall filters packets by explicitly inspecting packets that are not addressed to it (and, depending on the firewall, looking as far up as the data payload of the higher-level protocol). A NAT, on the other hand, might be considered a greater or lesser offender because it actually rewrites the packet header as necessary (changing the source or destination address).

Two papers address possible alternative Internet architectures that allow, or even encourage, indirect processing of packets. The two papers are "Middleboxes No Longer Considered Harmful" [MIDDLE] and "Internet Indirection Infrastructure" [i3].

The [MIDDLE] paper can actually be thought of as an extension to [i3] in which the authors focus specifically on accommodating middleboxes like firewalls and NATs. In many ways it seems like [i3] subsumes [MIDDLE] and provides the more generic architecture. In fact, after reading [i3], I wasn't horribly impressed with [MIDDLE]!

The [i3] work, however, I found very elegant and appealing. The principle of i3 is rendezvous routing. Essentially, a server can place a public trigger on the Internet, and end hosts can send data to the id of that trigger; the infrastructure then looks up the trigger to determine the ultimate destination IP. The indirection is that you send to these identifiers rather than to an IP address, and the mapping behind an identifier can be updated as the server sees fit (as it moves, for example). Great idea, in my opinion.
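To make the trigger idea concrete, here is a minimal sketch in Python. It is a toy, not the i3 implementation: the class name RendezvousPoint, the string identifiers, and the exact-match lookup are all assumptions made for illustration, and a single in-memory dictionary stands in for the distributed infrastructure.

```python
class RendezvousPoint:
    """Toy stand-in for the indirection infrastructure (hypothetical name).

    Assumption: one trigger per identifier and exact-match lookup,
    held in a single local dictionary rather than a distributed overlay.
    """

    def __init__(self):
        self.triggers = {}  # identifier -> current receiver address

    def insert_trigger(self, ident, addr):
        # A receiver registers (or refreshes) the mapping ident -> addr.
        self.triggers[ident] = addr

    def send(self, ident, data):
        # A sender addresses the identifier, never the receiver's IP.
        addr = self.triggers.get(ident)
        if addr is None:
            return None  # no trigger installed: nothing to deliver to
        return (addr, data)


rp = RendezvousPoint()
rp.insert_trigger("id-web", "10.0.0.1")
first = rp.send("id-web", b"hello")

# The server moves; only the trigger changes, the sender's code does not.
rp.insert_trigger("id-web", "192.168.7.9")
second = rp.send("id-web", b"hello")
```

The sender uses the same identifier before and after the move, which is the whole point of the indirection.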

Of course, how much different is this than requiring that every single send of data perform a DNS lookup? As long as I can update my DNS name server every time my address changes and have it serve my mapping with a TTL of 0, then it seems like I can more or less accomplish the same thing. The crux is how efficiently this could be performed for something the size of the entire Internet. The basis of DNS is that caching provides performance, and in the absence of caching, using a lookup mechanism like a DHT might be the absolute right way to go.

To allow middleboxes, i3 has a notion of a stack of identifiers. This allows sending data to a certain identifier which itself has a stack of identifiers that the packet must traverse, in a way very much like source routing. Of course, now, because each IP packet is explicitly addressed to the node that processes it, that node can do whatever transformations of the packet it sees fit. The other added benefit of this model is that you don't need the middlebox to sit physically in between an end host and the rest of the Internet! The authors of [MIDDLE] discuss the advantages of this in depth, although they also mention that the physical separation can sometimes be advantageous for security.
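The identifier stack can be sketched roughly as follows. This is a simplification under stated assumptions: the function name deliver, the dictionary-based trigger table, and the idea of modelling each host as a Python callable are all hypothetical, and a trigger here maps an identifier either to an address (a string) or to another stack of identifiers (a list).

```python
def deliver(triggers, hosts, id_stack, data, max_steps=32):
    """Walk an identifier stack, letting each addressed host process the data.

    triggers: identifier -> address (str) or identifier stack (list)
    hosts:    address -> function applied to the payload at that host
    All names and structures here are illustrative assumptions.
    """
    stack = list(id_stack)
    path = []
    steps = 0
    while stack:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("too many indirections (possible trigger loop)")
        head = stack.pop(0)
        target = triggers[head]  # KeyError if no trigger is installed
        if isinstance(target, list):
            # The trigger points at more identifiers: splice them in front.
            stack = target + stack
        else:
            # The trigger points at an address: that host handles the packet.
            data = hosts[target](data)
            path.append(target)
    return path, data


triggers = {
    "id-service": ["id-firewall", "id-server"],  # service sits behind a firewall
    "id-firewall": "fw.example",
    "id-server": "srv.example",
}
hosts = {
    "fw.example": lambda d: d.replace(b"bad", b"***"),  # scrub the payload
    "srv.example": lambda d: d,                         # final receiver
}
path, payload = deliver(triggers, hosts, ["id-service"], b"bad data")
```

Note that the firewall is reached because the packet is explicitly routed to it, not because it happens to sit on the physical path.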

The identifiers in i3 were interesting because, while they did provide for this indirection, they also make it clear that identifiers should be picked in a way which provides geographical/network proximity. In fact, they even mention making one part of the identifier fixed for geography and the other part vary (the "true" identifier). This sounds strikingly similar to a hierarchical address ... (IPv6?).
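A rough sketch of that split identifier, to show why it looks hierarchical. The field widths (8 bytes of prefix, 24 bytes of suffix), the use of SHA-256, and the names make_identifier and same_region are my assumptions for illustration, not details taken from the paper.

```python
import hashlib

PREFIX_BYTES = 8   # assumed width of the fixed, locality-related part
SUFFIX_BYTES = 24  # assumed width of the varying, "true" identifier part


def make_identifier(region, name):
    """Build an identifier as locality prefix + per-endpoint suffix."""
    prefix = hashlib.sha256(region.encode()).digest()[:PREFIX_BYTES]
    suffix = hashlib.sha256(name.encode()).digest()[:SUFFIX_BYTES]
    return prefix + suffix


def same_region(a, b):
    """Two identifiers share a locality if their prefixes match."""
    return a[:PREFIX_BYTES] == b[:PREFIX_BYTES]


alice_west = make_identifier("us-west", "alice")
bob_west = make_identifier("us-west", "bob")
alice_eu = make_identifier("eu", "alice")
```

Here alice_west and bob_west share a prefix while alice_west and alice_eu share a suffix, which is roughly the network-part/host-part split of a hierarchical address.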
   
In terms of performance, [MIDDLE] seemed to have lots of unfortunate overheads. I was rather disappointed with their discussion of packet overheads due to their extra header ... obviously there will be some packet overheads! Essentially, overlay solutions like these might suffer from extra round-trip latencies, extra processing time (to figure out, for example, where to route to next), etc. Given these, it makes sense that we haven't seen these take over in the Internet ... at least not yet. Of course, there are lots of companies trying to use the overlay and P2P-like model to multicast streaming live media.

1 comment:

Randy H. Katz said...

I think the ideas in i3 and middleboxes were developed independently and at about the same time. Middle focuses more on the separation of names and IP addresses with a mechanism (delegation) that permits the construction of middleboxes. I3 focuses much more on the mechanism for name/address separation, and I agree that the idea of triggers is elegant. There are many many implementation issues, as the paper makes clear.