Tuesday, September 30, 2008

Wireless: Media Access Protocols and TCP Performance

This post covers two different papers:
  • MACAW: A Media Access Protocol for Wireless LAN's [MACAW]
  • A Comparison of Mechanisms for Improving TCP Performance over Wireless Links [COMPARE]
The authors of [MACAW], motivated by the proliferation of wireless devices, sought to engineer a media access protocol that performed better than existing multiple access collision avoidance protocols (as opposed to carrier sense protocols).

The authors of [COMPARE] were interested in how TCP could be tuned (or modified) to perform better over the broadcast medium of wireless links (i.e. assuming some fixed media access protocol).

It is interesting to see how, once again, TCP has become such a dominating network transport protocol that we are willing to sacrifice possible performance in order to augment it (as little as possible) to operate over emerging network technologies. In fact, it seems very likely that a new network technology (whether link-layer or transport-layer), even one that outperforms existing technologies, will have a hard time being adopted if it imposes a seriously difficult transition path for TCP. The authors of [COMPARE] do mention one such new technology, the selective repeat protocol (SRP), but claim the results for it were not conclusive enough to call it a better alternative to TCP (although selective acknowledgments + TCP clearly were an improvement).

It is not clear exactly what the right level is for abstractions and optimizations in the networking stack. Perhaps it is time to investigate how applications might actually perform better if they had a better understanding of the actual media they run on? Perhaps applications themselves might be better able to take advantage of wireless networks by changing, for example, their sending patterns? Operating systems research has shown how applications that can take advantage of lower-level resources can often outperform applications that rely on layers of abstraction (exokernel, scheduler activations, acpm, etc.). Perhaps there is an analogue in the networking world?

An interesting aspect of [MACAW] was the investigation of the backoff counter. The authors conclude that a separate backoff counter is needed for each station a node is attempting to communicate with. In addition, all of the nodes communicating with a given station need to share the same backoff counter value in order to ensure fairness. (Digression: while I don't know enough about 802.11(a,b,g), I was under the impression that a station does not simultaneously communicate with more than one node, in which case I'm curious how sophisticated the backoff mechanism really needs to be.) Given that I frequent coffee shops, I'm curious how many of my "delayed" connections are due to backoff, packet loss from interference, or TCP congestion control. (You know, when the network sometimes just stops responding and it is best to simply re-submit the URL request ...)
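To make that backoff discussion a bit more concrete, here is a rough sketch (my own illustration in Python, not code from the paper) of per-destination backoff counters combined with MACAW's gentler multiplicative-increase/linear-decrease rule and the backoff-copying idea; the constants and the class structure are made up for the example:

  import random

  BO_MIN, BO_MAX = 2, 64   # illustrative bounds on the backoff counter

  class Station:
      def __init__(self):
          self.backoff = {}            # one counter per peer we talk to

      def _get(self, peer):
          return self.backoff.setdefault(peer, BO_MIN)

      def wait_slots(self, peer):
          """Pick a random wait uniformly in [1, backoff(peer)]."""
          return random.randint(1, self._get(peer))

      def on_failure(self, peer):
          # MILD: multiplicative increase after a failed RTS/CTS exchange
          self.backoff[peer] = min(BO_MAX, int(self._get(peer) * 1.5))

      def on_success(self, peer):
          # MILD: gentle linear decrease on success
          self.backoff[peer] = max(BO_MIN, self._get(peer) - 1)

      def on_overheard(self, peer, advertised_backoff):
          # Backoff copying: adopt the counter carried in overheard packets so
          # every node contending for the same receiver shares one view of
          # congestion, which is what makes the allocation fair.
          self.backoff[peer] = advertised_backoff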

(Hmm, I'm getting used to a new style of writing blog posts, less summary and more interesting thoughts from the paper to encourage feedback/conversation with others ...)

Monday, September 22, 2008

Scaling Internet Routers

In two papers from McKeown and others, "A Fast Switched Backplane for a Gigabit Switched Router" and "Scaling Internet Routers Using Optics", the authors explain how router technology can be scaled well beyond its current needs and close to its theoretical limits. One of the papers discusses how to scale routers to gigabit speeds while the other shows an impressive design for 100 terabits per second!

McKeown discusses how, in most routers of the time (circa 1995), datagrams are transmitted out of a single first-come-first-served (FIFO) queue (which is crazy considering the numerous router/gateway algorithms that I have discussed in this blog). He does acknowledge "more advanced routers", but does not seem to imply that many of them actually implement many, if any, of the advanced algorithms such as Fair Queuing.

Both sets of authors describe the basic design of the router, including its parallel design, crossbar switch, and use of virtual output queueing (VOQ). The VOQ discussion is especially interesting because it shows how damaging head-of-line blocking can be: with plain FIFO input queues, throughput tops out at roughly 58% under uniform traffic, so nearly half the capacity can be lost.

McKeown describes how VOQ + iSLIP (a crossbar scheduling algorithm) can achieve close to the theoretical maximum throughput. He compares this with a crossbar-like router that uses FIFO queues (and variable-length datagrams ...). Ironically, when we relate this work to other work discussed on this blog, it becomes apparent that none of the gateway algorithms are actually being used ...
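To make the VOQ + iSLIP idea concrete, here is a rough single-iteration sketch (my own simplification, not McKeown's code): each output keeps a round-robin grant pointer, each input keeps an accept pointer, and pointers only advance when a grant is actually accepted. The port count and queue representation are purely illustrative, and a real iSLIP runs several iterations per cell time:

  N = 4  # ports (illustrative)

  # voq[i][j] = number of cells queued at input i destined for output j
  voq = [[0] * N for _ in range(N)]

  grant_ptr  = [0] * N   # one round-robin grant pointer per output
  accept_ptr = [0] * N   # one round-robin accept pointer per input

  def islip_iteration():
      """One request/grant/accept round; returns {input: output} matches."""
      # Request: every input with a non-empty VOQ requests that output.
      requests = {j: [i for i in range(N) if voq[i][j] > 0] for j in range(N)}

      # Grant: each output grants the requesting input nearest its pointer.
      grants = {}                       # input -> list of granting outputs
      for j, reqs in requests.items():
          if not reqs:
              continue
          i = min(reqs, key=lambda i: (i - grant_ptr[j]) % N)
          grants.setdefault(i, []).append(j)

      # Accept: each input accepts the granting output nearest its pointer,
      # and only then do the pointers advance (this is what avoids starvation).
      matches = {}
      for i, outs in grants.items():
          j = min(outs, key=lambda j: (j - accept_ptr[i]) % N)
          matches[i] = j
          grant_ptr[j]  = (i + 1) % N
          accept_ptr[i] = (j + 1) % N
          voq[i][j] -= 1                # the matched cell crosses the crossbar
      return matches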

Another rather interesting relation between this work and previously discussed work is its attention to "controlling delay". Specifically, to deal with traffic that needs low latency they employ two mechanisms: priority levels and speedup (the latter, I think, is much more fascinating!). Priority levels are what they sound like, but speedup is rather clever: operate the crossbar switch twice as fast as the external line! That way, with sufficient speedup one could actually guarantee that every datagram is immediately transferred to its output port, where a departure time can be precisely scheduled (eliminating the non-determinism otherwise inherent in the router, which ruins one's ability to reason about delay). They go on to discuss how, in theory, one would need a speedup of N (where N is the number of ports) to actually achieve this guarantee, but how in practice one can get away pretty well with a speedup of just two.

The gigabit router paper was published sometime in the mid '90s (it is actually a description of the work McKeown did at Cisco on the Cisco 12000 router). The later paper (2003) discusses the terabit routing work and emphasizes the need to use optics rather than electronics (at least in some parts). I would be curious to know how necessary terabit routing really is today ...

More Congestion Avoidance (High-Bandwidth, High-Latency Networks) and Future Internet Design Issues (Real-Time Applications)

I've been absent for the past week (recovering from a concussion) and so for those reading in the blogosphere you will find a rather condensed version of my thoughts (and feelings) of the following papers:
  • Random Early Detection Gateways for Congestion Avoidance
  • Congestion Control for High Bandwidth-Delay Product Networks
  • Fundamental Design Issues for the Future Internet
  • Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism
As the title suggests, these papers address two distinct topics: congestion avoidance/control and future internet design considerations (especially for latency intolerant applications).

The former, congestion avoidance and control, has been discussed in great detail in some previous posts, and these authors found yet more issues with existing techniques. Random Early Detection (RED) gateways were found to be effective at congestion control (if used with cooperating end-nodes) by doing something rather remarkable: they mark or drop a connection's packets with probability roughly proportional to that connection's share of the bandwidth! (Note that this still avoids biasing against bursty traffic.) I find this to be a nice little result ...
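For concreteness, here's a rough sketch of the RED arrival-time logic as I understand it; the parameter values are placeholders rather than the paper's recommended settings, and I've dropped the "count" refinement that spaces drops out more evenly:

  import random

  w, min_th, max_th, max_p = 0.002, 5, 15, 0.02
  avg = 0.0   # exponentially weighted moving average of the queue length

  def on_packet_arrival(queue_len):
      """Return True if the arriving packet should be dropped (or marked)."""
      global avg
      avg = (1 - w) * avg + w * queue_len      # EWMA of the instantaneous queue
      if avg < min_th:
          return False                          # no congestion building up
      if avg >= max_th:
          return True                           # persistent congestion: drop
      # In between, drop with probability that grows linearly with avg; since
      # a busy flow's packets arrive more often, its drop rate ends up roughly
      # proportional to its share of the bandwidth.
      p = max_p * (avg - min_th) / (max_th - min_th)
      return random.random() < p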

Of course, if Fair Queuing wasn't good enough and Core-Stateless Fair Queuing wasn't good enough, then there is no reason to believe Random Early Detection should be good enough. And, for the most part, it isn't. So Katabi, Handley, and Rohrs proposed XCP (the eXplicit Control Protocol). Briefly, they ignore backward compatibility and take a control-theoretic approach to the design. For one, this means they decouple fairness control from efficiency control and provide explicit congestion notifications (versus TCP, which has no explicit notifications and instead relies on timeouts, assuming a lost packet implies congestion, without being able to tell whether the problem is one of fairness or of efficiency).
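Here's a rough sketch of what that decoupling looks like, as I understand it. The alpha/beta constants are the stability values I remember from the paper (if I'm recalling them correctly), and the per-flow apportioning below is a simplification of XCP's actual per-packet feedback formulas:

  ALPHA, BETA = 0.4, 0.226

  def efficiency_controller(capacity, input_rate, queue, avg_rtt):
      """Aggregate feedback phi: how much total traffic should change this
      control interval. Positive means speed up, negative means slow down."""
      spare = capacity - input_rate
      return ALPHA * avg_rtt * spare - BETA * queue

  def fairness_controller(phi, flow_rates):
      """Apportion phi across flows (simplified): additive increase when there
      is spare capacity, multiplicative decrease when over-subscribed, which
      pushes the allocation toward fairness independently of efficiency."""
      n = len(flow_rates)
      if phi >= 0:
          return {f: phi / n for f in flow_rates}                 # equal shares
      total = sum(flow_rates.values()) or 1.0
      return {f: phi * r / total for f, r in flow_rates.items()}  # proportional

  # Each packet would carry its sender's cwnd/RTT plus a feedback field that
  # routers overwrite; the receiver echoes it back, so the sender learns an
  # explicit window adjustment instead of inferring congestion from losses.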

The work of Katabi et al. is especially interesting because it shows how poorly TCP scales as bandwidth and latency increase. These authors used satellite links as the canonical example; however, I'm curious whether we have approached similar limitations with the optical networks of today ... (in other words, how bad is TCP really? how bad is slow start really?).

The latter topic was especially motivated by real-time applications. Such latency-intolerant applications were not actually new (people had been doing speech processing on the Internet since the 1980s ... before I was born!), but they had come to account for much of the traffic in networks.

In Shenker's "Fundamental Design Issues ..." he discusses the need to extend the current architecture to include a new "service model". That way, flows with special needs (e.g. real-time applications) can be treated differently from latency-tolerant flows (e.g. file transfers). He discusses a host of issues, but I found the discussion of how to "choose a service" most interesting. He expands on the issues with implicitly supplied services (the gateway chooses the service) and touches on the biggest problem (in my mind) with explicitly supplied services (the application chooses the service): abuse, i.e. applications might find ways to get better network utilization by lying about their actual needs. Much to my enjoyment he proceeds to discuss "incentives" and how they might quell the issue (which turns into a discussion of pricing, etc.).

I'm not sure that we have really done much of anything for real-time applications (at least not in the hardware; clearly we have made improvements in the software), so perhaps this is still a valid issue today.

Thursday, September 11, 2008

Core-Stateless Fair Queueing

This paper attempts to resolve the complexity of implementing Fair Queueing inexpensively. In retrospect, "Analysis and Simulation of a Fair Queuing Algorithm" didn't actually make it clear that FQ had such issues, so this paper is a nice continuation. Stoica et al. propose that one can decentralize and push certain complexity out to the edges of the network, which in turn yields better overall performance while still being fair. (This seems to be a design pattern in systems ... not sure exactly which one it would be though.)

The authors call their system Core-Stateless Fair Queueing because per-flow state is only maintained at the edges of network "islands", while the internal (core) routers just use an augmented FIFO policy to forward packets. The augmentation has the core routers inspect packet labels added by the edge routers, which help determine which packets should be dropped in order to achieve fairness.
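A tiny sketch of the core-router decision, under the assumption that edge routers have already labeled each packet with its flow's estimated arrival rate (the fair-share estimation itself, which the paper does with exponential averaging of aggregate rates, is simply passed in here):

  import random

  def core_forward(packet_label_rate, fair_share):
      """Probabilistically drop so each flow gets at most its fair share.

      A flow sending at rate r through a link whose fair share is alpha keeps
      packets with probability min(1, alpha / r), i.e. it gets min(r, alpha).
      """
      drop_prob = max(0.0, 1.0 - fair_share / packet_label_rate)
      if random.random() < drop_prob:
          return None                                   # packet dropped
      # Relabel so downstream routers see the flow's post-drop rate.
      return min(packet_label_rate, fair_share)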

Of the entire paper, I enjoyed Section 4 (Why Are Fair Allocations Important?) the most. As many of my other posts have mentioned, the efficacy of many of these algorithms has in some way relied on cooperation! As the authors discuss, "it seems unsound to base the Internet's congestion control paradigm on the assumption of universal cooperation". YET, they go on to show how, at least for the allocation approach, the algorithms provide a strong incentive for users of the system to respond to congestion! This is a cool result to me ... of course, this was for drop-intolerant applications. To provide incentives for drop-tolerant flows it may also be necessary to use some sort of punishment scheme.

Analysis and Simulation of a Fair Queueing Algorithm

The authors hypothesize that queueing algorithms play a crucial role in effective network congestion control because: (1) flow control is only helpful for ensuring the presence of free buffer space at a receiver, and (2) adaptive routing is only helpful for pushing packets away from network congestion/bottlenecks. Queueing algorithms, they speculate, which are usually thought of as only controlling the order in which packets are sent and the usage of a gateway's buffer space, can also affect the way packets from different conversations interact with one another, which ultimately affects the collective behavior of flow control algorithms.

The authors discuss how the most common queuing algorithm, first-come-first-served (FCFS), essentially "relegates congestion control to the sources". They propose Fair Queuing (FQ) as an alternative that allocates bandwidth, promptness, and buffer space fairly. Not only does this allow fair usage of resources, it also provides lower delay to low-throughput sources (like Telnet sessions). They claim this as one of the prominent features of FQ.
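To make the mechanism concrete, here's a rough sketch of the packet-by-packet bookkeeping as I understand it: each packet is stamped with the finish time it would have under bit-by-bit round robin, and the gateway always sends the packet with the smallest stamp. The round-number update below is deliberately crude, and I've left out the delta "promptness" bonus for recently idle flows:

  import heapq

  class FairQueue:
      def __init__(self):
          self.round = 0.0        # R(t): bit-by-bit round-robin round number
          self.last_finish = {}   # per-flow finish tag of the last packet
          self.heap = []          # (finish_tag, flow, packet_len)

      def enqueue(self, flow, packet_len):
          # Finish tag: continue where this flow left off, or start "now".
          start = max(self.last_finish.get(flow, 0.0), self.round)
          finish = start + packet_len
          self.last_finish[flow] = finish
          heapq.heappush(self.heap, (finish, flow, packet_len))

      def dequeue(self):
          # Send the packet that bit-by-bit round robin would finish first.
          if not self.heap:
              return None
          finish, flow, packet_len = heapq.heappop(self.heap)
          self.round = max(self.round, finish - packet_len)  # crude R(t) update
          return flow, packet_len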

They do some interesting simulations and have some great takeaway points:
  • FQ by itself does not provide adequate congestion control; it must be combined with intelligent flow control algorithms at the sources.
  • Ill-behaved hosts get much less than their fair share because their packets get dropped, yet those dropped packets are still charged against their throughput.
  • Moreover, FQ protects well-behaved sources from their "uncouth brethren", which ultimately rewards good sources.
  • FQ provides an incentive for sources to implement Jacobson and Karels (or some other intelligent flow control), whereas FCFS makes using these "sacrificial".
I especially enjoyed their conclusion on how the game-theoretic perspective is important, in fact perhaps even necessary, to achieve their goals. (Again, this is further justification of how a well specified cooperative environment can perform incredibly well!)

This paper was an enjoyable read and I recommend it remain part of the curriculum. If I had any criticism, it would be that I would have liked to see more simulations involving more than one gateway (to see how these mechanisms work across larger networks).

I also think this is a prime source of information for anyone interested in looking at how virtual machine monitors do networking since, essentially, they are a gateway/router of sorts.

Tuesday, September 9, 2008

Congestion Avoidance and Control

The authors investigated whether "the 4.3 BSD (Berkeley UNIX) TCP was mis-behaving or if it could be tuned to work better under abysmal network conditions" after noticing a series of congestion collapses in the late 1980s. The answer came in the form of seven new algorithms added to 4BSD TCP, including round-trip-time variance estimation, exponential retransmit timer backoff, slow-start, and others.

They rationalize their algorithms, rather non-rigorously, using the conservation-of-packets principle to describe a system operating stably (i.e. in equilibrium). Their new slow-start algorithm is meant to help the system conservatively reach equilibrium, while their new round-trip-time estimation is meant to keep the system in equilibrium. For their new exponential backoff mechanism they even leverage linear system theory!
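For concreteness, here's a sketch of two of those additions: the round-trip-time mean/deviation estimator with its retransmit timeout, and the exponentially backed-off retransmit timer. The gains of 1/8 and 1/4 are the ones usually cited for this work; the initial values and the multiplier on the deviation term (the paper and later practice differ slightly) are illustrative:

  class RttEstimator:
      def __init__(self):
          self.srtt = None     # smoothed RTT estimate
          self.rttvar = 0.0    # smoothed mean deviation (cheap stand-in for
                               # a standard deviation)
          self.rto = 3.0       # seconds; illustrative initial timeout

      def on_measurement(self, m):
          if self.srtt is None:
              self.srtt, self.rttvar = m, m / 2
          else:
              err = m - self.srtt
              self.srtt += err / 8                         # gain of 1/8
              self.rttvar += (abs(err) - self.rttvar) / 4  # gain of 1/4
          # The timeout tracks the mean plus a multiple of the variation, so a
          # noisy path does not trigger spurious retransmissions.
          self.rto = self.srtt + 4 * self.rttvar

      def on_timeout(self):
          self.rto *= 2        # exponential backoff of the retransmit timer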

In addition, the authors discuss dynamic window sizing for congestion avoidance. For this they explain how to infer the feedback bit that Jain and Chiu propose (treating packet loss as the congestion signal) and effectively get the same results.

This paper feels much like the implementation of TCP for BSD ... slightly thrown together, all augmentations of an existing system meant to handle common-case situations appropriately (at least that is how I have always felt about the BSD TCP implementation). The paper is an interesting and enjoyable read, however, because it captures how novel and exciting networking was at this time and how rich the area seemed for possible optimizations. And yet, since the late 1980s, have there been any new TCP implementations that have been widely adopted? The robustness argument for Jacobson and Karels seems to be "it's good enough", which, for the most part, it is. Of course, there are a collection of people redesigning the Internet as we speak, but while operating systems have seen radical changes and additions over the years, the BSD networking stack feels like it has remained fairly consistent. (Moreover, almost every UNIX variant includes the BSD networking stack, including Mac OS X.) So, did this work have any important implications? As I write this blog I'm leveraging exactly this work!

Monday, September 8, 2008

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

The authors present an analysis of congestion avoidance (as opposed to congestion control) algorithms for computer networks. Specifically, they show that a simple additive-increase/multiplicative-decrease algorithm is necessary for convergence to an efficient and fair state (and they provide definitions of both). In addition they present, albeit very superficially, some possible nonlinear algorithms (and how they actually generalize the linear ones) as well as practical considerations for implementing these algorithms (including how sensitivity to system parameters makes nonlinear algorithms less useful in practice).
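A toy sketch of the rule they analyze: two users sharing a link react to a single binary congestion signal, increasing additively and decreasing multiplicatively (all constants here are made up for illustration):

  CAPACITY = 100.0
  ADD, MULT = 1.0, 0.5          # additive step and multiplicative factor

  def step(loads):
      congested = sum(loads) > CAPACITY           # the one-bit feedback
      if congested:
          return [x * MULT for x in loads]        # multiplicative decrease
      return [x + ADD for x in loads]             # additive increase

  loads = [5.0, 80.0]                             # very unfair starting point
  for _ in range(500):
      loads = step(loads)
  # Over many rounds the total oscillates around full utilization while the
  # two loads converge toward each other, i.e. toward the fair share.
  print(loads)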

It seems that the authors' motivation for pursuing this work was to overcome the increased queuing and congestion caused by the ever increasing heterogeneity of the Internet. In fact, without knowing more history/motivation than what is alluded to in the paper, it seems that only congestion control mechanisms had been included in the original OSI specifications, rather than any congestion avoidance mechanisms.

The authors did a fantastic job of delivering their material. However, I have two criticisms:

First, they choose to assume "information such as X_goal and the number of users sharing the resource ... to be unknown by the users" which, as they say, "restricts the set of feasible schemes". Then, while they do present certain convergence conditions assuming global information and then revise them to satisfy their assumption (i.e. no global information), the analysis fails to discuss all of the shortcomings due to this loss of information. Specifically, what I would have liked to see was an analysis that made clear, for every bit (or abstract bit) of information lost, what functionality/performance the algorithm lost as well. (They do at least mention future work on the "marginal utility of increased bits of feedback".) I'll probably repeat this in my future post on Jacobson and Karels' "Congestion Avoidance and Control", but I find the tradeoff in these "bits" of information incredibly interesting because, as the networking stack in BSD has shown, you can actually get away without receiving any explicit signals! So that makes me wonder: in the opposite extreme (lots and lots of signals), how much better could we do?!

My second criticism, one that perhaps the authors never dreamed of and yet a majority of the class will surely point out, is how these algorithms suffer in the face of malicious users. In fact, has there been recent work that re-evaluates this analysis in the face of such malevolence?

A slight diversion: when considering resource management for parallel operating systems, what are the right feedback signals, if any, to send to applications? Should the new standard of application writing be similar to the way it works in networks? That is, timeouts are to dropped packets as preemptions are to revoked CPUs (where I'm assuming a model like scheduler activations, in which an application is activated when a preemption occurs).

It seems fascinating to me that the solution for "dropped packets", i.e. to throttle down and restart, is one which suggests the necessity of cooperating with, and fairly sharing resources with, other potential users. For most "desktop" applications today, writing code that "throttles down and restarts" (or some analogue) seems incredibly unnatural. In fact, most operating systems today DO NOT provide any feedback (bit or bits) to applications, but rather keep applications completely agnostic of one another. Perhaps it was much more obvious, from the early days, that networking NEEDED to rely on cooperation to perform well. Now, as computers are becoming ever more parallel, there might be a lot of new ties with networking theory that push applications and operating systems into the cooperative realm!

This paper was a great read and provided what I thought was a good example of doing a formal analysis of a system. For that reason, I feel the paper should most definitely remain on the syllabus.

As the authors discuss, there seem to be lots of open research questions (including one I alluded to above that dealt with extra feedback bits). Without doing a literature survey or having greater depth in this area, it is hard to know how much of the future work they proposed is now work completed (or at least attempted).

This work was probably influential in many respects; perhaps the most notable influence this paper had was helping Jacobson and Karels justify ("prove" seems too strong here) their design decisions for the BSD TCP implementation.

Thursday, September 4, 2008

On Inferring Autonomous System Relationships in the Internet

The author discusses a heuristics based algorithm to determine AS relationships (which are often kept under some non-disclosure agreement for business purposes).

The author first provides necessary background information regarding Internet architecture and BGP routing policies and tables. The author then gives detailed definitions that capture certain graph phenomena (downhill versus uphill edges, etc.) before eventually discussing the algorithm for classifying transit pairs and edges (provider-to-customer, etc.). The author then shows how well the algorithm performed experimentally on data collected from the Route Views server in Oregon.
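As I understand it, the core heuristic boils down to something like the following sketch (my own reconstruction, which ignores the sibling/peering refinements and the voting across multiple routing tables): within each AS path, the highest-degree AS is taken to be the "top provider", edges before it are inferred as customer-to-provider, and edges after it as provider-to-customer.

  from collections import defaultdict

  def infer_relationships(as_paths):
      """Return a set of inferred (provider, customer) pairs."""
      # Degree = number of distinct neighbors seen for each AS.
      neighbors = defaultdict(set)
      for path in as_paths:
          for a, b in zip(path, path[1:]):
              neighbors[a].add(b)
              neighbors[b].add(a)
      degree = {asn: len(ns) for asn, ns in neighbors.items()}

      provider_of = set()   # (provider, customer) edges
      for path in as_paths:
          # The highest-degree AS on the path is assumed to be the top provider.
          top = max(range(len(path)), key=lambda k: degree[path[k]])
          for k in range(len(path) - 1):
              a, b = path[k], path[k + 1]
              if k < top:
                  provider_of.add((b, a))   # uphill: a is a customer of b
              else:
                  provider_of.add((a, b))   # downhill: b is a customer of a
      return provider_of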

This paper has a fairly substantial amount of definitions and explanations, which made it a bit overwhelming overall. Unfortunately, it seems the author felt he needed to provide all of that background himself for his audience. That being said, there is a lot of good content that will probably be helpful in getting people to communicate about these issues. My only real critique of the paper is that I had a hard time relating to the motivation of the work and how it will enable future work (probably because this isn't my area).

That being said, the paper still provides interesting information (section A, Internet Architecture, and section B, Routing Policies and BGP Routing Tables) that I found valuable for understanding certain "basic" principles. For example, even statements like "the scalability of the internet routing protocol depends on the contiguousness of IP addresses", which are obvious in retrospect, seemed like a nice and helpful reminder about the "why" of certain Internet decisions.

Other than how this work will enable future work (which, as I admitted before, I don't completely understand), I think the important ramification of this work is reinforcing how routing policies can sometimes give you results which are not, for certain customers, preferable. For example, the paper reinforced how local preference attributes override shortest-path-length attributes! This means my packet may take longer to get where I want it to go because my ISP makes more money by sending it the long way. In fact, it would be really interesting, and cool, to see if we could somehow measure this (is it a really pervasive issue in the Internet, or does the topology mean this issue rarely comes up?).

Interdomain Internet Routing

Hari Balakrishnan et al. discuss Internet routing from both a technical and an economic perspective.

He begins his discussion by explaining the simple properties of individual commercial entities, called autonomous systems (ASes), and how the economics behind their interactions work. Specifically, he discusses the distinction between peering and transit relationships between ASes and the importance of export and import policies, whose essence is of course that "no ISP wants to act as transit for packets that it isn't somehow making money on". Ultimately, routes to paying transit customers are universally exported, while routes received from providers are typically selectively exported (exported to paying customers but not to peers). Routes are imported given the "customer > peer > provider" priority order, which makes sense monetarily.

Hari continues the discussion by investigating the technical side of BGP, specifically its three design goals: scalability, policy, and cooperation under competitive circumstances. BGP is essentially an application running over TCP that behaves differently depending on whether the session is intra-AS (iBGP) or inter-AS (eBGP). Eventually, a router determines which routes to use based on a priority ordering of certain route attributes, a few internal metrics (iBGP versus eBGP), and a tie-break rule. It seems like most routes, however, are really decided by the LOCAL PREF rule, which essentially reinforces our "customer > peer > provider" ordering, i.e. we should prefer customer routes over peer and provider routes.
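That priority ordering can be written down roughly as a sort key. The sketch below is a simplification (real BGP has more steps, and MED is only compared between routes learned from the same neighboring AS), and the Route class and its field names are made up for illustration:

  from dataclasses import dataclass
  from typing import List

  @dataclass
  class Route:
      local_pref: int          # set by import policy; customer > peer > provider
      as_path: List[int]
      med: int = 0
      learned_via_ebgp: bool = True
      igp_cost: int = 0
      router_id: int = 0

  def selection_key(r: Route):
      return (
          -r.local_pref,              # 1. highest LOCAL_PREF (policy/economics)
          len(r.as_path),             # 2. shortest AS path
          r.med,                      # 3. lowest MED (simplified, see caveat)
          not r.learned_via_ebgp,     # 4. prefer eBGP-learned over iBGP-learned
          r.igp_cost,                 # 5. lowest IGP cost to the next hop
          r.router_id,                # 6. deterministic tie-break
      )

  def best_route(candidates: List[Route]) -> Route:
      return min(candidates, key=selection_key)

Note how LOCAL_PREF sits at the very top of the key: the economics of the "customer > peer > provider" ordering dominate everything else, including path length.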

It is hard to critique informative work like this ... nor do I have any glaring issues. In fact, I think this should definitely be included in the syllabus; it is an easy read and for people like me who had only a superficial understanding of BGP, it was very intriguing.

Hari does discuss a few open problems, the most interesting of which seems to be the multi-homing issue. It seems like there have to be better technical solutions than the current ones, and it might be fun to discuss this problem in more depth in class.

Tuesday, September 2, 2008

The Design Philosophy of the DARPA Internet Protocols

The paper explains some of the motivations and justifications behind the design and implementation decisions of the Internet and its well-known protocols. Perhaps the most gratifying part of the paper is the discussion of the necessity of splitting the original TCP into TCP and IP and eventually adding UDP (at least this bit of history clears up the oddity in the TCP versus UDP naming). Of course, this was gratifying because it reinforced the conclusions made in the paper "End-to-End Arguments in System Design".

In connecting these two papers further, I found the statement that "a surprising observation about the control of variation in delay is that the most serious source of delay in networks is the mechanism to provide reliable delivery" to be most poignant. That is, it highlights the correctness versus performance trade-off that I discussed in my previous post. In this case it argues that performance may actually suffer from reliable delivery (whereas before we were claiming we could/should use it for performance).

End-to-End Arguments In System Design

The paper argued that system design should not try to be overly aggressive in providing functionality that may, in the future, be seen as too restricting, unneeded, or even worse, redundant. The big takeaway message for me seems to be a recurring theme in computer science ... high-level semantics are hard to predict and reason about, but are often the key thing to exploit when you want good performance. In this case, the canonical example is a telephony service, where the high-level semantics of the application PERMIT lost packets because the clients (in this case humans) can effectively handle them by, for example, asking for something to be repeated. Clearly, knowing this can help optimize how you actually write your network service.

Another aspect of this paper that I wanted to emphasize was the authors' statement that the "reliability measures within the data communication system is seen to be an engineering tradeoff based on performance, rather than a requirement for correctness". Typically, I think we have considered layering our applications on top of lower-level protocols (like TCP) because they provide a level of correctness that we don't have to worry about. It is rather interesting to consider that these authors believe the original intention of "reliability" was actually to benefit the performance of the application rather than its correctness ... in my experience, applications get better (best?) performance when they can fine-tune everything to their needs, including their own "reliability" semantics.