It seems that the authors' motivation for pursuing this work was to overcome the increased queuing and congestion caused by the ever-increasing heterogeneity of the Internet. In fact, without knowing more history/motivation than what is alluded to in the paper, it seems that only congestion control mechanisms, rather than any congestion avoidance mechanisms, had been included in the original OSI specifications.
The authors did a fantastic job of delivering their material. However, I have two criticisms:
First, they chose to assume "information such as X_goal and the number of users sharing the resource ... to be unknown by the users" which, as they say, "restricts the set of feasible schemes". Then, while they do present certain convergence conditions assuming global information and then revise them to satisfy their assumption (i.e. no global information), it seemed the analysis failed to discuss all of the shortcomings caused by this loss of information. Specifically, what I would have liked to see was an analysis that made it clear, for every bit (or abstract bit) of information lost, what functionality/performance the algorithm lost as well. (They do at least mention future work on the "marginal utility of increased bits of feedback".) I'll probably repeat this in my future post on Jacobson and Karels' "Congestion Avoidance and Control", but I find the tradeoff in these "bits" of information incredibly interesting because, as the networking stack in BSD has shown, you can actually get away without receiving any explicit signals! So that makes me wonder: at the opposite extreme (lots and lots of signals), how much better could we do?!
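To make the "one bit of feedback" point concrete for myself, here is a toy sketch (my own illustration, not a program from the paper) of the additive-increase/multiplicative-decrease rule the authors analyze, driven only by a single binary feedback bit; the constants X_GOAL, A_I, and B_D are illustrative choices, not values from the paper.

```python
# Toy illustration (mine, not the paper's): each user sees only one feedback
# bit per round and applies additive increase / multiplicative decrease.
# X_GOAL, A_I, and B_D are illustrative constants, not values from the paper.

X_GOAL = 100.0   # efficient total load; known to the network, not to the users
A_I = 1.0        # additive increase step (a_I > 0, b_I = 1)
B_D = 0.875      # multiplicative decrease factor (a_D = 0, 0 < b_D < 1)

def update(x, overloaded):
    """One user's next demand, given only the binary feedback bit."""
    return x * B_D if overloaded else x + A_I

# Two users starting from very unequal allocations.
x = [5.0, 80.0]
for _ in range(500):
    bit = sum(x) > X_GOAL             # the same single bit is broadcast to everyone
    x = [update(xi, bit) for xi in x]

print(x)  # both demands end up oscillating near X_GOAL / 2: efficiency and fairness
```

Even this crude loop shows the division of labor: the additive steps nudge the two demands toward equality, while the multiplicative steps shrink the total back below the goal without undoing that progress.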
My second criticism, one that the authors perhaps never dreamed of and yet one that a majority of the class will point out, is how these algorithms suffer in the face of malicious users. In fact, has there been recent work that re-evaluates this analysis in the face of such malevolence?
A slight diversion: when considering resource management for parallel operating systems, what, if any, are the right feedback signals to send to applications? Should the new standard of application writing be similar to the way it works in networks? i.e. timeouts are to dropped packets as preemptions are to revoked CPUs (where I'm assuming a model like scheduler activations, in which an application is activated when a preemption occurs).
It seems fascinating to me that the solution for "dropped packets", i.e. to throttle down and restart, is one which implies the necessity of cooperating with, and fairly sharing resources among, other potential users. For most "desktop" applications today, writing code that "throttles down and restarts" (or some analogue) seems incredibly unnatural. In fact, most operating systems today DO NOT provide any feedback (bit or bits) to applications, but rather keep applications completely agnostic of one another. Perhaps it was much more obvious, from the early days, that networking NEEDED to rely on cooperation to perform well. Now, as computers become ever more parallel, there might be a lot of new ties with networking theory that push applications and operating systems into the cooperative realm!
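Purely as a thought experiment (nothing like this exists as a standard API in mainstream operating systems), here is what a "throttle down and restart" analogue for CPUs might look like, assuming hypothetical scheduler-activations-style upcalls named on_tick and on_preemption:

```python
# Hypothetical sketch: AIMD-style cooperation applied to CPUs instead of
# bandwidth. on_tick() and on_preemption() are assumed upcalls in a
# scheduler-activations-like model; no mainstream OS exposes exactly this.

class CooperativeWorkerPool:
    def __init__(self, max_workers):
        self.max_workers = max_workers
        self.workers = 1

    def on_tick(self):
        # Additive increase: no revocations lately, so ask for one more CPU.
        self.workers = min(self.workers + 1, self.max_workers)

    def on_preemption(self):
        # Multiplicative decrease: the OS revoked a CPU, so back off sharply,
        # just as a sender throttles down when a packet is dropped.
        self.workers = max(self.workers // 2, 1)
```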
This paper was a great read and provided what I thought was a good example of doing a formal analysis of a system. For that reason, I feel the paper should most definitely remain on the syllabus.
As the authors discuss, there seem to be lots of open research questions (including one I alluded to above that dealt with extra feedback bits). Without doing a literature survey or having greater depth in this area, it is hard to know how much of the future work they proposed is now work completed (or at least attempted).
This work was probably influential in many respects; perhaps the most notable influence this paper had was helping Jacobson and Karels justify (prove seems too strong here) their design decisions for the BSD TCP implementation.