Variable Speed Limits for the Internet
A key feature of Telco 2.0 analysis is our effort to understand the limits of the end-to-end principle; how stupid can a stupid network be without being, well, stupid? Networks have to be intelligent in some ways; routing, for example, requires a lot of intelligence although restricted quite tightly to one task.
So we were very interested by a recent NANOG thread regarding improvements in how the Internet deals with major congestion on backbone links. Famously, the Internet is meant to route around damage, but this only works when there is enough route diversity to absorb the diverted traffic. In a major outage, for example the one that followed an earthquake in the Luzon Strait earlier this year, the problem is often that too many people are trying to fit through the remaining links at once.
This is where the fundamental principles of internetworking bite you in the behind; most Internet protocols work on the principle that, if one attempt to do something fails, you try again. TCP achieves reliable delivery by resending segments that are not acknowledged before its retransmission timer expires, and it keeps trying until the connection finally times out. The trouble is that in a major outage, very large numbers of users' applications will all try to resend at once, generating a packet storm and creating even more congestion.
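To see why retries make things worse, here is a deliberately crude toy model, not a description of any real TCP stack. It assumes a fixed demand per tick, a link with reduced capacity, and senders that simply re-offer everything that was dropped on the next tick; real stacks back off their timers, which softens but does not remove the effect. The function name and the numbers are our own illustration.

```python
def offered_load(demand, capacity, ticks):
    """Toy model: traffic that does not fit is retried on the next tick.

    demand   -- new packets offered per tick (assumed constant)
    capacity -- packets the surviving links can carry per tick
    Returns the total load offered in each tick (new traffic plus retries).
    """
    backlog = 0
    history = []
    for _ in range(ticks):
        offered = demand + backlog          # fresh traffic plus everything being resent
        delivered = min(offered, capacity)  # the link carries what it can
        backlog = offered - delivered       # the rest is dropped and will be retried
        history.append(offered)
    return history


# 100 packets per tick of demand squeezed through a link that now carries 60.
print(offered_load(100, 60, 4))  # [100, 140, 180, 220]
```

The load offered to the bottleneck grows every tick even though nobody is asking for more than before, which is the packet storm in miniature.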
So wouldn't it be nice if you could tell everyone to slow down?
This was roughly what the NANOG community was discussing. An "Internet busy signal" could be used by applications as a signal to slow down; restrict the creation of new TCP sessions, or the bandwidth of media applications. Propagating it across the Net could mean that rerouting would happen faster, and further away from the problem, thus improving the efficiency of the routing system. It would be a little like the variable speed limit the UK Department for Transport introduced on heavily used stretches of motorway; there, the speed limit is reduced some distance away from a problem in order to limit the rate at which cars reach the bottleneck to the maximum flow remaining.
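As a rough sketch of how an application might react, assume a purely hypothetical `BusySignal` message that carries the capacity left on the congested path and the number of senders sharing it. No such message exists in any standard, and the field names are ours. The sender then does what the motorway signs do: it holds its own rate to a fair share of the remaining flow.

```python
from dataclasses import dataclass


@dataclass
class BusySignal:
    """Hypothetical 'slow down' advisory; the fields are assumptions for illustration."""
    remaining_capacity_mbps: float  # capacity left on the congested path
    affected_senders: int           # senders currently sharing that path


def advised_rate_mbps(signal, current_rate_mbps):
    """Return the rate a well-behaved sender should use after seeing the signal."""
    # Equal share of whatever capacity is left; guard against a zero sender count.
    fair_share = signal.remaining_capacity_mbps / max(signal.affected_senders, 1)
    # Never speed up in response to a slow-down signal.
    return min(current_rate_mbps, fair_share)


signal = BusySignal(remaining_capacity_mbps=100.0, affected_senders=50)
print(advised_rate_mbps(signal, 10.0))  # 2.0
print(advised_rate_mbps(signal, 1.0))   # 1.0
```

An equal split is only one possible policy; the point is that the limit is set by what the bottleneck can still carry, not by what each sender would like.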
However, there are some important differences; you can rely on the variable speed limit signs, because nobody is going around putting up their own unauthorised ones to slow down traffic in front of their billboards. And there's an enforcement mechanism; the police can fine you for breaking the limit.
On the Internet, though, any network could send the slow down signal to any party connected to it, which would hand it a great deal of power. Hackers could release a flood of slow down packets with forged source addresses into a major Internet exchange point, causing a massive denial-of-service attack. A prerequisite of any such system would be the full, worldwide adoption of BCP38, which recommends that networks filter traffic entering from their customers and drop packets whose source addresses could not legitimately originate there. There are also serious issues of Internet freedom; a network acting in bad faith could inject the messages into its own network, pretending that they came from the source of content it wanted to suppress for commercial or political reasons.
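The source check itself is simple in principle. The sketch below shows the idea using Python's standard `ipaddress` module: a packet arriving on a customer port is accepted only if its source address falls inside the prefixes assigned to that port. The port names and the prefix table are made up for illustration (the addresses are documentation ranges), and a real router does this in its forwarding plane rather than in a script.

```python
import ipaddress

# Hypothetical table: which prefixes each customer-facing port may send from.
CUSTOMER_PREFIXES = {
    "port-1": [ipaddress.ip_network("192.0.2.0/24")],
    "port-2": [ipaddress.ip_network("198.51.100.0/24")],
}


def source_is_plausible(port, source_ip):
    """Return True if source_ip could legitimately originate behind this port."""
    addr = ipaddress.ip_address(source_ip)
    # Unknown ports have no assigned prefixes, so everything from them is rejected.
    return any(addr in net for net in CUSTOMER_PREFIXES.get(port, []))


print(source_is_plausible("port-1", "192.0.2.17"))    # True
print(source_is_plausible("port-1", "198.51.100.5"))  # False: forged source, drop it
```

A slow down message that fails this check at the edge never reaches the exchange point, which is why the filtering has to be universal to be of much use.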
This is only part of the problem. More seriously, if the "Slow Down" message were widely respected, Internet performance would be significantly better during disruptions. Therefore, someone who didn't respect the message might be able to get significantly better throughput than someone who did, entirely because of the responsible citizens' sacrifice. Indeed, cheating would be a rational act for the individual, but an irrational one for the community; the classic tragedy of the commons. The more cheats, the less useful it would be to respect the message. Therefore, eventually the scheme would break down completely, and more quickly if it were abused in the manner described above.
The good news is that cheats would be easily identifiable; the user sprouting 17 TCP connections and a gaggle of p2p streams whilst everyone else is maintaining a steady 30mph stands out to ISP-level network monitoring tools. Therefore, this abuse of privileges could be made chargeable - or alternatively, those users who behave themselves could be offered a rebate from their bill, conditional on their good behaviour. So, we can derive some rules for a successful scheme:
1. Don't be evil (now who said that?)
Only ever use it for its real purpose; user trust would collapse otherwise.
2. Vertrauen gut, Kontrolle besser
The German proverb means that trust is good, but it's better to check. All slowdown messages must be source-filtered.
3. Build good practice in
In developing the system, we should design evil uses out; for example, use a metric such as a maximum number of sessions per user, rather than traffic classification (which the cheats will evade anyway, for example by encrypting everything and working over port 443 with the HTTPS traffic).
4. Incentives matter
Irresponsibility should have a price; apply a tax.
And you don't need an IMS for that!
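Rules 3 and 4 fit in a few lines. The following is a minimal sketch, assuming the ISP's monitoring can already produce a list of active flows per user while the slow down signal is in force. The cap of four sessions and the surcharge per extra session are invented numbers, and the function names are ours.

```python
from collections import Counter

MAX_SESSIONS = 4            # hypothetical per-user cap while the signal is active
SURCHARGE_PER_EXTRA = 0.50  # hypothetical charge for each session above the cap


def sessions_per_user(flows):
    """Count active sessions per user; flows is an iterable of (user_id, flow_id),
    with each active flow listed once."""
    return Counter(user for user, _flow in flows)


def surcharges(flows):
    """Return the charge owed by each user who exceeded the session cap."""
    return {
        user: (count - MAX_SESSIONS) * SURCHARGE_PER_EXTRA
        for user, count in sessions_per_user(flows).items()
        if count > MAX_SESSIONS
    }


# One user sprouting 17 connections, another staying within the limit.
flows = [("alice", i) for i in range(17)] + [("bob", i) for i in range(3)]
print(surcharges(flows))  # {'alice': 6.5}
```

Counting sessions needs no inspection of what the traffic is, so encrypting everything over port 443 does nothing to hide it. Turning the surcharge into a rebate for those under the cap is the same sum with the sign reversed.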