How 5G is Disrupting Cloud and Network Strategy Today
One of the primary benefits envisaged for 5G networks is a massive reduction in latency (i.e. the delay users experience). This would deliver major benefits for many applications, provided that the software for those cloud-based applications is hosted near enough to users at the edge of the network. That requirement is likely to drive a massive change in the architecture of both the cloud and network industries.
Our latest report outlines likely scenarios and identifies some early moves that are starting to play out now, such as Nokia's acquisition of Alcatel-Lucent.
There is also some interesting analysis suggesting that cloud players are not all simply building bigger and bigger data centres (see the chart below, for example), which lends further support to the idea that the cloud is already becoming more local - or at least is not moving even further away.
The table of contents plus part of the introduction that outlines the importance of latency are below, and you can read more on our research portal here.
- Executive Summary
- 5G - a collection of related technologies
- The mother of all stretch targets
- Latency: the X factor [reproduced below]
- Latency: the challenge of distance [reproduced below]
- The economic value of snappier networks
- Only Half The Application Latency Comes from the Network
- Disrupt the cloud
- The cloud is the data centre
- Have the biggest data centres stopped getting bigger?
- Mobile Edge Computing: moving the servers to the people
- Conclusions and recommendations
- Regulatory and political impact: the Opportunity and the Threat
- Telco-Cloud or Multi-Cloud?
- 5G vs C-RAN
- Shaping the 5G backhaul network
- Gigabit WiFi: the bear may blow first
- Distributed systems: it's everyone's future
A big stretch, and perhaps the most controversial issue here, is the latency requirement. NGMN (the Next Generation Mobile Networks alliance) draws a clear distinction between what it calls end-to-end latency, aka the familiar round-trip time measurement from the Internet, and user-plane latency, defined thus:
"...measures the time it takes to transfer a small data packet from user terminal to the Layer 2 / Layer 3 interface of the 5G system destination node, plus the equivalent time needed to carry the response back."
That is to say, user-plane latency measures how long the 5G network itself, strictly speaking, takes to respond to user requests and to carry packets across it. NGMN points out that the two metrics are equivalent if the target server is located within the 5G network. Both are defined using small packets, so serialisation delay is negligible, and both assume zero processing delay at the target server. The targets are 10ms end-to-end, 1ms for special use cases requiring very low latency, and 50ms end-to-end for the "ultra-low cost broadband" use case. The low-latency use cases tend to be things like communication between connected cars, which will probably fall under the direct device-to-device (D2D) element of 5G, but some vendors seem to read the 1ms figure as applying to infrastructure as well as D2D. On that reading, the requirement is one for which 5G user-plane latency is the relevant metric.
This last target is arguably the biggest stretch of all, but also perhaps the most valuable.
The lower bound on any measurement of latency is very simple - it's the time it takes to physically reach the target server at the speed of light. Latency is therefore intimately connected with distance. Latency is also intimately connected with speed - protocols like TCP (Transmission Control Protocol) use it to determine how many bytes they can risk "in flight" before getting an acknowledgement, and hence how much useful throughput can be derived from a given theoretical bandwidth. Also, with faster data rates, more of the total time it takes to deliver something is taken up by latency rather than transfer.
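The window-limited throughput bound follows directly from the round-trip time: TCP can keep at most one window of data in flight per round trip. A minimal sketch in Python (the 64 KiB window and RTT figures are illustrative assumptions, not 5G targets):

```python
def max_throughput_bps(window_bytes, rtt_s):
    """TCP can have at most one window of data 'in flight' per round trip,
    so throughput is capped at window size divided by round-trip time."""
    return window_bytes * 8 / rtt_s

# A 64 KiB window over a 100 ms round trip caps out at roughly
# 5.2 Mbit/s, however fast the underlying link is; cut the RTT
# to 10 ms and the same window allows roughly ten times more.
print(max_throughput_bps(64 * 1024, 0.100) / 1e6)  # ~5.2 Mbit/s
print(max_throughput_bps(64 * 1024, 0.010) / 1e6)  # ~52 Mbit/s
```

This is why, beyond a point, adding raw bandwidth does nothing for the user: the round trip, not the link rate, is the binding constraint.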
And the way we build applications now tends to make latency, and especially the variance in latency known as jitter, more important. In order to handle the scale demanded by the global Internet, it is usually necessary to scale out by breaking up the load across many, many servers. In order to make this work, it is usually also necessary to disaggregate the application itself into numerous, specialised, and independent microservices. (We strongly recommend Mary Poppendieck's presentation at the link.)
The result of this is that a popular app or Web page might involve calls to dozens or even hundreds of different services. Google.com includes 31 HTTP (Hypertext Transfer Protocol) requests these days and Amazon.com 190. If the variation in latency is not carefully controlled, it becomes statistically more likely than not that a typical user will encounter at least one server's 99th percentile performance. (eBay tries to identify users getting slow service and serve them a deliberately cut-down version of the site - see slide 17 here.)
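Assuming the calls are independent, the chance of hitting at least one server's slow tail grows quickly with fan-out. A small sketch of the arithmetic (treating "slow" as anything beyond the 99th percentile is a simplifying assumption):

```python
def p_at_least_one_slow(n_requests, percentile=0.99):
    """Probability that at least one of n independent requests lands in
    its service's slowest 1% (i.e. beyond the 99th percentile)."""
    return 1 - percentile ** n_requests

# With the request counts quoted above:
print(p_at_least_one_slow(31))   # ~0.27 for a 31-request page
print(p_at_least_one_slow(190))  # ~0.85 for a 190-request page
```

At 190 requests, a typical page load is more likely than not to include at least one 99th-percentile response, which is why tail latency, not the average, dominates perceived performance.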
Latency: the challenge of distance
It's worth pointing out here that the 5G targets can literally be translated into kilometres. The rule of thumb for speed-of-light delay in fibre (refractive index 1.47) is 4.9 microseconds per kilometre. 1ms - 1,000 microseconds - therefore equals about 204km in a straight line, assuming no routing delay. A response has to come back too, so halve that distance. As a result, to comply with the NGMN 5G requirements, all the network functions required to process a data call must be physically located within 100km, i.e. 1ms, of the user. And if the end-to-end requirement is taken seriously, the applications or content the user wants must also be hosted within 1,000km, i.e. 10ms, of the user. (In practice, serialisation, routing, and processing at the target server will add some delay, so the real constraint is somewhat tighter still.)
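The contour arithmetic above can be sketched directly from the 4.9µs/km rule of thumb:

```python
FIBRE_DELAY_US_PER_KM = 4.9  # speed of light in fibre, refractive index ~1.47

def max_radius_km(round_trip_budget_ms):
    """Furthest a server can sit (straight-line fibre, ignoring routing and
    processing delay) while a request and its reply fit in the budget."""
    one_way_us = round_trip_budget_ms * 1000 / 2  # half the budget each way
    return one_way_us / FIBRE_DELAY_US_PER_KM

print(round(max_radius_km(1)))   # ~102 km for the 1 ms target
print(round(max_radius_km(10)))  # ~1020 km for the 10 ms target
```

These are best-case radii: real fibre routes are rarely straight lines, so the practical contours are tighter.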
To achieve this, the architecture of 5G networks will need to change quite dramatically. Centralisation suddenly looks like the enemy, and middleboxes providing video optimisation, deep packet inspection, policy enforcement, and the like will have no place. At the same time, protocol designers will have to think seriously about localising traffic - this is where the content-centric networking concept comes in. Given the number of interested parties in the subject overall, it is likely that there will be a significant period of 'horse-trading' over the detail.
It will also require nothing less than a CDN and data-centre revolution. Content, apps, or commerce hosted within this 1,000km contour will have a very substantial competitive advantage over sites that do not adapt their hosting strategy to take advantage of lower latency. Telecoms operators, by the same token, will have to radically decentralise their networks to get their systems within the 100km contour. Sites that move closer still, inside the 5ms/500km contour, will benefit further. The idea of centralising everything into shared services and global cloud platforms suddenly looks dated.
So might the enormous hyperscale data centres one day look like the IT equivalent of sprawling, gas-guzzling suburbia? And will mobile operators become a key actor in the data-centre economy?
See more on our research portal here.