Earlier posts talked about the softwarization of the network in fairly general terms, but the idea got rolling ten years ago with the introduction of Software Defined Networks (SDN).
The fundamental idea of SDN is to decouple the network control plane (i.e., where routing algorithms like RIP, OSPF, and BGP run) from the network data plane (i.e., where packet forwarding decisions get made), with the former moved into software running on commodity servers, and the latter implemented by white-box switches like the ones described in Section 3.4 of the book. The original enabling idea of SDN was to define a standard interface between the control plane and the data plane so that any implementation of the control plane could talk to any implementation of the data plane; this breaks the dependency on any one vendor’s bundled solution. The original interface is called OpenFlow, and this idea of decoupling the control and data planes came to be known as disaggregation.
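To make the separation concrete, here is a minimal sketch in Python. It is not the OpenFlow API: the `Switch` and `Controller` classes, the `install_rule` method, and the exact-match destination table are all hypothetical names chosen for illustration. The point is only that the switch forwards using local state, while a separate program decides what that state should be and pushes it through one narrow interface.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Switch:
    """Data plane (hypothetical): forwards using only its local flow table."""
    name: str
    # Maps a destination address to an output port (simplified exact match).
    flow_table: Dict[str, int] = field(default_factory=dict)

    def install_rule(self, dst: str, out_port: int) -> None:
        # Stand-in for the standard control/data plane interface.
        self.flow_table[dst] = out_port

    def forward(self, dst: str) -> Optional[int]:
        # Returns the output port, or None when no rule matches.
        return self.flow_table.get(dst)


class Controller:
    """Control plane (hypothetical): runs off-switch and programs switches."""

    def __init__(self) -> None:
        self.switches: Dict[str, Switch] = {}

    def register(self, switch: Switch) -> None:
        self.switches[switch.name] = switch

    def program(self, switch_name: str, dst: str, out_port: int) -> None:
        # The controller never touches flow_table directly.
        self.switches[switch_name].install_rule(dst, out_port)
```

Because the controller only ever calls `install_rule`, any switch implementation that honors that one method could be swapped in, which is the vendor independence the standard interface is meant to provide.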
OpenFlow was a great first step, but a decade of experience has revealed that it is not sufficient as the interface for controlling the data plane. This is for the same reason any API layered on top of hardware falls short: it does not expose the full range of features that switch vendors put into their hardware. To address this shortcoming, the SDN community is now working on a language-based approach to specifying how the control and data planes interact. The language is called P4, and it provides a richer model of the switch's packet forwarding pipeline.
Another important aspect of disaggregation is that a logically centralized control plane can be used to control a distributed network data plane. We say logically centralized because while the state collected by the control plane is maintained in a global data structure (e.g., a Network Map), the implementation of this data structure could still be distributed over multiple servers (i.e., it could run in a cloud). This is important for both scalability and availability, where the two planes are configured and scaled independent of each other. This idea took off quickly in the cloud, with today’s cloud providers running SDN-based solutions both within their datacenters and across the backbone networks that interconnect their datacenters.
A consequence of this design that isn’t immediately obvious is that a logically centralized control plane doesn’t just manage a network of physical (hardware) switches that interconnects physical servers, but it also manages a network of virtual (software) switches that interconnect virtual servers (e.g., Virtual Machines and containers). If you’re counting “switch ports” (a good measure of all the devices connected to your network) then the number of virtual ports in the Internet shot past the number of physical ports in 2012.
One of the other key enablers for SDN’s success, as depicted in the Figure, is the Network Operating System (NOS). Like a server operating system (e.g., Linux, iOS, Android, Windows) that provides a set of high-level abstractions that make it easier to implement applications (e.g., you can read and write files instead of directly accessing disk drives), a NOS makes it easier to implement network control functionality, otherwise known as Control Apps. A good NOS abstracts the details of the network switches and provides a “network map” abstraction to the application developer. The NOS detects changes in the underlying network (e.g., switches, ports, and links going up and down) and the control application simply implements the behavior it wants on this abstract graph. What that means is that the NOS takes on the burden of collecting network state (the hard part of distributed algorithms like Link-State and Distance-Vector algorithms) and the control app is free to simply implement the shortest path algorithm and load the computed forwarding rules into the underlying switches. By centralizing this logic, SDN is able to produce a globally optimized solution. The published evidence confirms this advantage (e.g., Google's private wide-area network B4).
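As a rough illustration of what such a control app might look like, the sketch below assumes the NOS has already assembled the network map and hands it over as a plain dictionary of link costs. The names `NetworkMap`, `next_hops`, `compute_forwarding_rules`, and `EXAMPLE_MAP` are hypothetical, and the returned dictionary stands in for the rules a real NOS would load into the switches. The app itself is just Dijkstra's shortest-path algorithm run once per switch.

```python
import heapq
from typing import Dict, Optional, Set, Tuple

# Hypothetical network map: switch -> {neighbor: link cost}.
NetworkMap = Dict[str, Dict[str, int]]


def next_hops(network_map: NetworkMap, source: str) -> Dict[str, str]:
    """Run Dijkstra from source; return destination -> first-hop neighbor."""
    dist: Dict[str, int] = {source: 0}
    first_hop: Dict[str, str] = {}
    visited: Set[str] = set()
    # Heap entries are (path cost, node, first hop used to reach node).
    heap: list[Tuple[int, str, Optional[str]]] = [(0, source, None)]
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in visited:
            continue  # stale entry; a cheaper path was already settled
        visited.add(node)
        if hop is not None:
            first_hop[node] = hop
        for nbr, weight in network_map[node].items():
            new_cost = cost + weight
            if nbr not in visited and new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                # Leaving the source, the first hop is the neighbor itself.
                heapq.heappush(heap, (new_cost, nbr, nbr if node == source else hop))
    return first_hop


def compute_forwarding_rules(network_map: NetworkMap) -> Dict[str, Dict[str, str]]:
    """Per-switch forwarding rules: switch -> {destination: next hop}."""
    return {switch: next_hops(network_map, switch) for switch in network_map}


# Three switches; the direct A-C link is more expensive than going via B.
EXAMPLE_MAP: NetworkMap = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
```

On `EXAMPLE_MAP`, `compute_forwarding_rules` sends traffic from A to C through B, because the two-hop path costs 2 while the direct link costs 5. Note that the app contains no code for discovering neighbors or flooding link state; that work is assumed to have been done by the NOS.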
As much of an advantage as the cloud providers have been able to get out of SDN, its adoption in enterprises and Telcos has been much, much slower. This is partly about the ability of different markets to manage their networks. The Googles, Microsofts, and Amazons of the world have the engineers and DevOps skills needed to take advantage of this technology, whereas others still prefer pre-packaged and integrated solutions that support the management and command line interfaces they are familiar with. As is often the case, business culture changes more slowly than technology.
It is important to recognize the various perspectives on computer networks (e.g., that of network architects, application developers, end users, and network operators) to understand the technical requirements that shape how networks are designed and built. But this presumes all design decisions are purely technical, which is certainly not the case. Many other factors, from economic forces, to government policy, to societal influences, to ethical considerations, influence how networks are designed and built.
Of these, the marketplace is often the most influential, and corresponds to the interplay between network operators that sell access and connectivity (e.g., AT&T, Comcast, Verizon, DT, NTT, China Mobile), network equipment vendors that sell hardware to network operators (e.g., Cisco, Juniper, Ericsson, Nokia, Huawei, NEC), cloud providers that host content and scalable applications in their datacenters (e.g., Google, Amazon, Microsoft), service providers that deliver content and cloud apps to end-users (e.g., Facebook, Apple, Netflix, Spotify), and of course, subscribers and customers that download content and run cloud applications (i.e., individuals, but also enterprises and businesses). Not surprisingly, the lines between all these players are not crisp, with many companies playing multiple roles. For example, service providers like Facebook run their own clouds and network operators like Comcast and AT&T own their own content.
The most notable examples of this crossover are the large cloud providers, who (a) build their own networking equipment, (b) deploy and operate their own networks, and (c) provide end-user services and applications on top of their networks. This is notable because it challenges the implicit assumptions of the simple "textbook" version of the technical design process. One such assumption is that designing a network is a one-time activity. Build it once and use it forever (modulo hardware upgrades so users can enjoy the benefits of the latest performance improvements). A second is that the job of designing and implementing the network is completely divorced from the job of operating the network. Neither of these assumptions is quite right.
On the first point, the network’s design is clearly evolving. The only question is how fast. Historically, the feature upgrade cycle involved an interaction between network operators and their vendor partners (often collaborating through the standardization process), with timelines measured in years. But anyone who has downloaded and used the latest cloud app knows how glacially slow anything measured in years is by today's standards.
On the second point, the companies that build networks are almost always the same ones that operate them. The only question is whether they develop their own features or outsource that process to their vendors. If we once again look to the cloud for inspiration, we see that develop-and-operate isn’t just true at the corporate level, but it is also how the fastest moving cloud companies organize their engineering teams: around the DevOps model. (If you are unfamiliar with DevOps, we recommend you read "Site Reliability Engineering: How Google Runs Production Systems" to see how Google practices it.)
What this all means is that computer networks are now in the midst of a major transformation, due largely to market pressure being applied by agile cloud providers. Network operators are trying to simultaneously accelerate the pace of innovation (sometimes known as feature velocity) and yet continue to offer a reliable service (preserve stability). And they are increasingly doing this by adopting the best practices of cloud providers, which can be summarized as having two major themes: (1) take advantage of commodity hardware and move all intelligence into software, and (2) adopt agile engineering processes that break down barriers between development and operations.
This transformation is sometimes called the “cloudification” or “softwarization” of the network, but by another name, it’s known as Software Defined Networks (SDN). Whatever you call it, this new perspective will (eventually) be a game changer, not so much in terms of how we address the fundamental technical challenges of framing, routing, fragmentation/reassembly, packet scheduling, congestion control, security, and so on, but in terms of how rapidly the network evolves to support new features and to accommodate the latest advances in technology.
This general theme is important and we plan to return to it in future posts. Understanding networks is partly about understanding the technical underpinnings, but also partly about how market forces (and other factors) drive change. That you are able to make informed design decisions about technical approach A versus technical approach B is a necessary first step, but that you are able to deploy that solution and bring it to market more rapidly and for less cost than the competition is just as important, if not more so.
Having not cracked open Computer Networks: A Systems Approach for several years, the thing that most struck me as I started to update the material is how much of the Internet has its origins in the research community. Everyone knows that the ARPANET and later TCP/IP came out of DARPA-funded university research, but even as the Web burst onto the scene in the 1990s, it was still the research community that led the way in the Internet's coming-of-age. There's a direct line connecting papers published on congestion control, quality-of-service, multicast, real-time multimedia, security protocols, overlay networks, content distribution, and network telemetry to today's practice. And in many cases, the technology has become so routine (think Skype, Netflix, Spotify), that it's easy to forget the history of how we got to where we are today. This makes updating the textbook feel strangely like writing an historical record.
From the perspective of writing a relevant textbook (or just making sense of the Internet), certainly it's important to understand the historical context. It is even more important to appreciate the thought process of designing systems and solving problems, for which the Internet is clearly the best use case to study. But there are some interesting challenges in providing perspective on the Internet to a generation that has never known a world without the Internet.
One is how to factor commercial reality into the discussion. Take video conferencing as an example. Once there was a single experimental prototype (vic/vat) used to gain experience and drive progress. Today there are Skype, GoToMeeting, WebEx, Google Hangouts, Zoom, UberConference, and many other commercial services. It's important to connect the dots between these familiar services and the underlying network capabilities and design principles. For example, while today's video conferencing services leverage the foundational work on both multicast and real-time protocols, they are closed-source systems implemented on top of the network, at the application level. They are able to do this by taking advantage of widely distributed points-of-presence made possible by the cloud. Teasing apart the roles of cloud providers, cloud services, and network operators is key to understanding how and where innovation happens today.
A second is to identify open platforms and specifications that serve as good exemplars for the core ideas. Open source has become an important part of today's Internet ecosystem, surpassing the role of the IETF and other standards bodies. In the video conferencing realm, for example, projects like Jitsi, WebRTC, and Opus are important examples of the state-of-the-art. But one look at the projects list on the Apache Foundation or Linux Foundation web sites makes it clear that separating the signal from the noise is no trivial matter. Knowing how to navigate this unbelievably rich ecosystem is the new challenge.
A third is to anticipate which cutting-edge activity happening today is going to be routine tomorrow. On this point, the answer seems obvious. It will be how network providers improve feature velocity through the softwarization and virtualization of the network. By another name, this is Software Defined Networking (SDN), but more broadly, this represents a shift from building the network using closed/proprietary appliances to using open software platforms running on commodity hardware. This shift is both pervasive and transformative. It impacts everything from high-performance switch design, to architecting access networks (5G, Fiber-to-the-Home), to how network operators deal with lifecycle management, to the blurring of the line between the Internet and the Cloud. Recognizing that this transformation is underway is essential to understanding where the Internet is headed next.