Earlier this month I wrote a blog post recounting the history of OpenFlow at ONF. It got me thinking about how to have impact, and potentially change the course of technology. The OpenFlow experience makes for an interesting case study.
For starters, the essential idea of OpenFlow was to codify what at the time was a hidden, but critical interface in the network software stack: the interface that the network control plane uses to install entries in the FIB that implements the switch data plane. The over-the-wire details of OpenFlow don’t matter in the least (although there were plenty of arguments about them at the time). The important contributions were (a) to recognize that the control/data-plane interface is pivotal, and (b) to propose a simple abstraction (match/action flow rules) for that interface.
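To make the match/action abstraction concrete, here is a minimal sketch of a flow table in Python. It is an illustration of the idea only, not the OpenFlow specification: the field names (`ip_dst`, `tcp_dst`), the action strings, and the `FlowRule`/`FlowTable` classes are all hypothetical, and a real switch would implement the lookup in hardware.

```python
from dataclasses import dataclass


@dataclass
class FlowRule:
    match: dict        # header field -> required value; fields left out are wildcards
    action: str        # hypothetical action string, e.g. "output:2" or "drop"
    priority: int = 0  # higher priority rules are consulted first


class FlowTable:
    """Toy stand-in for the forwarding table that the control plane programs."""

    def __init__(self):
        self.rules = []

    def install(self, rule):
        # Control-plane side: add a rule and keep the table sorted by priority.
        self.rules.append(rule)
        self.rules.sort(key=lambda r: r.priority, reverse=True)

    def lookup(self, packet):
        # Data-plane side: return the action of the first rule whose match
        # fields all agree with the packet's header fields.
        for rule in self.rules:
            if all(packet.get(k) == v for k, v in rule.match.items()):
                return rule.action
        return "send_to_controller"  # table miss: ask the control plane


table = FlowTable()
table.install(FlowRule({"ip_dst": "10.0.0.2"}, "output:2", priority=10))
table.install(FlowRule({}, "drop", priority=0))  # empty match = catch-all
print(table.lookup({"ip_dst": "10.0.0.2", "tcp_dst": 80}))
print(table.lookup({"ip_dst": "10.0.0.9"}))
```

The point of the sketch is the split: `install` is the only thing the control plane needs to call, and `lookup` is the only thing the data plane needs to do, which is why pinning down that one interface mattered more than the wire format.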
In retrospect, one of the secrets of OpenFlow’s success was its seemingly innocuous origins. The original paper, published in ACM SIGCOMM’s CCR (2008), was a Call-to-Action for the network research community, proposing OpenFlow as an experimental open interface between the network’s control and data planes. The goal was to enable innovation, which at the time included the radical idea that anyone—even researchers—should be able to introduce new features into the network control plane. Care was also taken to explain how such a feature could be deployed into the network without impacting production traffic, mitigating the risks such a brazen idea could inflict on the network.
It was a small opening, but a broad range of organizations jumped into it. A handful of vendors added an “OpenFlow option” to their routers; the National Science Foundation (NSF) funded experimental deployments on university campuses; and Internet2 added an optional OpenFlow substrate to their backbone. ONF was formed to provide a home for the OpenFlow community and ON.Lab started releasing open source platforms based on OpenFlow. With these initiatives, the SDN transformation was set in motion.
Commercial adoption of SDN was certainly an accelerant, with VMware acquiring the startup Nicira and cloud providers like Google and Microsoft talking publicly about their SDN-based infrastructures (all in 2012), but this was a transformation that got its start in the academic research community. Over time some of the commercial successes have adapted SDN principles to other purposes—e.g., VMware’s NSX supports network virtualization through programmatic configuration, without touching the control/data plane interface of physical networking equipment—but the value of disaggregating the network control/data planes and logically centralizing control decisions proved long lasting, with OpenFlow and its SDN successors running in datacenter switching fabrics and Telco access networks today.
The original proposal did not anticipate where defining a new API would take the industry, but the cascading of incremental impact is impressive (and perhaps the most important takeaway from this experience). Originally, OpenFlow was conceived as a way to innovate in the control plane. Over time, that flexibility put pressure on chip vendors to also make the data plane programmable, with the P4 programming language (and a toolchain to auto-generate the control/data plane interface) now becoming the centerpiece of the SDN software stack. It also put pressure on switch and router vendors to make the configuration interface programmable, with gNMI and gNOI now replacing (or at least supplementing) the traditional router CLI.
OpenFlow was also originally targeted at L2/L3 switches, but the idea is now being applied to the cellular network. This is putting pressure on the RAN vendors to open up and disaggregate their base stations. The 5G network will soon have a centralized SDN controller (renamed RAN Intelligent Controller), hosting a set of control applications (renamed xApps), using a 3GPP-defined interface (in lieu of OpenFlow) to control a distributed network of Radio Units. SD-RAN is happening, and has the potential to be a killer app for SDN.
One of the more interesting aspects of all of this is what happened to OpenFlow itself. The specification iterated through multiple versions, each enriching the expressiveness of the interface, but also introducing vendor-championed optimizations. This led to data plane dependencies, an inherent risk in defining what is essentially a hardware abstraction layer on top of a diverse hardware ecosystem. P4 is a partial answer to that. By coding the data plane’s behavior in a P4 program (whether that program is compiled into an executable image that can be loaded into the switching chip or merely descriptive of a fixed-function chip’s behavior) it is possible to auto-generate the control/data plane interface (known as P4Runtime) in software, instead of depending on a specification that evolves at the pace of standardization. (This transition to P4 as a more effective embodiment of the control/data plane interface is covered in our SDN book.)
It is now the case that the network—including the control/data plane interface—can be implemented, from top to bottom, entirely in software. OpenFlow served its purpose bootstrapping SDN, but even the Open Networking Foundation is shifting its focus from OpenFlow to P4-based SDN in its new flagship Aether project. Marc Andreessen's famous maxim that "software is eating the world" is finally coming true for the network itself!
A well-placed and smartly-defined interface is a powerful catalyst for innovation. OpenFlow has had that effect inside the network, with the potential to replicate the success of the Socket API at the edge of the network. Sockets defined the demarcation point between applications running on top of the Internet and the details of how the Internet is implemented, kickstarting a multi-billion dollar industry writing Internet (now cloud) applications. Time will tell how the market for in-network functionality evolves, but re-architecting the network as a programmable platform (rather than treating it as plumbing) is an important step towards improving feature velocity and fostering the next generation of network innovation.
The Accidental SmartNIC
The moment when the current generation of SmartNICs really captured my attention was during a demo at VMworld 2019. At the time, ESXi was formally supported on x86 processors only, but there had been a skunkworks project to run ESXi on ARM for several years. Since most SmartNICs have an ARM processor, it was now possible to run ESXi on a SmartNIC. I do remember thinking “just because you can do something doesn’t mean you should” but it made for a fun demo.
This certainly wasn’t my first exposure to SmartNICs. As a member of the networking team at VMware, I was periodically visited by SmartNIC vendors who wanted to offer their hardware as a way to improve the performance of virtual switching. And AWS had been subtly incorporating them into their EC2 infrastructure since about 2014 via the “Nitro” System. But as I looked more closely at SmartNIC architectures, I realized that I had actually been involved in an earlier incarnation of the technology in the 1990s–not that we called them SmartNICs then. Even the term NIC was not yet standard terminology. Below is a slightly prettified diagram from a paper I published in SIGCOMM in 1991.
If you compare this to the block diagram of a current generation SmartNIC (e.g., here), you will see some pretty remarkable similarities. Of course you need to connect to a host bus on one side; that’s likely to be PCIe today. (That choice was much less obvious in 1990.) And you need the necessary physical and link layer hardware to connect to your network of choice; today that’s invariably some flavor of Ethernet, whereas in 1990 it still seemed possible that ATM would take off as a local area network technology (it didn’t). In between the host and the network, there’s one or more CPUs, and some programmable hardware (FPGA). It’s the programmability of the system, delivered by the CPU and FPGA, that makes it “Smart”.
To be clear, I definitely didn’t invent the SmartNIC. The earliest example that I can find was described by Kanakia and Cheriton in 1988. Other researchers around this time took a similar approach. There was a reason we gravitated towards designs that were relatively expensive but highly programmable: we didn’t yet know which functions belonged on the NIC. So we kept our options open. This gave us the ability to move functions between the host and the NIC, to experiment with new protocols, and to explore new ways of delivering data efficiently to applications. This was essentially my introduction to the systems approach to networking: building a system to experiment with various ways of partitioning functionality among components, and seeking an approach that would address end-to-end concerns such as reliability and performance. I was fortunate to be influenced in the design of my “SmartNIC” by David Clark, the “architect of the Internet” and co-author of the end-to-end argument, and this work also led to my collaboration with Larry Peterson.
The 1990s, in retrospect, was a time when a lot of questions about networking were still up for debate. As we tried to achieve the then-crazy goal of delivering a gigabit per second to a single application, there was a widespread concern that TCP/IP would not be up to the task. Perhaps we needed completely new transport protocols, or a new network layer (e.g., ATM). Perhaps transport protocols were so performance-intensive that they needed to be offloaded to the NIC. With so many open questions, it made sense to design NICs with maximum flexibility. Hence the inclusion of a pair of CPUs and some of the largest FPGAs available at the time.
By the 2000s, many of these networking questions were addressed by the overwhelming success of the Internet. TCP/IP (with Ethernet as the link layer) became the dominant networking protocol stack. There turned out to be no problem getting these protocols from the 1970s to operate at tens of gigabits per second. Moore’s law helped, as did the rise of switched Ethernet and advances in optical transmission. As the protocols stabilised, there wasn’t so much need for flexibility in the NIC, and hence fixed-function NICs became the norm.
Jump ahead another ten years, however, and fixed-function NICs became a liability as new approaches to networking emerged. By 2010 NICs frequently included some amount of “TCP offload”, echoing one of the concerns raised in the 1990s. These offloads left hosts free to transfer large chunks of data to or from the NIC while the NIC added the TCP headers to segments on transmit and parsed them on receipt. This was a performance win, unless you wanted anything other than a simple TCP/IP header on your packets, such as an extra encapsulation header to support network virtualization. The optimization of performance for the common case turned into a huge handicap for innovative approaches that couldn’t leverage that optimization. (My colleagues at Nicira found some creative solutions to this problem, ultimately leading to the GENEVE encapsulation standard).
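A rough sketch of the offload described above, and of why it broke down, might look like the following. This is a deliberately simplified model and not any real NIC's behavior: the `segment` and `nic_can_offload` functions, the 1460-byte segment size, and the header-stack lists are assumptions chosen for illustration.

```python
def segment(payload: bytes, mss: int, start_seq: int):
    """Split one large host buffer into (sequence number, chunk) pairs,
    the way a NIC doing TCP segmentation on transmit would."""
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        seq = (start_seq + offset) % 2**32  # TCP sequence numbers wrap at 32 bits
        segments.append((seq, chunk))
    return segments


def nic_can_offload(header_stack):
    """Hypothetical fixed-function NIC: it only recognizes plain TCP/IP
    over Ethernet and gives up on anything else."""
    return header_stack == ["eth", "ip", "tcp"]


# The host hands over 3000 bytes at once; the NIC emits three segments.
for seq, chunk in segment(b"x" * 3000, mss=1460, start_seq=1000):
    print(seq, len(chunk))

# A plain packet gets the optimization; an encapsulated one does not.
print(nic_can_offload(["eth", "ip", "tcp"]))
print(nic_can_offload(["eth", "ip", "udp", "geneve", "eth", "ip", "tcp"]))
```

The segmentation logic itself is trivial; the handicap came from the hard-wired assumption about what the headers look like, which is exactly the kind of assumption a programmable NIC can revisit.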
As networking became more dynamic with the rise of SDN and network virtualization (and the parallel rise of software-defined storage) it started to become clear that once again the functions of a NIC could not be neatly tied down and committed to fixed-function hardware. And so the pendulum swung back to where it had been in the 1990s, where the demand for flexibility warranted NIC designs that could be updated at software speeds, leading to what we might call the second era of SmartNICs. This time, it’s the need to efficiently support network virtualization, security features, and flexible approaches to storage that demands highly capable NICs. While all these functions can be supported on x86 servers, it’s increasingly cost-effective to move them onto a SmartNIC that is optimized for those tasks and still flexible enough to support rapid innovation in cloud services. This is why you see projects like AWS Nitro, Azure Accelerated Networking, and VMware’s Project Monterey all moving functions that you expect to see in a hypervisor to the new generation of SmartNICs.
Why did I title this post “The Accidental SmartNIC”? Because I wasn’t trying to make a SmartNIC; there was just so much uncertainty about the right way to partition our system that I needed a high degree of flexibility in my design. (It’s also a nod to the excellent film “The Accidental Tourist”.) Determining how best to distribute functionality across components is a core aspect of the systems approach. Today’s SmartNICs exemplify that approach by allowing complex functions to be moved from servers to NICs, meeting the goals of high performance, rapid innovation, and cost-effective use of resources. Building a platform that supports innovation is a common goal in systems research and we see that playing out today as SmartNICs take off in the cloud.
Defining "A Systems Approach"
Last week we noticed that our book, Computer Networks: A Systems Approach, was discussed in a thread on Hacker News. It was nice to see mostly positive commentary, but we also noticed a fairly involved debate about the meaning of “Systems Approach”. Some readers had a pretty good idea of what we meant; others mistakenly took it for a reference to “Systems and Cybernetics”, which we definitely never intended. Still others interpreted it as an empty, throw-away term. “Don’t these people read prefaces?” we thought, before remembering that we dropped the definition from the latest edition, thinking it was old news. Clearly, we had been making some assumptions that left many of our readers in the dark. Rather than just rescuing the old preface from the recycling bin, we thought it would be timely to revisit the meaning of “Systems Approach” as we’re now building a whole series of books around that theme.
The term “Systems” is used commonly by computer science researchers and practitioners who study the issues that arise when building complex computing systems such as operating systems, networks, distributed applications, and so on. At MIT, for example, there is a famous class 6.033: Computer System Design (with an excellent accompanying book) that is a typical introduction to the systems field. The required reading list is a tour through some of the most influential systems papers. The key to the systems approach is a "big picture" view – you need to look at how the components of a system interact with each other to achieve an overall result, rather than fixating on a single component (either unnecessarily optimizing it or trying to solve too many problems in that one component). This is one of the important takeaways of the End-to-End Argument, a landmark paper for system design.
A systems approach also has a strong focus on real-world implementation, with the Internet being the obvious example of a widely-deployed, complex networking system. This seems incredible now, but when we wrote our first edition in 1995, it was not yet obvious that the Internet would be the most successful networking technology of all time, and organising our book around the principles that underlie the design and implementation of the Internet was a novel idea.
The Systems Approach is a methodology for designing, implementing, and describing computer systems. It involves a specific set of steps:
In following this methodology, there are requirements that come up again and again. Scalability is an obvious example, and appears as a key design principle throughout networking, e.g., in the partitioning of networks into subnets, areas, and autonomous systems to scale the routing system. A good example of cross-disciplinary systems thinking is the importing of techniques developed to scale distributed systems such as Hadoop to solve scaling challenges in software-defined networking.
Generality is another common requirement: the way that the Internet was designed to be completely agnostic to the applications running over it and the class of devices connected to it distinguishes it from networks like the phone network and the cable TV network, whose functionality has now been largely subsumed by the Internet.
And there are a set of system-agnostic design principles that are used extensively to guide systems designers. They are not mathematically rigorous (compared to, say, Maxwell’s Equations or the Shannon-Hartley theorem) but are considered best practices:
To see how the Systems Approach applies to networking, and to our books, notice that we start every chapter in Computer Networks with a problem statement. In chapter 1 we go on to develop requirements for a global network that meets the needs of various stakeholders, satisfies scaling objectives, manages resources cost-effectively, and so on. Even though the Internet is already built, we’re walking the reader through the system design process that led to it being a certain way, so that they are learning systems principles and best practices like those mentioned above. We call many of these out explicitly in “Bottom Line” comments such as this one.
One of the most challenging aspects of teaching people about networking is deciding how to handle layering. On the one hand, layering is a form of abstraction–a fine system design principle. On the other hand, layering can sometimes prevent us from thinking about how best to implement the system as a whole. For example, in recent years it’s become clear that HTTP, an application layer protocol, and TCP, a transport layer protocol, don’t work terribly well together from a performance perspective. Optimizing each independently could only take us so far. Ultimately by looking at them as parts of a system that needs to deliver reliability, security, and performance to applications, both HTTP and the transport layer evolved, with QUIC being the new entrant to the transport layer. What we have tried to do is give readers the tools to see where such system-level thinking can be applied, rather than just teach them that the 7-layer model was handed down from on high and can’t be touched.
Hopefully this helps give some clarity around what we mean by “A Systems Approach”. It’s certainly a way of thinking that becomes natural over time, and we hope that as you read our books it will become part of your thinking as well.