By Saqib Jang, principal and founder, Margalla Communications, technology analysis and consulting firm specializing in cloud infrastructure and services
Modern cloud-native workloads are increasingly designed as microservices running on servers distributed across data center networks.
While microservices-based applications have many benefits, they place heavy demands on network communications, which in turn reduces the server CPU cycles available to support end-user applications.
To improve the efficiency of communications processing, including storage, networking, security, virtualization, and ancillary functions, a new generation of NICs, called SmartNICs, has emerged. SmartNICs provide protocol acceleration and additional programmable processing power for offloading communications processing from server CPUs. A SmartNIC (also referred to as a Data Processing Unit (DPU) or Infrastructure Processing Unit (IPU)) can be based on ASIC, FPGA, or SoC technologies.
This article describes what constitutes a SmartNIC and highlights select offerings from among the market leaders. Intel Corp., Marvell Technology, Inc. and start-up contender Fungible, Inc. provide standout examples of the value SmartNICs can bring to data centers and why you should consider investing in them.
For many years, semiconductor-driven performance improvements, together with performance-enhancing features devised for CPUs, hid the cost of inefficient communications processing. As the gains attributable to ‘Moore’s Law’ have shrunk, it has become painfully clear that new approaches are required. A major hurdle in the adoption of such approaches is that the programmability of server CPUs is such an adaptable and effective tool that there is great temptation to address all processing needs with this single solution.
A traditional (or ‘foundational’) NIC supports essential network connectivity and performs simple offloads such as checksum and segmentation, while relying on server CPUs for communications functions such as storage, networking, security, and virtualization. However, the trend toward disaggregated, microservice-oriented architectures creates substantial software-based communications overhead that sharply reduces the server CPU cycles available for revenue-generating tenant applications.
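To make the checksum offload concrete, here is a minimal software sketch of the RFC 1071 Internet checksum — the per-packet arithmetic that a foundational NIC computes in hardware on behalf of the host. The function name and sample bytes are illustrative, not from any particular driver.

```python
# Illustrative sketch: the RFC 1071 ones'-complement Internet checksum,
# computed in software here to convey the per-packet CPU work that a
# NIC's checksum offload removes from the host processor.

def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit words, per RFC 1071."""
    if len(data) % 2:                # pad odd-length payloads with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

# A hypothetical 10-byte IPv4 header fragment, for illustration only.
packet = b"\x45\x00\x00\x73\x00\x00\x40\x00\x40\x11"
print(hex(internet_checksum(packet)))
```

Trivial as this looks, at millions of packets per second even such simple per-packet work adds up, which is why it was among the first functions moved into NIC hardware.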
In view of this, the past 10 years have witnessed the emergence in the market of SmartNIC products that offload storage, networking, security, and other communications functions from server processors to programmable, compute-enabled NICs, so that the processing capacity of server CPUs is not wasted on housekeeping functions but remains available for customer applications.
Microsoft Corp. was a pioneer in the SmartNIC market. By 2015, through its Project Catapult, it had deployed FPGA-based SmartNICs as the default networking hardware in more than one million servers within its Azure public cloud. Starting in 2016, Amazon Web Services began deploying its ‘Nitro’ SmartNIC to support storage virtualization, network virtualization, and hypervisor functions. The majority of hyperscalers, cloud builders, telcos, and service providers, which together operate more than 15 million servers, are only now beginning to deploy SmartNICs in volume.
Intel Infrastructure Processing Unit (IPU)
Intel Corp. has a long history in connectivity technologies and has continued that tradition by offering SmartNICs, including those with substantially expanded offload capabilities called Infrastructure Processing Units (IPUs). In October 2020, it introduced the Field Programmable Gate Array (FPGA)-based Intel FPGA SmartNIC 5000X platform for cloud data centers, which offloads storage, networking, and security workloads from the server, freeing processor resources and simplifying management of data center infrastructure.
This development was followed by the announcement of a suite of three IPU products at its August 2021 ‘Architecture Day’ event, including an FPGA-based IPU (codenamed Oak Springs Canyon) featuring an Agilex FPGA and a Xeon-D SoC, which gives customers flexibility in offloading 2x100G workloads and implementing new protocols quickly. Oak Springs Canyon also features a scalable, source-accessible hardware/software stack, the Intel Open FPGA Stack, which enables developers to use their preferred OS, management, and orchestration frameworks. The highlight of these announcements, however, was the company’s first ASIC-based IPU (codenamed Mount Evans), which was co-designed with a large cloud provider. Mount Evans implements a hardware-based data path, including an Intel Optane-derived NVMe offload engine, RoCE v2 offload, and a new reliable transport technology, as well as IPsec inline encryption and decryption to secure every packet sent across the network.
Mount Evans also implements a compute complex for control-plane functions, with 16 high-frequency Neoverse N1 cores licensed from Arm, plus a lookaside cryptography engine and a compression engine built from Intel’s QuickAssist Technology for offloading those two jobs from the host CPUs. At first glance, Intel’s use of Arm rather than x86 cores is unconventional; in practice, however, Arm has become the de facto processor standard for SmartNICs, adopted by most leading products.
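The "lookaside" pattern mentioned above is worth unpacking: unlike inline processing, where data is transformed as it traverses the wire path, a lookaside engine receives a request descriptor from the host, works asynchronously, and posts a completion that the host polls for. The sketch below models that request/completion flow in Python with hypothetical names (it stands in for, and does not reproduce, any QuickAssist API), using `zlib` as a stand-in for the compression engine.

```python
# Conceptual sketch (hypothetical names) of the lookaside offload pattern:
# the host submits a descriptor to the engine's request queue, continues
# with other work, and later polls a completion queue -- in contrast to
# inline offload, which transforms packets directly on the data path.
from collections import deque
import zlib

class LookasideEngine:
    def __init__(self):
        self._requests = deque()      # request ring written by the host
        self._completions = deque()   # completion ring read by the host

    def submit(self, req_id: int, payload: bytes) -> None:
        """Enqueue work; the host CPU is free immediately after this returns."""
        self._requests.append((req_id, payload))

    def process(self) -> None:
        """Stand-in for the hardware engine draining its request ring."""
        while self._requests:
            req_id, payload = self._requests.popleft()
            self._completions.append((req_id, zlib.compress(payload)))

    def poll(self):
        """Host polls for finished descriptors (non-blocking)."""
        return self._completions.popleft() if self._completions else None

engine = LookasideEngine()
engine.submit(1, b"a" * 1024)
engine.process()                       # in hardware this runs asynchronously
req_id, compressed = engine.poll()
print(req_id, len(compressed) < 1024)  # prints: 1 True
```

The design trade-off is latency versus host involvement: lookaside suits bulk jobs such as compression, while per-packet transforms like IPsec favor the inline path.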
“Mt. Evans ASIC IPU is built from the ground up ensuring that the compute complex is tightly coupled with the network subsystem,” said Brian Niepoky, director, connectivity group marketing, Intel. “This tight coupling allows for the network sub-system accelerators to use the system level cache as a last-level cache; providing high bandwidth low-latency connections between the two. Additionally, it enables a flexible combination of hardware and software packet processing.”
The programmability of both the hardware-based data path, via the on-board offload capabilities, and the software-based control plane, which runs with an infrastructure OS stack on Mount Evans’ on-board processors, makes it a flexible and compelling IPU.
Marvell Octeon LiquidIO
Marvell Technology, Inc. is a pioneer in the SmartNIC market with its LiquidIO cards, targeted at cloud data centers, which have shipped more than one million units. At the heart of LiquidIO SmartNIC adapters is the OCTEON family of processors, designed for user-defined infrastructure offloading and application acceleration in hyperscale data centers. This class of processor, now referred to as a DPU, was initiated by Marvell with the launch of the first OCTEON processors in 2005.
The latest Marvell SmartNIC solution, LiquidIO III, is a SmartNIC platform that incorporates the latest OCTEON TX2 DPU, with 36 Arm v9 N2-based cores and 5x100Gb/s network connectivity, and is architected to accelerate security, packet processing, tunneling, and traffic management. The OCTEON TX2 processor offers up to 2.5x higher performance than the previous OCTEON generation, with 10Gb/s to 200Gb/s of throughput depending on the specific model.
It comes with a software development kit (SDK), an open platform that leverages the Arm ecosystem. The SDK includes support for multiple Linux distributions, virtualization, containers, the Data Plane Development Kit (DPDK), protocol stacks, infrastructure management and orchestration frameworks such as OpenStack and Kubernetes, and virtual network functions (VNFs).
The DPDK networking suite supports performance-optimized solutions for crypto, IPsec, TLS, network traffic management, and packet processing. This makes the solution well suited for data processing, software-defined networks, network overlay methodologies, and virtual appliances within the data center.
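A key reason DPDK delivers this performance is its poll-mode design: instead of taking an interrupt per packet, a dedicated core busy-polls the NIC ring and pulls packets in bursts, amortizing per-packet overhead. The sketch below illustrates that pattern conceptually in Python; the names are illustrative and are not the real DPDK C API (e.g., `rte_eth_rx_burst`).

```python
# Conceptual sketch of the DPDK poll-mode pattern: a core repeatedly polls
# the RX ring and processes packets in bursts of up to BURST_SIZE, rather
# than being interrupted once per packet. Illustrative names only.
from collections import deque

BURST_SIZE = 32

def rx_burst(ring: deque, max_pkts: int) -> list:
    """Pull up to max_pkts packets from the RX ring in a single call."""
    burst = []
    while ring and len(burst) < max_pkts:
        burst.append(ring.popleft())
    return burst

def poll_loop(ring: deque) -> int:
    """Busy-poll until the ring is drained; returns packets processed."""
    processed = 0
    while ring:
        for pkt in rx_burst(ring, BURST_SIZE):
            processed += 1      # real code would parse, forward, or encrypt here
    return processed

ring = deque(f"pkt-{i}".encode() for i in range(100))
print(poll_loop(ring))  # prints: 100
```

Batching is what makes the model pay off: fetching 32 packets per call spreads the fixed cost of each ring access across the whole burst.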
“Marvell’s Octeon LiquidIO SmartNIC solutions offer best-in-class performance, capacity, reliability, and programmability,” declares John Sakamoto, VP, infrastructure processor business unit, Marvell. “The solutions improve system and workload efficiencies and performance, strengthen security, and offer customers faster time-to-market.”
Marvell is engaged with multiple cloud hyperscale data center operators to develop customized solutions that allow them to combine their own IP with its hardened and widely deployed OCTEON LiquidIO SmartNIC DPUs. This offering allows them to optimize their data center infrastructure and offer innovative cloud services.
Fungible Data Processing Unit (DPU)
Fungible, Inc. is a technology start-up that recently launched its DPU and NVMe/TCP storage target products. The Fungible DPU is a flexible, high-performance solution addressing server CPU overhead from storage, security, networking, and other communications processing. The underlying thesis behind the DPU is that east-west communications in cloud data centers are skyrocketing, driven by resource disaggregation (e.g., of storage and accelerators) and on-demand infrastructure pooling, and that data center communications processing is consequently increasing by orders of magnitude.
The Fungible DPU focuses on offloading such processing from server CPUs in a much more flexible and efficient manner than SmartNICs or other available approaches. It aims to provide a comprehensive solution to these problems by using a fundamentals-based, clean sheet design that is uncluttered by legacy considerations.
“SmartNICs combine a hardwired data-path with general purpose cores for control plane processing, but the loose coupling between the hardwired and programmable parts ensures a brittle design. Thus, when flexibility is needed to implement stateful processing, such as storage initiator, which involves data manipulation like compression, encryption, and de-dupe, SmartNICs fail to deliver on performance,” states Wael Noureddine, chief architect, Fungible.
The Fungible DPU’s control plane supports standard Linux applications and control agents. The data plane of the DPU is also programmable using C or other high-level languages, allowing infrastructure-centric processing to be expressed naturally and intuitively.
The Fungible DPU was designed with the goal of supporting arbitrary new stateful workloads where many different threads of computation (flows) need to be executed concurrently. As a result, it supports a variety of data-centric applications including stateless and stateful processing without compromising performance. These infrastructure services include storage, networking, security, and virtualization as well as primitives for applying a range of transformations to data in motion.
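To illustrate what "stateful" means here: packets belonging to the same flow, conventionally identified by the 5-tuple of source/destination IP, source/destination port, and protocol, share state that the data plane must look up and update on every packet, across a very large number of concurrent flows. The sketch below is a minimal, illustrative flow table in Python; the names are hypothetical and do not reflect Fungible's implementation.

```python
# Illustrative sketch of stateful flow processing: each packet is mapped to
# its flow via the 5-tuple, and the flow's state record is updated in place.
# Hypothetical names; not Fungible's data-plane API.
from typing import NamedTuple

class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str

class FlowTable:
    def __init__(self):
        self._flows = {}    # 5-tuple -> per-flow state record

    def observe(self, key: FiveTuple, payload_len: int) -> dict:
        """Update (or create) the state record for this packet's flow."""
        state = self._flows.setdefault(key, {"packets": 0, "bytes": 0})
        state["packets"] += 1
        state["bytes"] += payload_len
        return state

table = FlowTable()
flow = FiveTuple("10.0.0.1", "10.0.0.2", 49152, 443, "tcp")
for size in (1500, 1500, 600):
    table.observe(flow, size)
print(table.observe(flow, 0)["packets"])  # prints: 4
```

In a real DPU this lookup-and-update must happen at line rate for millions of simultaneous flows, which is precisely the concurrency problem the passage above describes.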
A fundamental requirement for networks in scale-out data centers is support for scalability across many orders of magnitude, full any-to-any cross-sectional bandwidth, low and predictable latency, fairness, congestion avoidance, fault tolerance, as well as end-to-end software defined security, all the while supporting industry standard protocols and devices. Surprisingly, the industry has no solution for this problem despite many decades of effort.
In this context, the Fungible DPU also implements TrueFabric, an end-to-end networking technology that provides a standards-based solution for supporting large-scale connectivity, full cross-sectional bandwidth, and low latency, without requiring any changes to switches and routers. As a result, TrueFabric can change a standard IP-over-Ethernet data center network into a scale-out fabric that performs like the backplane of a large extended computer.
As discussed in this article, industry leaders and start-ups are already offering SmartNICs, which are poised to have a major impact on the transition to cloud-native architectures. SmartNICs offload storage, networking, security, and other functions from server CPUs, thereby improving application performance. They are programmable and can be tailored to accelerate specific data-intensive functions. By deploying SmartNICs, cloud providers can deliver improved revenue-earning services with only a relatively small incremental investment.