AMD Pensando™ DPU Software

Data Processing Units (DPUs) have become the backbone of next-generation data centers, powering many accelerated network functions (NF). AMD Pensando™ programmable DPUs provide many functions, such as:

[Figure: sw caps.png, overview of DPU software capabilities]

Programmability in the DPUs

DPUs have a programmable data path. AMD Pensando DPUs run P4 programs natively on their Match Processing Units (MPUs). P4 is a standardized, open-source domain-specific language for specifying how I/O processing devices handle network traffic.



AMD Pensando DPUs can handle classic P4 functions such as packet parsing, manipulation, tunneling, and ACLs. Beyond these, AMD Pensando P4 programs can implement periodic timer events, handle asynchronous events triggered by state transitions, generate notifications, and craft and send packets inline (e.g., IPFIX). This makes it possible to implement complex stateful features and custom network protocols natively in the P4 data path. For example, network functions such as TCP/TLS proxies, NVMe over TCP, IPsec, Active-Active or Active-Passive HA state machines, and flow aging can be implemented inline in the fast path processors. Although the AMD Pensando DPU also has general-purpose CPU cores, the goal is to keep fast path data traffic off them, which provides programmability and performance at the same time. Using the CPU cores for such services would degrade fast path performance, scale, throughput (measured in packets per second, PPS), and latency.
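To illustrate the kind of stateful behavior described above, here is a minimal Python sketch of flow aging driven by a periodic timer event. It is a conceptual model only, not P4 and not AMD Pensando code; the class `FlowTable`, its methods, and the 30-second idle timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlowTable:
    """Toy flow table with idle-timeout aging (hypothetical names throughout)."""
    idle_timeout: float
    last_seen: dict = field(default_factory=dict)  # flow key -> last packet time

    def on_packet(self, key, now):
        # Every packet of a flow refreshes that flow's timestamp.
        self.last_seen[key] = now

    def on_timer(self, now):
        # Periodic timer event: evict flows idle longer than idle_timeout.
        expired = [k for k, t in self.last_seen.items()
                   if now - t > self.idle_timeout]
        for k in expired:
            del self.last_seen[k]
        return expired


table = FlowTable(idle_timeout=30.0)
flow_a = ("10.0.0.1", "10.0.0.2", 6, 12345, 443)   # 5-tuple: src, dst, proto, sport, dport
flow_b = ("10.0.0.3", "10.0.0.4", 17, 5353, 53)

table.on_packet(flow_a, now=0.0)
table.on_packet(flow_b, now=0.0)
table.on_packet(flow_a, now=25.0)     # flow_a keeps sending traffic
aged_out = table.on_timer(now=40.0)   # flow_b has been idle for 40 s
```

In this model only `flow_b` is evicted, because `flow_a` was refreshed 15 seconds before the timer fired. On a DPU the equivalent logic would run inline in the data path rather than on a CPU core.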

Software Architecture

Before describing the resulting architecture of the DPU software layers, let us consider how requirements vary across customer segments.

Cloud Users

  • Very high scale (number of table entries)
  • High throughput - both packets per second (PPS) and connections per second (CPS)
  • Complex multi-staged data-path lookup, with custom/differentiating features
  • Higher feature velocity

Service Provider Users

  • Network in <-> Network out "Bump in the Wire" processing, not limited by host PCIe® bandwidth
  • High scale
  • High throughput (mostly packets-per-second)

Enterprise Users

  • Turnkey solution (not too keen on writing custom P4 programs)
  • Ecosystem integration with hardware/software already being used in the existing deployment
  • Longer release cycles

Despite these diverse needs, some requirements are common among all users, such as:

  • High Availability: failure handling, recovery
  • Operations: debuggability, metrics collection
  • Security: platform security and software stack security

To address these diverse requirements, the software architecture must be built from modular, composable blocks that interact through well-defined APIs, so that any functional block can be replaced by a cloud provider’s own implementation at any time. The following diagram shows the foundational blocks and layers that form the overall DPU software architecture.



P4 Data Path

The P4 data path runs the data path business logic on the MPUs. This handles all the traffic in and out of the DPU. The feature set in this layer is specific to the cloud provider’s SDN network.

Software-in-Silicon Development Kit

To enable customers to write their own P4 data path, AMD provides a P4 compiler and an associated toolkit that compile P4 code for the AMD Pensando DPU’s backend. Several production-quality P4 programs are provided for reference.

The AMD Pensando Software-in-Silicon Development Kit (SSDK) consists of all the tools, libraries, binaries, etc. needed to develop a full software stack on Pensando DPUs. When P4 code is compiled, the compiler generates:

  • C APIs for the control plane software, which handles user configuration and programs the corresponding data path tables in P4
  • C APIs that can be used by the software data path components to program flows/sessions
  • gRPC protobuf read/write APIs for P4 tables that can be used by remote control plane or table monitoring tools (CLIs etc.)

These generated APIs use table management libraries that handle several types of SRAM, TCAM, and DDR tables and are optimized for memory and performance without sacrificing usability.
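To make the difference between table types concrete, the following Python sketch models an exact-match table (the kind of lookup a hash table in SRAM or DDR provides) and a ternary table (the value/mask lookup a TCAM provides). This is a conceptual model under stated assumptions: the class names, method names, and the priority convention are hypothetical and are not the SSDK's generated APIs.

```python
import ipaddress


def ip(text):
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


class ExactMatchTable:
    """Exact-match lookup: the key must equal a stored key."""
    def __init__(self):
        self.entries = {}

    def insert(self, key, action):
        self.entries[key] = action

    def lookup(self, key):
        return self.entries.get(key)


class TernaryTable:
    """TCAM-style lookup: entries are (value, mask); best priority wins."""
    def __init__(self):
        self.entries = []

    def insert(self, value, mask, action, priority):
        self.entries.append((priority, value, mask, action))
        self.entries.sort(key=lambda e: e[0])  # lower number = higher priority

    def lookup(self, key):
        for _priority, value, mask, action in self.entries:
            if key & mask == value & mask:     # compare only the unmasked bits
                return action
        return None


acl = TernaryTable()
acl.insert(ip("10.1.0.0"), 0xFFFF0000, "deny", priority=10)   # 10.1.0.0/16
acl.insert(0, 0x00000000, "permit", priority=100)             # catch-all

sessions = ExactMatchTable()
sessions.insert(("10.0.0.1", "10.0.0.2", 443), "forward")
```

With these entries, `acl.lookup(ip("10.1.2.3"))` matches the higher-priority /16 entry, while any other address falls through to the catch-all. A real TCAM evaluates all entries in parallel; the linear scan here only models the result.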

The SSDK also includes system and platform software, so customers developing their own pipeline do not need to reinvent it. Core components such as pciemgr, driver software, ASIC initialization and health monitoring, interrupt handlers, the link manager, diagnostic libraries, sensors, and resource monitoring entities are part of this common, reusable layer provided by AMD.



Soft Datapath

Typically, the software data path component is tightly coupled with the P4 data path and handles only flow/session insertion and deletion. When the first packet of a session arrives, the flow lookup in the P4 data path misses and the packet is punted to this exception path, sometimes called the slow path. AMD Pensando DPUs can perform security policy evaluation, route table lookups, metering bucket derivation, and packet rewrite computation in the P4 data plane itself, and pass the lookup results to this component in a metadata header. The software data path component can then use these results to install the appropriate flow/session entries. Some customers may want to evaluate additional policies here initially and then gradually offload them to P4 to achieve higher connections per second (CPS). Customers can bring their own soft data path implementation, for example one built on the DPDK or VPP framework, or on something proprietary.
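The first-packet flow described above can be sketched as a small Python model: a flow miss in the fast path punts the packet, together with the lookup results, to the slow path, which installs a flow entry so that later packets hit directly. All names here (`FastPath`, `SlowPath`, `policy_lookup`, the metadata fields, the port-based policy) are assumptions for illustration, not the actual metadata header format or any SSDK interface.

```python
class SlowPath:
    """Exception path: turns lookup results into an installed flow entry."""
    def handle_miss(self, fast_path, key, metadata):
        if metadata["allow"]:
            action = "forward:" + metadata["next_hop"]
        else:
            action = "drop"
        fast_path.flows[key] = action   # install the flow/session entry
        return action


class FastPath:
    """Stand-in for the P4 data path: flow table plus policy/route lookups."""
    def __init__(self, slow_path):
        self.flows = {}
        self.slow_path = slow_path
        self.punts = 0

    def policy_lookup(self, key):
        # Hypothetical policy: allow only destination ports 80 and 443.
        return {"allow": key[4] in (80, 443), "next_hop": "uplink0"}

    def process(self, key):
        action = self.flows.get(key)
        if action is not None:
            return action               # flow hit: stays in the fast path
        self.punts += 1                 # flow miss: punt with lookup results
        metadata = self.policy_lookup(key)
        return self.slow_path.handle_miss(self, key, metadata)


fast_path = FastPath(SlowPath())
web_flow = ("10.0.0.1", "10.0.0.2", 6, 40000, 443)
first = fast_path.process(web_flow)    # miss: punted, entry installed
second = fast_path.process(web_flow)   # hit: handled without a punt
```

Only the first packet of `web_flow` reaches the slow path; the punt counter stays at one after the second packet. Because connection setup rate depends on how much work each punt costs, moving policy evaluation into the lookup stage (modeled here by `policy_lookup`) is what keeps the slow path thin.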

Control Plane

The control plane consists of any L2/L3 protocol software stack. For example, for underlay routing in the SDN, routing protocols such as BGP or OSPF exchange routes in the underlay VRF with the upstream Top-of-Rack (ToR) switches. Similarly, when EVPN is enabled, locally learned MAC and IP addresses are advertised to the rest of the network. Customers can replace this layer with open-source alternatives such as FRR or GoBGP, or with their own implementation of these protocols.
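As a rough illustration of what the control plane ultimately maintains, the Python sketch below models a route table that is updated as routes are learned or withdrawn and is queried by longest-prefix match. It uses only the standard `ipaddress` module; the `RouteTable` class, its methods, and the next-hop names are hypothetical and do not represent any routing stack's API.

```python
import ipaddress


class RouteTable:
    """Toy underlay route table with longest-prefix-match lookup."""
    def __init__(self):
        self.routes = {}  # ip_network -> next hop

    def add_route(self, prefix, next_hop):
        # Called when a route is learned, e.g. from a routing protocol.
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw_route(self, prefix):
        # Called when a previously learned route is withdrawn.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)  # most specific wins
        return self.routes[best]


underlay = RouteTable()
underlay.add_route("0.0.0.0/0", "tor-1")
underlay.add_route("10.1.0.0/16", "tor-2")

before = underlay.lookup("10.1.2.3")   # matches both; the /16 is more specific
underlay.withdraw_route("10.1.0.0/16")
after = underlay.lookup("10.1.2.3")    # falls back to the default route
```

The linear scan is for clarity only; a data path would hold these routes in a hardware-friendly structure and the control plane would program it through table APIs.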

Management Plane

The management plane consists of multiple policy agents that communicate with the external controller(s) (e.g., a cloud provider’s compute controllers or the AMD Pensando Policy and Services Manager (PSM) in the case of the AMD Pensando full stack solution). These agents receive configuration from the controller(s) external to the DPU and program the data path tables using SSDK libraries. Because the agents pull configuration from, and push telemetry to, the external controller(s), they have an inherent dependency on the controller’s object model and communication mechanics. Hence, this layer depends on the customer deployment environment and is fully replaceable.
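The agent's role can be sketched in a few lines of Python: it translates objects in the controller's model into table entries and reports simple telemetry back. Everything in this sketch is an assumption for illustration, including the JSON object model, the `PolicyAgent` class, and the dict that stands in for a data path table; a real agent would call the SSDK libraries and follow the controller's own schema.

```python
import json


class PolicyAgent:
    """Toy policy agent: controller config in, table entries and telemetry out."""
    def __init__(self, table):
        self.table = table  # dict standing in for a data path table

    def apply(self, config_json):
        config = json.loads(config_json)
        programmed = 0
        for rule in config["rules"]:
            # Translate the controller's object model into a table key/action.
            key = (rule["src"], rule["dst"], rule["port"])
            self.table[key] = rule["action"]
            programmed += 1
        # Telemetry that would be pushed back to the controller.
        return {"programmed": programmed, "table_size": len(self.table)}


policy_table = {}
agent = PolicyAgent(policy_table)
report = agent.apply(json.dumps({
    "rules": [
        {"src": "10.0.0.0/24", "dst": "10.0.1.0/24", "port": 443, "action": "permit"},
        {"src": "10.0.0.0/24", "dst": "10.0.2.0/24", "port": 22, "action": "deny"},
    ]
}))
```

Only the translation step inside `apply` depends on the controller's object model, which is why this layer can be swapped out without touching the tables underneath it.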


The DPU software architecture, its layering, and its pre-built components are key to unlocking the power of programmable I/O services. The AMD Pensando DPU software stack is designed to be flexible so that it can be deployed in many different customer environments without losing the programmability of the underlying ASIC and without compromising on scale and performance. This "snap-block" design of the software stack enables rapid innovation and offers ways to integrate with the customer's software stack at multiple layers. It enables customers to start with a full stack solution, customize it over time, and gradually own the DPU software and take control of their destiny.


To find out more about AMD Pensando technology, including how to get access to the Software-In-Silicon Development Kit, please visit .