How to Manage XDP/eBPF Effectively for Better DDoS Protection

Extended Berkeley Packet Filter’s ability to enable quick, uninterrupted updates makes it ideal for handling frequent security configuration changes.

Apr 26th, 2024 7:21am by Ivan Koveshnikov

Featured image by William Bout on Unsplash.

Extended Berkeley Packet Filter (eBPF) maps serve as an interface for atomic updates of shared memory segments and provide a robust configuration interface for eBPF programs. The read-copy-update (RCU) mechanism keeps the performance cost in the hot path low. Additionally, eBPF maps allow exclusive access to shared memory fragments. A configuration can combine several map types, including arrays, hash tables, bloom filters, queues and ring buffers, which makes maps well suited to complex configurations such as security policies.

As configuration complexity grows, so does the need for more connections between different maps’ entries. If there are too many connections between map entries, the ability to make atomic configuration updates starts to falter. Updating just one map entry might mean having to update others at the same time, which could result in inconsistency during the update period.

Applying XDP for Advanced Traffic Management

Consider a simple eXpress Data Path (XDP) program that classifies and filters traffic based on a prioritized five-tuple ruleset. The program decides how to process each packet based on the rule’s priority and the packet’s five-tuple: source IP address, destination IP address, protocol, source port and destination port.

Flowchart with classify leading to process.

Here are examples of rules for a network configuration:

Always allow any traffic from subnet A.
Restrict access to web servers in subnet B for clients from subnet C.
Restrict access to web servers in subnet B.
Deny all other access.

These rules require storing both traffic classification rules and restrictions in the configuration, which can be achieved by using eBPF maps.
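To make the ruleset concrete, here is a minimal userspace sketch in Python of prioritized five-tuple matching. It is a model of the logic only, not eBPF code. The subnet addresses, the action names, the "lower priority number wins" convention and the choice of TCP port 443 to stand for "web servers" are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int                  # lower number wins (assumed convention)
    action: str                    # hypothetical action name
    src: str = "0.0.0.0/0"         # source subnet; default matches any IPv4
    dst: str = "0.0.0.0/0"         # destination subnet
    proto: Optional[int] = None    # None matches any protocol
    sport: Optional[int] = None    # None matches any source port
    dport: Optional[int] = None    # None matches any destination port

    def matches(self, src_ip, dst_ip, proto, sport, dport):
        return (
            ipaddress.ip_address(src_ip) in ipaddress.ip_network(self.src)
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst)
            and self.proto in (None, proto)
            and self.sport in (None, sport)
            and self.dport in (None, dport)
        )


# Hypothetical addresses for subnets A, B and C.
SUBNET_A, SUBNET_B, SUBNET_C = "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"
TCP = 6

RULES = [
    Rule(1, "allow", src=SUBNET_A),
    Rule(2, "restrict-for-c", src=SUBNET_C, dst=SUBNET_B, proto=TCP, dport=443),
    Rule(3, "restrict", dst=SUBNET_B, proto=TCP, dport=443),
    Rule(4, "deny"),
]


def classify(src_ip, dst_ip, proto, sport, dport):
    """Return the action of the highest-priority rule that matches."""
    for rule in sorted(RULES, key=lambda r: r.priority):
        if rule.matches(src_ip, dst_ip, proto, sport, dport):
            return rule.action
    return "deny"
```

A linear scan like this is only practical for a handful of rules. It is shown here to pin down the semantics, not as a suggestion for the hot path.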

Understanding eBPF Program Configuration as a Tree Structure

You can visualize configurations as a hierarchical tree, with a “configuration root” at its base serving as the foundation. This root, which may be virtual, organizes various configuration entities to form the active configuration. Entities either connect directly to the root for immediate global access or nest within other entities for structured organization.

Accessing a specific entity begins at the root, progressing sequentially (“dereferencing,” level by level) to the desired entity. For example, to retrieve a Boolean flag from an “options” structure within a collection, you navigate to the collection, locate the structure and then retrieve the flag.
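The level-by-level walk can be sketched in a few lines of Python. The tree contents and the names `policies`, `web`, `options` and `log_drops` are hypothetical and serve only to mirror the Boolean-flag example above.

```python
# A hypothetical configuration tree: the root holds a collection ("policies"),
# each member of the collection holds an "options" structure.
config_root = {
    "policies": {
        "web": {
            "options": {"log_drops": True},
            "rate_limit_pps": 1000,
        },
    },
}


def dereference(root, path):
    """Walk from the root to an entity, one level per path element."""
    node = root
    for key in path:
        node = node[key]
    return node


# Root -> collection -> structure -> flag.
flag = dereference(config_root, ["policies", "web", "options", "log_drops"])
```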

Gcore’s Approach to eBPF Complexity Challenges

This tree-like structure offers flexibility in configuration management, including atomic swaps of any subtree, ensuring smooth transitions without disruption. However, increased complexity brings challenges. As configurations become more intricate, the interconnections among entries intensify. It’s common for several parent entries to point to a single child entry or for an entry to play dual roles, acting as a property of one entity while also being part of a collection.

Modern programming languages have developed mechanisms to manage complex configurations. Developers use reference counters, mutable and immutable references, and garbage collectors to ensure safe updates. However, safety in managing these configurations doesn’t guarantee atomicity when switching between configuration versions.

The ever-changing landscape of online traffic means security operations teams must make frequent changes to security policies. Therefore, Gcore makes rapid and frequent updates to Gcore DDoS Protection and incorporates vital features like the regular expression engine. We moved beyond the standard one or two daily updates for self-hosted solutions, to near-constant updates required by service providers. This need, often overlooked in Linux applications, led us to embrace eBPF technology, which enables quick, uninterrupted updates.

Our progress towards this solution required a thorough exploration of strategies to make sure we were handling our eBPF configurations in the best possible way. Specifically, the limitations of eBPF maps led our team to rethink our configuration storage strategies. The inability of eBPF map entries to store direct pointers to arbitrary memory segments, due to kernel safety verifications, necessitates using search keys for map entry access, slowing down the lookup process. But this drawback offers a benefit: It allows us to divide complex configuration trees into smaller, more manageable segments, linked directly to the configuration root. The result? Consistency, even during nonatomic updates.
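As a rough illustration of that trade-off, the Python sketch below models two flat tables in place of a nested tree. The classification entry stores a lookup key rather than a pointer, so resolving a policy costs a second lookup, but each table hangs directly off the root and can be changed on its own. The table names, keys and policy fields are hypothetical.

```python
# Two flat tables standing in for two eBPF maps.
# The classification value is a search key into policy_map, not a pointer.
classification_map = {("10.0.2.0/24", 443): 7}
policy_map = {7: {"rate_limit_pps": 1000, "log_drops": True}}


def lookup_policy(classification_key):
    """Resolve a policy through the intermediate key (two lookups)."""
    policy_id = classification_map.get(classification_key)
    if policy_id is None:
        return None
    return policy_map.get(policy_id)


def replace_policy(policy_id, new_policy):
    """Swap one policy entry without touching the classification table."""
    policy_map[policy_id] = new_policy
```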

Our findings and tactics underscore the importance of careful planning and execution required for optimal efficiency of eBPF programs. So now, let’s turn to specific configuration update strategies for eBPF environments and their suitability for systems’ unique requirements and limitations.

Strategies for Safe Configuration Updates

We found three update strategies to be particularly effective in enhancing program updates while ensuring high performance and flexibility.

Update Strategy 1: Step-by-Step Transition

A step-by-step update strategy means undertaking incremental configuration updates across several maps. It’s a useful option when processing data in one map provides a lookup key for another map. In such cases, where multiple map entries need to be updated, atomic transitions are not feasible. But precise and sequential update operations make it possible to update the configuration methodically.

Some operations on referenced configuration subtrees become safe if executed in the correct order. For example, in the context of classification and processing, the classification layer provides a lookup key for a matching security policy, meaning update operations should follow a specific sequence:

Inserting a new security policy is safe since new policies are not yet referenced.
Updating an existing security policy is also safe, since policies can generally be updated one at a time without issues. An atomic update would be desirable, but it doesn’t offer significant advantages here.
Updating classification layer maps to reference new security policies and remove references to obsolete ones is safe.
Purging unused security policies from the configuration is safe once they are no longer referenced.

Even without atomic updates, it’s possible to perform a safe update by correctly ordering the update procedure. This approach works best for independent maps that are not closely linked with other maps.
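The ordering above can be modeled in Python with two dictionaries standing in for the classification and policy maps. This is a sketch under that assumption, not the production update code. The `on_step` hook is a hypothetical addition that lets a caller check the invariant "every classification entry points at an existing policy" after each individual write.

```python
def is_consistent(classification, policies):
    """True if every classification entry references an existing policy."""
    return all(pid in policies for pid in classification.values())


def apply_update(classification, policies, new_policies, new_classification,
                 on_step=None):
    def step():
        if on_step is not None:
            on_step(classification, policies)

    # Steps 1 and 2: insert new policies and update existing ones.
    for pid, policy in new_policies.items():
        policies[pid] = policy
        step()

    # Step 3: point classification entries at the new policies,
    # then drop entries that no longer exist in the new configuration.
    for key, pid in new_classification.items():
        classification[key] = pid
        step()
    for key in list(classification):
        if key not in new_classification:
            del classification[key]
            step()

    # Step 4: purge policies that nothing references any more.
    referenced = set(classification.values())
    for pid in list(policies):
        if pid not in referenced:
            del policies[pid]
            step()
```

Running the purge before the classification update would break the invariant, which is why the order matters more than the individual operations.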

We recommend performing incremental updates instead of updating the entire map at once. For instance, incremental updates to hashmaps and arrays are perfectly safe. However, that is not the case with incremental updates to longest prefix match (LPM) maps because the lookup depends on the elements already present in the map. The same problem arises when creating the lookup key for another table requires you to manipulate elements from multiple maps.
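A small Python model shows how an LPM table can return a wrong answer halfway through an incremental update. The prefixes and policy names are hypothetical, and the "delete stale entries, then insert new ones" order is just one naive ordering chosen for illustration.

```python
import ipaddress


def lpm_lookup(table, ip):
    """Return the value of the longest prefix in `table` that contains `ip`."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, value in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, value)
    return best[1] if best else None


old = {"10.0.0.0/8": "default", "10.1.0.0/16": "strict"}
# The new configuration splits the /16 into two /17 prefixes.
new = {"10.0.0.0/8": "default", "10.1.0.0/17": "strict", "10.1.128.0/17": "strict"}

# Naive incremental update, step 1: remove entries absent from `new`.
intermediate = {p: v for p, v in old.items() if p in new}

before = lpm_lookup(old, "10.1.5.5")           # matches the /16
during = lpm_lookup(intermediate, "10.1.5.5")  # falls back to the /8
after = lpm_lookup(new, "10.1.5.5")            # matches the /17
```

Both the old and the new configuration treat 10.1.5.5 as "strict", yet the half-updated table briefly resolves it through the shorter /8 prefix.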

The classification layer, often implemented using several LPM and hash tables, offers an example of this complication:

Lookup flow from classify to LPM and hash, and from classify to process and then hash, with notes on map update issues.

Update Strategy 2: Map Replace

For maps that can’t be updated incrementally without inconsistencies, such as LPM maps, replacing the entire map is the best solution. To replace a map for an eBPF program, you need a map of maps. A userspace application can create a new map, populate it with the necessary entries and then atomically replace the old one.
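The replace pattern can be modeled in Python as an outer table that holds a reference to an inner table. This sketch only mirrors the idea: the class and slot names are hypothetical, and in a real deployment the swap would be a single update of the outer map entry.

```python
class MapOfMaps:
    """Outer map whose values are references to inner maps."""

    def __init__(self):
        self._slots = {}

    def lookup(self, slot):
        return self._slots.get(slot)

    def replace(self, slot, inner):
        """Swap the inner map in one assignment and return the previous one."""
        previous = self._slots.get(slot)
        self._slots[slot] = inner
        return previous


def build_inner(entries):
    """Create and fully populate a new inner map before it becomes visible."""
    inner = {}
    for key, value in entries:
        inner[key] = value
    return inner


outer = MapOfMaps()
outer.replace("lpm", build_inner([("10.1.0.0/16", "strict")]))

# A reader that already holds the old inner map keeps a complete old view.
reader_view = outer.lookup("lpm")
previous = outer.replace(
    "lpm", build_inner([("10.1.0.0/17", "strict"), ("10.1.128.0/17", "strict")])
)
```

Readers see either the complete old table or the complete new one, never a half-populated mix.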

Map of maps leads to two nodes with resource isolation and replace functions.

Dividing the configuration into separate maps, each describing the settings for a single entity, offers the added benefit of resource isolation, and removes the need to recreate a full configuration during minor updates. The configuration for each of the multiple entities can be stored in a replaceable map.

This approach has some drawbacks. The replacement map can’t be pinned to the same location while the previous map still occupies it, so to keep the same pin path, userspace must unpin the previous map first. This is particularly important to consider for long-lived programs that frequently update configurations and rely on map pinning for stability.

Update Strategy 3: Program Replace

When multiple maps are linked together, the map replace method may fail: updating the maps individually can leave the configuration in an inconsistent or invalid state that reflects neither the old nor the intended new configuration.

To address this issue, atomic updates should occur at a higher level. Although eBPF lacks a mechanism to replace a set of maps atomically, maps are usually linked to a specific eBPF program. Dividing the interconnected maps and corresponding code into separate eBPF programs, linked by tail calls, can address this.

Flowchart of packet pipeline to prog map, leading to replaceable code and map bundles for eBPF programs.

Implementing this requires loading a new eBPF program, creating and filling maps for it, pinning both, and then updating the prog map from userspace. This process is more labor-intensive than a simple map replacement, but it allows simultaneous updates of maps and associated code, facilitating runtime code adjustments. However, it’s not always particularly efficient to use this approach, especially when updating a single map entry in a complex program with multiple maps and subprograms.
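The following Python sketch models the idea of swapping code and maps together. A `Bundle` stands in for one eBPF program plus the maps it owns, and a dictionary stands in for the prog map that the pipeline tail-calls through. All names are hypothetical, and the "pass" result for an empty slot is an assumption of this model.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Bundle:
    """One program version together with the maps it depends on."""
    version: str
    classification: Dict[str, int]
    policies: Dict[int, str]

    def process(self, subnet):
        policy_id = self.classification.get(subnet)
        return self.policies.get(policy_id, "deny")


prog_map: Dict[int, Bundle] = {}  # slot -> bundle, stands in for the prog map


def pipeline(subnet, slot=0):
    """Hand the packet to whatever bundle currently occupies the slot."""
    bundle = prog_map.get(slot)
    if bundle is None:
        return "pass"  # assumed behavior of this model for an empty slot
    return bundle.process(subnet)


def replace_program(slot, bundle):
    """Publish a fully built bundle with a single prog map update."""
    previous = prog_map.get(slot)
    prog_map[slot] = bundle
    return previous
```

Because the classification and policy tables travel inside the bundle, a packet is always processed with tables that belong to the same configuration version.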

Error Handling

Dealing with errors when managing eBPF can be tricky. It’s important to apply configuration updates in a way that prevents inconsistencies. An error in the middle of an update can leave the configuration in an unclear state, so automatic backups help reduce the need for manual fixes.

You can sort errors into two types: recoverable and unrecoverable. With recoverable errors, if something goes wrong during an update, you can simply stop and no changes are made. You can fix any errors without risk.

Unrecoverable errors are a bit trickier. You need to be careful with them as they impact specific configuration entities, which could disrupt the entire system.

It’s better to organize updates by configuration entity rather than update type. This way, if an error happens, it affects only the specific configuration entity, not everything at once. For instance, if different network segments have defined classification rules and security policies, it’s more effective to update them in separate cycles based on network segments than by update type. This makes it easier to handle automatic backups, and if an unrecoverable error happens, you know exactly where the impact is. Only one part of the network will have an inconsistent configuration, while the rest are either unaffected or can be quickly switched to a new configuration.
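One way to picture updates organized by configuration entity is the Python sketch below. Each entity (here, a network segment) is backed up, updated and, on failure, restored on its own. The function names, the segment names and the use of a deep copy as the "automatic backup" are assumptions made for illustration.

```python
import copy


def update_entity(config, entity, apply_changes):
    """Update one entity; restore its backup if the update raises."""
    backup = copy.deepcopy(config[entity])
    try:
        apply_changes(config[entity])
    except Exception:
        config[entity] = backup  # the failure stays confined to this entity
        return False
    return True


def update_all(config, changes_by_entity):
    """Run one update cycle per entity and report which ones succeeded."""
    return {
        entity: update_entity(config, entity, apply_changes)
        for entity, apply_changes in changes_by_entity.items()
    }
```

The returned dictionary tells the operator exactly which segment failed, while the other segments have already moved to their new configuration.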

Managing eBPF Program Life Cycles for Updates

Keeping track of an eBPF program’s life cycle is key for programs requiring persistence, frequent updates and state retention across different code instances. For example, if an XDP program requires frequent code updates while maintaining existing client sessions, it is essential to manage its lifetime effectively.

For developers who want to maximize flexibility and avoid constraints, the goal should be to keep only the vital info between reloads — the data that can’t be sourced from nonvolatile storage. This way, you can make dynamic configuration adjustments with eBPF maps.

To make the hot code reload process more straightforward, you need to be able to tell state maps apart from configuration maps, reuse state maps during reloads and refill configuration maps from nonvolatile storage. Transitioning processing from an old to a new program and informing all eBPF map users about the change can be a bit of a headache.
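A minimal Python model of that split might look like the following. State maps are carried over as the same objects, while configuration maps are rebuilt from storage. The function name, the map names and the `load_config_maps` callback are hypothetical.

```python
def hot_reload(old_maps, state_map_names, load_config_maps):
    """Build the map set for a new program instance.

    Configuration maps are refilled from nonvolatile storage via
    `load_config_maps`; state maps are reused as-is so that live state,
    such as client sessions, survives the reload.
    """
    new_maps = dict(load_config_maps())
    for name in state_map_names:
        if name in old_maps:
            new_maps[name] = old_maps[name]  # same object, not a copy
    return new_maps
```

For example, `hot_reload(maps, ["sessions"], read_rules_from_disk)` would keep the session table and rebuild everything else, where `read_rules_from_disk` is a placeholder for whatever loader the deployment uses.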

There are two commonly used approaches to achieve the transition:

Atomic program replacement: This method involves directly attaching the XDP program to a network interface and swapping it out atomically during updates. This might not be the best fit for big, complex eBPF programs that interact with lots of userspace programs and maps.
libxdp-like approach: A dispatcher program is attached to the network interface and uses tail calls to hand each packet to the actual processing program stored in the prog map. Besides managing map usage and pinning, the dispatcher coordinates multiple processing programs, enabling quick transitions between them.

The network interface card (NIC) attaches to the dispatcher, prog map and state maps leading to the actual program config.

The hot reload process enables rapid detection and correction of configuration issues, quickly reverting to a previous stable version when needed. For complex scenarios like A/B testing, the dispatcher can use a classification table to direct specific traffic flows to a new version of an XDP program.
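The dispatcher behavior described here can be modeled in Python as follows. Programs are plain callables, the flow table plays the role of the classification table used for A/B testing, and a rollback is simply re-activating the previous slot. The class, its method names and the "pass" fallback are assumptions of this sketch, not libxdp’s API.

```python
class Dispatcher:
    def __init__(self):
        self.prog_map = {}       # slot -> processing program (a callable here)
        self.active_slot = None  # slot that handles ordinary traffic
        self.flow_table = {}     # flow key -> slot, used for A/B testing

    def load(self, slot, program):
        self.prog_map[slot] = program

    def activate(self, slot):
        """Switch ordinary traffic to `slot`; return the previous slot."""
        previous = self.active_slot
        self.active_slot = slot
        return previous

    def handle(self, flow_key, packet):
        # Flows listed in the flow table go to their assigned version,
        # everything else goes to the active one.
        slot = self.flow_table.get(flow_key, self.active_slot)
        program = self.prog_map.get(slot)
        if program is None:
            return "pass"  # assumed fallback when no program is loaded
        return program(packet)
```

Keeping the slot returned by `activate` makes reverting to the previous stable version a single call.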

Conclusion

Through eBPF/XDP programming, Gcore has pushed the boundaries of network security and performance optimization. Our journey showcases our dedication to combatting emerging threats with advanced eBPF/XDP features. As we keep improving our packet processing core, we’re dedicated to delivering cutting-edge solutions that help keep our customers’ networks robust and agile.

Ivan Koveshnikov is a passionate networking software developer focusing on intermediate and server-side programs. He has over six years of experience in DDoS protection, starting from the L3 layer and deep into L7 protection, such as HTTP or gaming applications,...