Protect Sensitive Data and Prevent Bad Practices in Apache Kafka

Companies are increasingly streaming sensitive data within Apache Kafka; here’s how to help secure it against breaches, misuse and compliance failures.

Mar 21st, 2024 6:00am by James White

Featured image by James White.

In sectors such as finance, healthcare and retail, Apache Kafka usage increasingly includes streaming personally identifiable information (PII) and other sensitive data inside and outside the network. Customer data in Kafka signals deeper adoption of streaming, which is great news, but it also raises the question: is Kafka’s infrastructure adequately secured?

Unfortunately, Kafka isn’t secure out of the box. Its built-in access control lists (ACLs) are suitable for a small number of applications, but managing them is cumbersome and time-consuming for larger organizations with complex access control requirements. There is also no built-in way to handle sensitive information that must be hidden for privacy reasons yet remain accessible enough for the development team to debug issues. Enterprises also increasingly need to share Kafka data with third parties, which introduces additional security concerns.

If you’re tasked with implementing the Kafka security roadmap in your organization, you must consider both your present security requirements and how they will change as you scale. Below are a few things to consider when establishing your Kafka security posture, including access control, PII masking and data-sharing best practices.

Using ACLs for Securing Kafka Can Be a DevOps Headache

Kafka ACLs are adequate for development, testing and smaller deployments but can become too complex to manage in production in large organizations with many users, topics and applications. This is primarily because ACLs are verbose and must be written very specifically: whenever a user or group is added or changed, you must add or remove ACL entries, and the access lists become increasingly hard to maintain.
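To make the maintenance burden concrete, here is a rough sketch that counts the explicit entries needed when every principal gets its own rule per topic and operation. The `AclEntry` class and `expand_acls` helper are hypothetical simplifications for illustration, not Kafka APIs, and the user and topic names are made up.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class AclEntry:
    """Hypothetical, simplified stand-in for a single ACL rule."""
    principal: str
    topic: str
    operation: str


def expand_acls(principals, topics, operations):
    """Return one explicit entry per (principal, topic, operation) combination."""
    return [
        AclEntry(f"User:{name}", topic, operation)
        for name, topic, operation in product(principals, topics, operations)
    ]


# Illustrative names only.
team = ["alice", "bob", "carol"]
topics = ["orders", "payments", "refunds", "shipments"]
operations = ["Read", "Describe"]

entries = expand_acls(team, topics, operations)
print(len(entries))  # 3 users * 4 topics * 2 operations = 24 entries

# Onboarding one more user adds an entry for every topic/operation pair.
with_new_user = expand_acls(team + ["dave"], topics, operations)
print(len(with_new_user) - len(entries))  # 8 additional entries
```

Kafka’s prefixed and wildcard resource patterns can reduce this growth, but each change to a team or topic layout still has to be translated into rule changes by hand.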

Another weakness with ACLs is that they don’t cover all of the resources in the Kafka ecosystem. While they can control access to Kafka topics and consumer groups, they can’t control access to other components in the ecosystem such as Kafka Connect and Schema Registry.

ACLs are also quite rigid, adding friction to security processes in enterprise use cases. Enterprises often need to grant and revoke access dynamically based on user roles, groups and other contextual information. This is a time-consuming manual process when working directly with ACLs. When granular permissions are inconvenient to implement, it’s tempting to grant blanket permissions to a user to reduce the number of access requests, which opens the door to potential data misuse.

Consider Kafka Security Practices as You Scale

While access control is a significant security factor, it’s not the only consideration for planning and implementing data security in Kafka. You must also consider the configuration of your deployment, methods of restriction, PII masking, data encryption and how and where your data is shared.

Add Guardrails Around Kafka’s Configuration Complexity

Figure: Guardrails around Kafka configuration. Source: Conduktor

Kafka is a complex system with complex configurations, and no matter how careful you are, mistakes happen. The best way to mitigate this is to add guardrails that make it harder to mess things up. Placing a layer of abstraction between you and the configuration and ACLs helps ensure the configuration works as you intend and highlights potential issues.

This approach is also beneficial for Kafka producer settings. Although having a large number of producer settings provides flexibility, it presents risks such as misconfiguring batch sizes, creating performance bottlenecks and introducing compatibility issues. By contrast, having a set of rules that ensures acceptable producer configurations simplifies the producer configuration process, which greatly reduces the chances of making mistakes.
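As a minimal sketch of what such a rule set could look like, the function below checks a producer configuration against a small policy before the producer is allowed to start. The setting names (`acks`, `compression.type`, `batch.size`) are real Kafka producer settings, but the policy format, the thresholds and the `check_producer_config` helper are assumptions made for illustration.

```python
# Hypothetical policy: allowed values or numeric bounds per producer setting.
POLICY = {
    "acks": {"allowed": {"all"}},
    "compression.type": {"allowed": {"lz4", "snappy", "zstd"}},
    "batch.size": {"min": 16_384, "max": 1_048_576},
}


def check_producer_config(config, policy):
    """Return a list of policy violations; an empty list means the config passes."""
    violations = []
    for key, rule in policy.items():
        if key not in config:
            violations.append(f"{key}: required by policy but not set")
            continue
        value = config[key]
        if "allowed" in rule and value not in rule["allowed"]:
            violations.append(f"{key}={value!r}: expected one of {sorted(rule['allowed'])}")
        if "min" in rule and value < rule["min"]:
            violations.append(f"{key}={value!r}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            violations.append(f"{key}={value!r}: above maximum {rule['max']}")
    return violations


good = {"acks": "all", "compression.type": "lz4", "batch.size": 65_536}
bad = {"acks": "0", "batch.size": 64}

print(check_producer_config(good, POLICY))  # no violations
for problem in check_producer_config(bad, POLICY):
    print(problem)
```

A check like this could run in a shared client wrapper or in a proxy in front of the cluster; where it lives is a design choice the article leaves open.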

Enforce Time-Scoped Access for a Specific Purpose

Developers creating or maintaining Kafka producers and consumers often need visibility into what’s happening inside the Kafka cluster for development and debugging purposes.

For example, if a developer is working on a streaming application for a retail company’s checkout team and something’s broken, they may need to inspect the state of a topic or look at the details of a failed-purchase message to understand and fix the problem. In such situations, making developers jump through hoops of filing access tickets will cause frustration and delay the resolution of the problem.

However, granting unlimited access to developers can produce unintended consequences; developers do make mistakes, and the best way to mitigate their impact is to reduce the time window for these mistakes to happen. This, alongside implementing least-privilege access to key resources, prevents a developer from accidentally performing an action outside their current remit that could negatively impact a Kafka cluster’s health.
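One way to picture time-scoped, least-privilege access is a grant that names a single user, topic and operation and carries its own expiry. The sketch below is illustrative only: `Grant`, `grant_temporary` and `is_allowed` are hypothetical names, and a real deployment would enforce this in its access-management tooling rather than in application code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical record of one narrowly scoped, expiring permission."""
    user: str
    topic: str
    operation: str
    expires_at: datetime


def grant_temporary(user, topic, operation, ttl, now=None):
    """Create a grant that stops being valid once `ttl` has elapsed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return Grant(user, topic, operation, now + ttl)


def is_allowed(grants, user, topic, operation, now=None):
    """Allow the action only if a matching, unexpired grant exists."""
    if now is None:
        now = datetime.now(timezone.utc)
    return any(
        g.user == user
        and g.topic == topic
        and g.operation == operation
        and now < g.expires_at
        for g in grants
    )


# Illustrative scenario: two hours of read access to debug failed purchases.
start = datetime(2024, 3, 21, 6, 0, tzinfo=timezone.utc)
grant = grant_temporary(
    "dev-anna", "checkout.failed-purchases", "Read", timedelta(hours=2), now=start
)

print(is_allowed([grant], "dev-anna", "checkout.failed-purchases", "Read",
                 now=start + timedelta(hours=1)))  # within the window
print(is_allowed([grant], "dev-anna", "checkout.failed-purchases", "Read",
                 now=start + timedelta(hours=3)))  # after expiry
```

Because the grant expires by itself, nobody has to remember to revoke it once the incident is resolved.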

Distinguish Between Human and Application Access

Figure: Human vs. application access. Source: Conduktor

ACLs and service accounts are usually sufficient for managing an application’s access to Kafka (except for the management complexity they introduce with a large number of topics and applications). Application access can be tightly scoped and mostly stays the same over time, so there’s no need to frequently adjust settings and potentially introduce a new problem.

Human access is different, however, and should be handled with a different set of tools. Human access control should be focused on the specific action that people are allowed to take and how long it will take them to do it. Human permissions will likely change more often, so tools like Conduktor Console can streamline permissions management and reduce the chance of mistakes being made.

Mask Personally Identifiable Information

While granting developers full access to production data speeds up debugging and resolving issues, it also creates potential compliance risk. If there is PII in a production topic, giving anyone access to the raw data — including developers who need to fix an issue — can violate user privacy laws.

The solution to this is PII masking. Masking the sensitive fields in a data set lets your developers see the information they need to solve a problem while protecting sensitive personal information. This helps ensure development and debugging are not hindered while you remain in compliance with the law.
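To illustrate the idea, the sketch below masks selected fields of a message before it is shown to a developer, keeping only the last few characters visible. The field names, the record layout and the `mask_record` helper are assumptions for the example; the article does not prescribe a particular masking scheme.

```python
def mask_value(value, keep_last=4, mask_char="*"):
    """Replace all but the last `keep_last` characters with `mask_char`."""
    text = str(value)
    if len(text) <= keep_last:
        return mask_char * len(text)
    return mask_char * (len(text) - keep_last) + text[-keep_last:]


def mask_record(record, sensitive_fields):
    """Return a copy of `record` with sensitive fields masked, recursing into nested dicts."""
    masked = {}
    for key, value in record.items():
        if isinstance(value, dict):
            masked[key] = mask_record(value, sensitive_fields)
        elif key in sensitive_fields:
            masked[key] = mask_value(value)
        else:
            masked[key] = value
    return masked


# Hypothetical failed-purchase message.
message = {
    "order_id": "A-1001",
    "status": "FAILED",
    "customer": {
        "email": "jane@example.com",
        "card_number": "4111111111111111",
    },
}

safe_view = mask_record(message, {"email", "card_number"})
print(safe_view["status"])                    # non-sensitive fields are unchanged
print(safe_view["customer"]["card_number"])   # only the last four digits remain
```

The developer still sees the order ID and failure status needed for debugging, while the original message is left untouched.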

Manage Encryption and Certificates Properly

Figure: Managing encryption and certificates. Source: Conduktor

Once you start handling customer data, end-to-end encryption of your data as it moves around your infrastructure becomes a requirement. As your Kafka deployments — and the number and variety of clients accessing them — scale, certificate lifecycle management can get complicated. Tools written in different programming languages or using different key management services (KMSs) can lead to higher integration complexity and result in technical debt.

A common way to address compatibility concerns when encrypting data is to use a standardized library (or set of libraries) for Kafka access across the organization and implement the right encryption choices within that library, making it transparent to the developer. This is where many teams start on their encryption journey, but this approach may create technical debt as adoption of the library (or libraries) grows.

Another way to handle encryption is to allow any encryption library but mandate the use of middleware, such as a Kafka proxy, to connect to the Kafka cluster. This proxy is an additional architecture piece that you have to maintain, but your team members can use any connection libraries they want without running into compatibility issues.

Safely Share Kafka Data Outside Your Organization

It’s increasingly common to use Kafka for sharing data outside your organization, but that presents new pain points for architects and DevOps teams tasked with protecting their infrastructure and preventing data from being shared incorrectly.

For example, an e-commerce brand may wish to give affiliate partners real-time access to specific parts of sales data while protecting sensitive PII they cannot legally share and sensitive business data they do not want to disclose.

Rather than giving third parties access to internal Kafka infrastructure, you can provide a separate Kafka deployment outside your network replicating only the data to be shared.

Figure: Infrastructure architecture. Source: Conduktor

However, this brings additional infrastructure overhead (including another set of ACLs to monitor and manage), and replication might cause uncertainty about a single source of truth.

Rather than using a replica, proxying access to the main Kafka cluster(s) is a more robust and flexible approach. The proxy can enforce access controls and expose only the required subset of data, which is always up to date because it is read directly from the source cluster rather than from a copy.
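Conceptually, the proxy applies two filters to each message: which records a partner may see, and which fields of those records. The sketch below shows that logic on plain dictionaries; `shared_view`, the field names and the affiliate IDs are hypothetical, and a real proxy would apply the same rules to the Kafka stream itself.

```python
def shared_view(records, allowed_fields, is_shareable):
    """Yield only shareable records, reduced to an explicit allowlist of fields."""
    for record in records:
        if is_shareable(record):
            yield {key: record[key] for key in allowed_fields if key in record}


# Hypothetical sales events containing PII and internal business data.
sales = [
    {"order_id": 1, "affiliate_id": "aff-7", "sku": "TSHIRT-M", "amount": 19.99,
     "customer_email": "a@example.com", "margin": 0.42},
    {"order_id": 2, "affiliate_id": "aff-9", "sku": "MUG", "amount": 8.50,
     "customer_email": "b@example.com", "margin": 0.35},
]

# The partner sees only its own orders, without email addresses or margins.
partner_view = list(
    shared_view(
        sales,
        allowed_fields=["order_id", "sku", "amount"],
        is_shareable=lambda record: record["affiliate_id"] == "aff-7",
    )
)
print(partner_view)
```

Using an allowlist rather than a blocklist means a field added to the topic later stays hidden from partners until someone deliberately shares it.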

Shift Your Focus to Security Sooner

In the early stages of developing your application and its supporting infrastructure, you often focus on ease of adoption and convenience. Different teams need ready access to both the application and the infrastructure while developing and testing their applications and building out their supporting toolchains. It’s usually only as you move toward production with real data that security becomes a priority.

This can lead to big problems in the final stages of development and deployment: code stops working as expected as you introduce security policies, unforeseen security holes are discovered during penetration tests and processes must be reworked to include data access requests and ACL updates. This effect amplifies as your use cases grow and other departments or parties request access to data.

For these reasons, it’s far better to implement your Kafka security policies as early as possible. Securing your data both in motion and at rest streamlines this process, and — along with implementing the Kafka security best practices detailed above — helps protect against common misconfiguration and access problems that lead to data leaks, accidental disclosure and intrusions.

If you are concerned about the security of your Kafka deployments and want to learn about centrally managing them, book a Conduktor demo today.

James White is a Director of Product at Conduktor, where he is focused on building products to equip organizations to effectively manage their Kafka ecosystem. He has worked at the intersection of data and product management for nearly 10 years,...