
Foundational Concepts in Platform Engineering 

The interconnected relationship between engineering principles and product domains is key to the strategic alignment needed to construct a cohesive tech ecosystem.
Mar 27th, 2025 1:00pm by
Image from Denis Belitsky on Shutterstock.
Editor’s note: This article is an excerpt from the Manning Early Access Program (MEAP) book Effective Platform Engineering by Ajay Chankramath, Bryan Oliver, Sean Alvarez and Nic Cheneweth. In MEAP, you read a book chapter by chapter while it’s being written and get the final ebook as soon as it’s finished.

A platform engineering team will work to enable practices in the organization that will increase efficiency while enhancing and maintaining the practices that ensure the “ilities” of software delivery such as:

  • Maintainability
  • Security
  • Scalability
  • Reliability
  • Extensibility
  • Recoverability
The high-level mental model of platform engineering has three essential components: product domains, underlying engineering principles and product management.

What is Platform Engineering?

Platform engineering is a discipline orchestrated by the seamless collaboration of three key elements:

  • Strong product management surrounding what the platform engineers are building
  • An emphasis on finding and maintaining the implementation domains within the product
  • And the consistent application of platform engineering principles.

At its core, product management acts as the visionary architect, shaping the purpose and trajectory of the platform. It forms the backlog for building an engineering platform and prioritizes that backlog according to customer needs.

This strategic blueprint guides the specialized efforts of product domains, each contributing its expertise to construct a cohesive technological ecosystem. Product domains can include:

  • Identity and access management (IAM)
  • Networking
  • Security
  • Compliance and governance
  • Cloud Runtimes (such as Kubernetes)
  • Observability

Where Engineering and Product Intersect

It’s important to understand the relationship between engineering principles and product domains. Engineering principles embody our overarching goals, such as maintaining loose coupling and enabling independent releases within and between platform teams. The specific practices used to achieve these outcomes, such as domain-driven design (DDD), act as the applied methodologies. For instance, preserving flexibility in technology and implementation choices aligns with the practice of evolutionary architecture.

This interconnected relationship between principles and domains underscores the strategic alignment of overarching goals with the tactical application of specific practices to achieve them.

However, the platform engineering team itself will need a good understanding of these principles so that they can be embedded in the design and execution of anything produced. The book Effective Platform Engineering describes these foundational principles and how they fit into a well-designed engineering platform.

Platform Product Management

All platform engineering is done with the rigor of platform product management and a well-defined product life cycle. The result is typically referred to as a platform product. The product management lens ensures that the platform you are building does not turn out to be an incoherent collection of automation that is difficult to manage and scale.

A platform engineering team builds out platform product capabilities used by the software developers responsible for the functional code and practicing the DevOps culture.

Site reliability engineers (SREs) will also use these capabilities as they maintain the code and improve its reliability. Delivering and managing an engineering platform as a product for internal engineering teams, with the same practices used to deliver products to external customers, helps ensure that the needs of those internal customers are met, increasing adoption and, ultimately, value.

The engineering platform product is divided into eight domains. Each of the domains is built with the six unique engineering principles. It is important for us to understand the principles before we talk about the domains.

Platform Engineering Principles

The essential principles of platform engineering include:

  • Observability
  • Continuous deployment
  • Self-service functionality
  • Compliance and governance
  • Cost and sustainability
  • And security

Learning about these principles is crucial to ensure that platform engineering practices enhance the efficiency, reliability and scalability of software delivery within an organization.

The platform engineering team is assumed to execute like any other development team within the organization. Practices such as “everything defined as code” and continuous testing are assumed throughout. To execute platform engineering in the true sense of software engineering, several core principles should be followed.

Expanded view of the principles involved in the platform engineering mental model.

Software-defined sits in the middle of the model because it is core to every other concern. The remaining principles are arranged in a circle because each should be continuously revisited as the others evolve.

For example, as you evolve self-service functionality, you’ll likely need to look at the observability of that — not just for function, but to ensure runtime costs haven’t gone up too high as a result of your users being able to deploy more. That may lead to looking at governance and compliance to keep those costs down.

By keeping each of these principles in the platform team’s core ways of working, the chances of success for any initiative will go up significantly.

Observability

Observability in software development is generally considered a way to understand the state of your system’s internals by inferring it from what is visible externally. Developers and teams sometimes conflate monitoring and observability.

While monitoring refers to the system’s health, observability focuses on functional correctness. Similarly, while monitoring tells you if the system is operational at a given time, observability tells you whether the system could become unhealthy.

While monitoring is key for any system, software or otherwise, it inherently tells you about a failure after the fact. By then the end-user experience is already suboptimal, because users have very likely encountered the issue.

Observability, on the other hand, is about being proactive. It achieves that through its inherent nature of using inferences to develop insights and actions to ensure your end users do not experience the issues.
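The contrast between the two can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the latency metric, the 500 ms threshold and the function names are hypothetical, and a simple linear trend stands in for whatever inference a real observability stack would use.

```python
from statistics import mean
from typing import Optional


def is_healthy(latency_ms: float, threshold_ms: float = 500.0) -> bool:
    """Monitoring-style check: is the latest reading within the threshold right now?"""
    return latency_ms <= threshold_ms


def slope(samples: list[float]) -> float:
    """Least-squares slope of samples taken at evenly spaced intervals."""
    n = len(samples)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def intervals_until_breach(
    samples: list[float], threshold_ms: float = 500.0
) -> Optional[float]:
    """Observability-style inference: estimate how many sampling intervals remain
    before the threshold is crossed. Returns None when there is no upward trend."""
    if not samples:
        return None
    rate = slope(samples)
    if rate <= 0:
        return None
    return max(0.0, (threshold_ms - samples[-1]) / rate)


if __name__ == "__main__":
    readings = [100.0, 200.0, 300.0, 400.0]  # hypothetical latency samples
    print("healthy now:", is_healthy(readings[-1]))
    print("intervals until breach:", intervals_until_breach(readings))
```

The system in this example still passes the monitoring check, yet the trend suggests it is about to stop doing so, which is the kind of early signal the proactive view aims for.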

In our definition of observability for engineering platforms, we look at it far beyond typical observability for applications and infrastructure. Instead, we look at it across multiple axes such as portfolio of applications, platforms, cloud health, incidents, service health and most importantly, business operations.

Continuous Deployment via Pipeline (CI/CD)

Continuous integration/continuous delivery (CI/CD) is a set of practices and techniques in a typical software development life cycle. CI/CD is used to integrate the software written by multiple developers and teams frequently and continuously, with the aim of generating faster feedback loops and ensuring that the software written by individual developers works together functionally as a product.

While these terms (CI and CD) are usually referred to together, the functions refer to two parts of the software build and delivery process. CI focuses on integrating the code changes in one repository, while CD refers to delivering software to various environments with appropriate gating as defined by the development teams.

Continuous deployment, the automated process of releasing the changes to production, is often referred to in this context.

Self-Service Functionality

Self-service functionality should be an intrinsic trait of all capabilities the platform exposes. In a small startup organization, everyone has access to and the ability to change/modify/deploy any app or environment. In this case, decoupling engineers from systems has no value because there is no ticket system we are trying to get away from.

The decoupling becomes valuable as teams and responsibilities mature, and the org evolves into submitting requests or tickets to an emerging “DevOps-like” team. Self-service APIs should be developed to eliminate roadblocks to usage by autonomously executing teams.
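A minimal sketch of what such a self-service API might look like follows. The class, its methods and the size guardrail are assumptions made for illustration, and an in-memory dictionary stands in for real provisioning.

```python
from dataclasses import dataclass

# Hypothetical guardrail set by the platform team.
ALLOWED_SIZES = {"small", "medium", "large"}


@dataclass(frozen=True)
class EnvironmentRequest:
    team: str
    name: str
    size: str = "small"


class SelfServiceAPI:
    """Teams call this directly instead of filing a ticket with the platform team."""

    def __init__(self) -> None:
        self._environments: dict[tuple[str, str], EnvironmentRequest] = {}

    def create_environment(self, request: EnvironmentRequest) -> str:
        """Validate the request against platform guardrails, then record it."""
        if request.size not in ALLOWED_SIZES:
            raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
        key = (request.team, request.name)
        if key in self._environments:
            raise ValueError(f"environment {request.name!r} already exists")
        self._environments[key] = request
        return f"{request.team}/{request.name}"

    def list_environments(self, team: str) -> list[str]:
        """Return the names of the environments owned by a team, sorted."""
        return sorted(name for (owner, name) in self._environments if owner == team)


if __name__ == "__main__":
    api = SelfServiceAPI()
    print(api.create_environment(EnvironmentRequest("payments", "load-test", "medium")))
    print(api.list_environments("payments"))
```

The point of the design is that the guardrails live in the API itself, so a team can act autonomously without a human approval step in the request path.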

Compliance and Governance

Compliance and governance in platform engineering are approached differently than in a standard enterprise. One of the goals of platform engineering is to enable teams to work autonomously so that the platform team does not become a bottleneck on their ability to deliver software.

As one might expect, this goal directly opposes the goals of your compliance, security, or governance team. Thus, when we discuss compliance and governance here, we will approach it from an enabling perspective, always looking for ways to align the goals of these teams (platform and compliance) and reduce friction for the platform’s developers and users.

We do this by instrumenting automated capabilities in the tools and systems provided to teams and users of the platform. Teams should have complete control and administrative rights over their process and how they self-verify and govern their software.

The platform is not responsible for compliance and governance while writing software. This allows the platform team to focus on compliance verification as opposed to prescribing how compliance is done for each team. Compliance at the point of change allows the platform team to verify software as it enters the environment and removes that same team from the processes and procedures with which individual teams developed that same software.
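Compliance at the point of change can be pictured as an admission check: the platform verifies that evidence exists when a change enters the environment, without prescribing how each team produced it. The sketch below assumes a hypothetical set of required evidence names and a hypothetical `Change` record.

```python
from dataclasses import dataclass, field

# Hypothetical evidence the platform requires before admitting a change.
REQUIRED_EVIDENCE = frozenset({"unit-tests", "dependency-scan", "image-signed"})


@dataclass
class Change:
    service: str
    # Evidence the owning team attached; how they produced it is up to them.
    evidence: set[str] = field(default_factory=set)


def admit(change: Change, required: frozenset[str] = REQUIRED_EVIDENCE) -> tuple[bool, list[str]]:
    """Verify at the environment boundary that all required evidence is present.

    Returns (admitted, missing), where missing lists the absent evidence in sorted order.
    """
    missing = sorted(required - change.evidence)
    return (not missing, missing)


if __name__ == "__main__":
    change = Change("checkout", evidence={"unit-tests", "image-signed"})
    print(admit(change))
```

The check only inspects what arrives at the boundary, which keeps the platform team out of each team's internal development process.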

Cost and Sustainability

Cost and sustainability in platform engineering are the principles that ensure individual developer responsibility is enabled at all possible steps within the software development life cycle. This starts with provisioning the resources to get your application developed, integrated and built, followed by getting the application tested and deployed. It should not be a matter of optimizing costs after building and deploying your applications. Instead, developers should have access to the right capabilities to build it right in the first place.

Cost considerations in platform engineering draw on both cost optimization (often called “cloud cost optimization” these days) and FinOps, the principles set forth by the FinOps Foundation that break down silos and reduce friction in using cloud resources cost-efficiently, improving the ability of the business to scale. We expect these to be applied in all possible domains and do not see them as independent of your platform engineering ecosystem.

Sustainability is closely related to cost management but focuses on the sustainable and environmentally responsible choices developers make as they craft their products. When the platform provides environmentally responsible, sustainable options through its capabilities, developers can adopt them as their sensible default, reducing the chance that they inadvertently use less-viable options.

Most developers will make the more responsible choice when given one. However, this must be a choice exposed through platform capabilities, as the architecture of the product they are building and the related scaling and performance requirements eventually dictate the right decisions.
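One way to picture a "sensible default" exposed through a platform capability is shown below. The region names and the relative carbon and cost scores are placeholders invented for illustration, not real measurements. The developer gets the most responsible option unless they explicitly override it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegionOption:
    name: str
    relative_carbon: float  # placeholder score, lower is better
    relative_cost: float    # placeholder score, lower is better


# Hypothetical catalog published by the platform team.
CATALOG = [
    RegionOption("region-a", relative_carbon=0.9, relative_cost=1.0),
    RegionOption("region-b", relative_carbon=0.3, relative_cost=1.1),
    RegionOption("region-c", relative_carbon=0.5, relative_cost=0.8),
]


def sensible_default(catalog: list[RegionOption]) -> RegionOption:
    """Pick the lowest-carbon option, using cost to break ties."""
    return min(catalog, key=lambda option: (option.relative_carbon, option.relative_cost))


def choose_region(catalog: list[RegionOption], override: Optional[str] = None) -> RegionOption:
    """Return the sensible default unless the developer names another option."""
    if override is None:
        return sensible_default(catalog)
    for option in catalog:
        if option.name == override:
            return option
    raise ValueError(f"unknown region {override!r}")


if __name__ == "__main__":
    print(choose_region(CATALOG).name)
    print(choose_region(CATALOG, override="region-c").name)
```

The override keeps the final decision with the developer, since the product's architecture and its scaling and performance requirements may rule out the default.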

Security

The approach to security is similar to that of compliance and governance. Teams developing on the platform are responsible for continuously verifying their code is secure while developing and deploying it.

The platform team is then only responsible for verifying that work has been done via compliance at the point of change and that the software added to the environment is secure.

The platform team applies the same principles to development teams regarding the platform’s security. This means applying security checks to the platform team’s pipelines that allow for continuous security verification during the writing of new platform code and verification of that security at the boundary of the environment or deployment of the changes to the platform.

While network rules are helpful in specific situations, they do not provide a complete security strategy, and reliance on them leads to friction between development and platform teams.
