The Thermodynamics of Software Entropy: Why All Code Tends Toward Disorder

The second law of thermodynamics states that entropy in an isolated system never decreases. Software is not exempt. This is an investigation into why disorder is not a failure of discipline but a structural inevitability, and into what follows from accepting that.

In 1850, Rudolf Clausius published the first formulation of what would become the second law of thermodynamics. In its modern form, which relies on the concept of entropy he introduced in 1865, it says that in any natural process the total entropy of an isolated system can only increase or remain constant; it can never decrease. He was describing heat engines. But he was, without knowing it, describing every software project ever written.

Software entropy is not a metaphor. Or rather — it begins as a metaphor, but the further you follow it, the more it starts to look like something deeper: a structural law about the relationship between complexity, time, and the cost of maintaining order against a universe that doesn’t care about your architecture.

The goal here is not to be fatalistic. It is to be precise. Because understanding why disorder accumulates — not just that it does, but the mechanism — is what makes it possible to do something useful about it. And understanding where the thermodynamic analogy holds, and where it breaks down, is what separates clear thinking from a satisfying but ultimately decorative metaphor.

1. Physics: What Thermodynamic Entropy Actually Is

Before applying it to software, it is worth being honest about what thermodynamic entropy actually says, because popular accounts get it subtly wrong in ways that matter for the analogy. Entropy is not “disorder” in some vague colloquial sense. It is a precise statistical quantity: a logarithmic measure of the number of microscopic configurations of a system that are consistent with its observable macroscopic state. High entropy means many possible microstates. Low entropy means few.

A crystal of ice has low entropy: the molecules are arranged in a precise lattice, and there are very few configurations consistent with that macroscopic structure. A glass of water has higher entropy: there are vastly more ways the molecules could be arranged while still being “liquid water at 20°C.” When ice melts, the number of available microstates explodes — entropy increases. This process is irreversible not because it is physically impossible to reassemble the lattice, but because the overwhelmingly vast majority of possible microstates correspond to liquid water, not ice. Probability, not prohibition.
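As a rough illustration of counting microstates, the sketch below swaps molecules for coins (a toy system assumed purely for illustration; the function names are hypothetical). The macrostate is the number of heads, the microstates are the specific head/tail sequences, and entropy follows Boltzmann's formula S = k_B ln W, with k_B set to 1 here.

```python
import math

def microstate_count(n_coins: int, n_heads: int) -> int:
    """Number of distinct head/tail sequences with exactly n_heads heads."""
    return math.comb(n_coins, n_heads)

def boltzmann_entropy(multiplicity: int) -> float:
    """Entropy in units of k_B: S = ln(W)."""
    return math.log(multiplicity)

n = 100
ordered = microstate_count(n, 0)     # all tails: exactly one arrangement
mixed = microstate_count(n, n // 2)  # half heads: the most probable macrostate

print("ordered:", ordered, "S =", boltzmann_entropy(ordered))
print("mixed:  ", mixed, "S =", round(boltzmann_entropy(mixed), 2))
```

For 100 coins the ordered macrostate has exactly one microstate, while the half-heads macrostate has on the order of 10^29. That ratio, not any prohibition, is why a randomly shuffled system is almost never found in the ordered state.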

Second Law of Thermodynamics

Clausius Inequality (1850)

In any isolated system, the total entropy S can only increase or remain constant over time. Formally: dS/dt ≥ 0. The natural direction of all spontaneous processes is toward the macrostate accessible by the greatest number of microstates — toward maximum probability, not minimum energy.

Crucially: order is not prohibited by physics. It is simply astronomically improbable without continuous energy input. A refrigerator creates local order (cold things) at the cost of greater disorder elsewhere (heat expelled into the room). You can fight entropy locally. You cannot win globally.

This is the key insight that carries over to software: disorder is the default not because it is inevitable in some absolute sense, but because there are vastly more disordered states than ordered ones. Every random change to a codebase — every hurried fix, every context-switching engineer, every feature added without refactoring — is far more likely to land in a disordered state than an ordered one, simply because ordered states are rare and disordered states are numerous. Fighting that probability gradient is possible, but it costs energy. Stop spending that energy, and the system drifts toward its equilibrium: maximum entropy.
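The probability argument can be sketched with a toy model. Here a sorted list stands in for an ordered codebase, a random swap stands in for an undirected change, and the number of out-of-order pairs is used as a disorder proxy. All three are assumptions of this sketch rather than measurements of real code.

```python
import random

def inversions(seq):
    """Count out-of-order pairs; 0 means the sequence is fully sorted."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )

def random_swap(seq, rng):
    """Apply one undirected change: swap two randomly chosen positions."""
    i, j = rng.sample(range(len(seq)), 2)
    seq[i], seq[j] = seq[j], seq[i]

rng = random.Random(0)       # fixed seed so the run is repeatable
state = list(range(50))      # the single fully ordered arrangement
history = [inversions(state)]
for _ in range(200):
    random_swap(state, rng)
    history.append(inversions(state))

print("start:", history[0], "end:", history[-1])
```

Nothing in the loop prefers disorder. The drift away from zero inversions happens only because one arrangement out of 50! is sorted, so almost every random step lands somewhere less ordered.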

2. Lehman’s Laws — The Empirical Foundation

The thermodynamic framing is intellectually satisfying, but software entropy is not merely analogical. It has been empirically studied. The most rigorous body of work is Lehman’s Laws of Software Evolution, formulated by Meir M. Lehman from 1974 onward, drawing on studies he began at IBM and continued at Imperial College London, and based on empirical observation of industrial software systems including IBM’s OS/360.

Lehman distinguished three categories of software. S-type programs are derived from a complete, fixed specification — they can in principle be proven correct and do not need to evolve. P-type programs solve a specific real-world problem and may be updated, but their problem domain is stable. E-type programs — evolutionary programs — are embedded in the real world, which itself changes, forcing the software to change with it. His laws apply specifically to E-type systems, which are the overwhelming majority of non-trivial production software.

“The moment you install that program, the environment changes.” (Meir M. Lehman, on why E-type programs cannot be fully specified in advance)

Lehman’s Second Law — Increasing Complexity — states that as an E-type system evolves, its complexity increases unless work is done to maintain or reduce it. This is the software entropy statement in its most precise empirical form. Notice what it says and what it doesn’t. It does not say complexity must increase. It says it does so unless explicit countervailing work is done. The second law of thermodynamics is lurking there: you can create local order, but it costs energy, and without that energy input, the default is increasing disorder.

Equally telling is Lehman’s Seventh Law — Declining Quality — which states that the quality of an E-type system will appear to be declining unless it is rigorously maintained and adapted to operational environment changes. Note the word rigorously. Casual maintenance is not enough. The system’s environment is always drifting away from the assumptions baked into the original design — security threat models evolve, dependencies age out, performance baselines shift, API contracts change. Maintaining quality requires not just keeping the code clean but actively pursuing the widening gap between the system’s model of the world and the world itself.

Lehman’s Eight Laws — Empirical Validation Status

Validation status based on cross-study analysis. Law I (Continuing Change) and Law VI (Continuing Growth) are validated by all studies. Laws II, IV, V show mixed results in open-source contexts. Source: Evolution of the Laws of Software Evolution (2013) and microservices.io analysis.

3. Why Disorder Accumulates — The Real Mechanisms

The thermodynamic and Lehmannic accounts tell us that complexity increases. But to do anything useful with that knowledge, you need to understand the actual mechanisms by which it happens. There are several, and they compound each other.

3.1 The Combinatorial Explosion of Coupling

When a codebase is new and small, the number of potential interactions between components is manageable. But in a fully connected system, the number of potential pairwise interactions between N components is N(N−1)/2, which grows as N². Technical debt is the added complexity that separates the real-world solution from the ideal one. Poorly designed abstractions make it worse: when an abstraction exposes its internal state instead of compressing it, its interface is complex where it should be simple, and it fails to hide the deeper complexity behind it. Every new component added to a tightly coupled system doesn’t add a constant amount of complexity; it adds complexity proportional to the number of existing things it could interact with. This is the square-law growth pattern that Lehman observed empirically in the OS/360 data, and it is purely structural: it follows from the combinatorics of connections, not from any failure of engineering discipline.
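The arithmetic behind the quadratic growth is short enough to check directly. This is a minimal sketch that assumes the worst case, where every component can interact with every other; the function names are hypothetical.

```python
def potential_interactions(n: int) -> int:
    """Pairwise connections among n fully connected components: n(n-1)/2."""
    return n * (n - 1) // 2

def added_by_next_component(n: int) -> int:
    """New potential interactions created by adding one component to n existing ones."""
    return potential_interactions(n + 1) - potential_interactions(n)

for n in (10, 100, 1000):
    print(n, potential_interactions(n), added_by_next_component(n))
```

Ten components have 45 potential pairings and a thousand have 499,500. The marginal cost of the next component equals the number of components already present, which is why module boundaries that cap how many neighbours a component can see change the growth curve rather than just the constant.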

3.2 The Entropy of Team Heterogeneity

Shannon entropy is a measure of information uncertainty. Hassan and Holt applied it to software evolution by defining source code change entropy: for a system composed of source files, the entropy of a period in its evolution is the informational disorder of the distribution of changes across files. The more scattered changes are across the codebase, the harder they are for developers to recall, and the weaker their comprehension of the system becomes. This is a genuinely important insight: entropy in code is not merely a matter of code quality. It is a property of the change history. A file that is modified frequently, by many different developers, in small scattered increments, is more entropically loaded than one modified rarely and cohesively, even if both files look equally clean at any given snapshot.

Furthermore, developers can’t read each other’s minds, and they have different ways of writing code, each with their own style. No matter how rigid the code review and code style practices are, there will always be a divergence of quality, style, and complexity in each iteration. This is the social dimension of software entropy: teams are not a single mind. They are collections of individuals with different mental models, different coding styles, and different intuitions about the right abstraction. Every handoff, every new hire, every rotation introduces a new perspective that is subtly inconsistent with the existing mental model embedded in the code. It’s why Brooks’s Law holds: “Adding manpower to a late software project makes it later.” Each new person increases the energy, or chaos, in a system where chaos is already high.

3.3 The Ratchet of Environmental Drift

A program that is expected to operate in the real world cannot be fully specified, for two reasons: it is impossible to anticipate all the complexities of the real-world environment in which it will run, and equally importantly, the program will affect the environment the moment it starts being used. This is perhaps the most underappreciated source of software entropy. The world your software models is always in motion. Tax laws change. Security threat landscapes shift. Dependencies release breaking changes. User expectations evolve. Operating systems introduce subtle behavioural differences. Your architecture was designed for the world as it existed at a moment in time that is already past. The accumulating gap between that frozen model and the living world is an entropy gradient — and every patch applied to close part of that gap without redesigning the underlying model adds another layer of accretion.

Hassan-Holt Metric (2009)

For a software system S composed of files f₁, f₂, …, fₙ, the entropy H of a period of evolution is defined using Shannon’s formula: H = -Σ p(fᵢ) × log₂(p(fᵢ)), where p(fᵢ) is the proportion of changes affecting file fᵢ. Maximum entropy, H = log₂(n), occurs when changes are uniformly distributed across all files: maximum disorder in the change process. Entropy falls toward zero, indicating a more maintainable change process, as changes concentrate in a few cohesive files, and it reaches zero when every change touches the same file.
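The formula above can be computed in a few lines. The sketch below assumes the change history for a period is available as a flat list of file paths, one entry per change; that input shape and the function name are assumptions of this example, not part of the published metric.

```python
import math
from collections import Counter

def change_entropy(changed_files):
    """Shannon entropy (in bits) of how changes are distributed across files."""
    counts = Counter(changed_files)
    total = sum(counts.values())
    # p * log2(1/p) is the same term as -p * log2(p), written to avoid -0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())

scattered = ["a.py", "b.py", "c.py", "d.py"]   # one change in each of four files
concentrated = ["a.py"] * 7 + ["b.py"]         # almost everything in one file

print(change_entropy(scattered))
print(round(change_entropy(concentrated), 3))
```

Four changes spread evenly over four files give the maximum of log₂(4) = 2 bits, while the concentrated history scores well under one bit. In practice the list could be built from version-control history for whatever period is being measured.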

Research by Keenan et al. (PROFES 2022) empirically confirmed that non-refactoring changes significantly increase file entropy over time, and that refactoring interventions measurably decrease it — but that the natural drift is always toward higher entropy without active intervention.

4. Is Perfect Architecture Thermodynamically Impossible?

This is the question the analogy ultimately forces us to confront, and it deserves a careful answer rather than a rhetorical one. The thermodynamic framing suggests that “perfect architecture” — a codebase that stays clean and maintainable indefinitely without ongoing investment — is not merely unlikely but physically impossible for any non-trivial E-type system. Is that right?

The argument runs as follows. A “perfect” architecture is one with very low entropy: a specific, ordered configuration among the vast space of possible configurations. Maintaining that ordered state requires constant energy input to push back against the entropic gradient. The environment changes continuously, which means the requirements for what “ordered” even means keep shifting. A design that was perfectly adapted to the world of 2018 is not perfectly adapted to the world of 2026, and the work of re-adapting it is not zero. If you stop doing that work, entropy increases. If you never stop, you are committed to perpetual software maintenance, which is, in fact, exactly what every healthy engineering organization does.

The Perfect Architecture Impossibility

A perfectly ordered architecture in a changing environment is analogous to a perfectly crystalline structure in a warming room. It is not prohibited — it is just thermodynamically unstable. Maintaining it requires continuous energy input proportional to the rate of environmental change. When that input stops, the system drifts toward equilibrium — which is not the crystalline state. Architecture is not a noun. It is a verb.

However, the analogy has an important limit here that honest analysis requires acknowledging. In physics, the guarantee that entropy never decreases applies to an isolated system, one closed to any exchange with its surroundings. Software systems are not closed. They can be redesigned from the outside. A greenfield rewrite, a radical refactoring, a domain model replacement: these are not possible for a steam engine once it is running, but they are possible for software. The open-system nature of software is what makes it, in principle, more tractable than a strict thermodynamic reading would suggest.

Nevertheless, the practical reality is that most software systems are never actually rewritten. There is no “build it once and it’s done” — software requires constant maintenance. Home ownership is a good metaphor: leave a house for half a year without maintenance and it becomes uninhabitable. The option to rewrite exists but is rarely exercised, because the cost is high, the risk is high, and the business case is hard to make to stakeholders who do not experience the accumulated entropy directly. So in practice, most software systems behave like closed systems — the entropy accumulates, and the correction never comes.

Entropy Accumulation Rate vs. Refactoring Investment

Conceptual model of system complexity over time under three maintenance regimes. Based on empirical patterns observed in Keenan et al. (2022) and Stepsize technical debt research (2022).
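The three regimes can be approximated with a toy simulation. This is a sketch under stated assumptions: complexity is a single number, each step adds a constant `drift`, and maintenance removes a fraction of whatever has accumulated. The parameter names and values are hypothetical and are not fitted to the cited studies.

```python
def simulate(steps, drift, continuous_rate=0.0, refactor_every=None, refactor_cut=0.0):
    """Return the complexity trajectory of a toy system under one maintenance regime."""
    complexity = 1.0
    trajectory = [complexity]
    for step in range(1, steps + 1):
        complexity += drift                          # undirected change adds disorder
        complexity -= continuous_rate * complexity   # ongoing maintenance removes a fraction
        if refactor_every and step % refactor_every == 0:
            complexity *= 1.0 - refactor_cut         # periodic refactoring removes a chunk
        trajectory.append(complexity)
    return trajectory

neglected = simulate(100, drift=1.0)
periodic = simulate(100, drift=1.0, refactor_every=20, refactor_cut=0.5)
continuous = simulate(100, drift=1.0, continuous_rate=0.05)

for name, run in (("neglected", neglected), ("periodic", periodic), ("continuous", continuous)):
    print(name, round(run[-1], 2))
```

Under these made-up parameters the neglected system grows without bound, the periodically refactored one follows a sawtooth, and the continuously maintained one levels off near drift × (1 − rate) / rate = 19. The shapes, not the numbers, are the point.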

5. Where the Metaphor Breaks Down

Intellectual honesty requires confronting the limits of the analogy, because treating it as more than an analogy leads to a kind of fashionable fatalism that is not useful. The second law of thermodynamics is a universal physical law with no known exceptions. Software entropy is an empirical tendency with known exceptions and known countermeasures. The distinction matters.

For one thing, empirical studies of open-source software have produced mixed results on Lehman’s Laws. Research on the Linux kernel found that the superlinear growth pattern in complexity stopped with release 2.5 — from that release on, growth has been linear. Laws II (increasing complexity), IV (conservation of organizational stability), and V (conservation of familiarity) have been invalidated for most open-source cases studied. The implication is significant: the open-source model — distributed, decentralized, cross-functional, with high committer longevity and architectural meritocracy — appears to be genuinely more resistant to entropy accumulation than centralized hierarchical development models. Architecture, process, and culture are real countermeasures. The universe is not indifferent to your coding standards; it just doesn’t enforce them for you.

The Key Disanalogies

Holds: The direction of spontaneous change (toward disorder) in the absence of maintenance energy. The cost of creating local order (good architecture requires work). The statistical predominance of disordered states over ordered ones in the space of possible code changes.

Breaks down: Software systems are open, not closed — radical restructuring is possible. Entropy can be genuinely reversed by refactoring, not just slowed. The “microstates” of software are not objective — they depend on human cognitive models of what “order” means, which change over time. And unlike heat, software disorder is not evenly distributed: architectural debt concentrates in specific components, not uniformly across the system.

6. Practical Consequences

The practical consequence of taking software entropy seriously — not as metaphor but as structural tendency — is a different way of thinking about architectural investment. Maintenance is not a tax on feature development. It is the energy input that keeps a non-equilibrium system from collapsing toward its equilibrium state. Framing it as overhead is like calling refrigeration “a tax on temperature.” Without it, you get warm milk.

Furthermore, the entropy framing gives you a way to reason about the rate of decay. A system whose environment is changing rapidly (a consumer product in a competitive market) is a system under high entropic pressure: the gap between the current model and the required model is widening quickly. That system needs proportionally more maintenance investment just to stay still, let alone improve. A system whose environment is stable (a batch processing tool with fixed inputs and outputs) is under low entropic pressure: the maintenance cost is low and the architecture can stay clean with minimal investment. The growth rate of software entropy appears to track the growth rates of technology, software markets, and companies, all of which are growing fast. This is not a coincidence. It is the thermodynamic gradient at work.

The Refrigerator Principle of Software Architecture

You cannot prevent entropy in an open system. But you can build an efficient refrigerator. The engineering question is not “how do I eliminate complexity?” but “what is the minimum energy input required to maintain the level of order my system needs?” That reframing shifts the conversation from architectural idealism to architectural thermodynamics — from “this code should be clean” to “this code is worth this much maintenance investment at this rate of environmental change.”
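One hedged way to make "minimum energy input" concrete is a toy balance model. It is an assumption of this sketch, not a result from the studies cited in this article. Let C be system complexity, r the rate at which environmental change adds disorder, and m the fraction of accumulated complexity that maintenance removes per unit time.

```latex
\frac{dC}{dt} = r - mC
\qquad\Longrightarrow\qquad
C^{*} = \frac{r}{m} \quad \text{(steady state, where } dC/dt = 0\text{)}
\qquad\Longrightarrow\qquad
m_{\min} = \frac{r}{C_{\text{target}}}
```

In this model, holding complexity at a chosen target requires maintenance effort proportional to the rate of environmental change: if r doubles, m must double to stay at the same C. Relaxing the target lowers the required effort, which is the trade-off the refrigerator framing asks you to price.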

| Entropy Source | Thermodynamic Analogue | Entropy Rate | Primary Countermeasure |
|---|---|---|---|
| Feature accretion without refactoring | Molecules added to a container | High | Scheduled refactoring, boy scout rule |
| Team rotation / knowledge silos | Thermal diffusion across interfaces | High | Documentation, pair programming, ADRs |
| Environmental drift (API/OS/lib changes) | Boundary condition shift in physical system | Medium | Dependency hygiene, abstraction layers |
| Coupling growth (N² problem) | Molecular interaction explosion | High | Domain-driven design, module boundaries |
| Rushed patches / workarounds | Irreversible adiabatic process | Medium | Paying tech debt immediately after hotfixes |
| Undocumented design decisions | Information loss in compression | Low (but compounding) | Architecture Decision Records (ADRs) |

7. What We’ve Learned

Software entropy begins as a metaphor, but it describes a real structural tendency grounded in both empirical observation and information theory. Lehman’s Second Law of software evolution, that complexity increases unless active countervailing work is done, is the closest empirical counterpart of the second law in code. Shannon-derived source code change entropy provides a quantitative metric: the more scattered changes are across a codebase, the higher the informational disorder, and the harder the system becomes to maintain.

The mechanisms are structural, not disciplinary. Coupling grows quadratically. Teams are cognitively heterogeneous. The environment drifts continuously away from the assumptions baked into the original design. Every one of these forces pushes toward higher entropy. The probability gradient overwhelmingly favours disorder over order, just as it does in thermodynamics — not because order is forbidden, but because there are vastly more disordered states than ordered ones.

“Perfect” architecture is not a destination — it is a thermodynamically unstable state that requires continuous energy input to maintain. That energy input is maintenance, refactoring, documentation, and architectural governance. The question is never “have we reached a good architecture?” It is always “are we spending the right amount of energy to keep entropy at an acceptable level, given the rate at which our environment is changing?” Architecture is a verb, not a noun. The second law doesn’t care about your sprint velocity. But your engineering organisation has to.

Eleftheria Drosopoulou

Eleftheria is an experienced business analyst with a robust background in the computer software industry. Proficient in computer software training, digital marketing, HTML scripting, and Microsoft Office, she brings a wealth of technical skills to the table. She also loves writing articles on various tech subjects, showcasing a talent for translating complex concepts into accessible content.