Beyond Terraform: How We Scaled DevOps
This is the first of two parts. Read Part 2: “How Simplifying Our Architecture Saved Us Thousands Monthly”
Being a CTO in a burgeoning tech company is akin to juggling burning torches. The onus of being agile, fast and secure can often overwhelm compact teams, especially because of the intricacies of cloud development. In 2023, my company Drop Bio Health, which offers at-home health tracking through blood biomarkers and lifestyle analysis, decided to simplify cloud development so we could focus on delivering core product value.
I had been reconsidering our approach to DevOps to improve team efficiency, increase our deployment frequency and further fortify security protocols. Here’s the story of the lessons we learned on this journey to overcome DevOps challenges and set ourselves up for scalable growth.
Terraform Experiment: A Detour That Cost Time and Focus
As many teams do, we first ventured down the road with Terraform, in conjunction with AWS Cloud Development Kit (CDK), hoping it would be the silver bullet for our cloud infrastructure woes. This journey, however, introduced a slew of unforeseen DevOps complexities that not only consumed an unsustainable amount of our team’s time, but also diverted our attention from core product development.
Terraform’s declarative nature means that everything, including resources and their configurations, must be explicitly defined. As our architecture grew, so did the lines of code — running into the thousands. This expansive codebase became increasingly challenging to manage.
Moreover, maintaining a dedicated project solely for infrastructure meant we found ourselves juggling between two major projects: one focused on our application and another purely on infrastructure. This bifurcation created unnecessary silos within our team, and posed challenges in ensuring our application and infrastructure changes remained synchronized.
Testing Terraform presented its own set of challenges. Given its configuration-oriented nature and lack of a straightforward testing framework, testing was more akin to pattern matching than conventional logic verification. As a result, debugging was less about problem-solving and more an exercise in detective work.
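To make the "pattern matching" point concrete, here is a minimal sketch in Python. It assumes a plan exported in Terraform's JSON plan format (the `resource_changes` section produced by `terraform show -json`); the sample plan is hand-written, and the resource names and the helper `unencrypted_databases` are hypothetical, chosen only for illustration.

```python
import json

# Hand-written sample shaped like the "resource_changes" section of a
# Terraform JSON plan. All resource names and values are hypothetical.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_db_instance.primary", "type": "aws_db_instance",
     "change": {"actions": ["create"], "after": {"storage_encrypted": true}}},
    {"address": "aws_db_instance.analytics", "type": "aws_db_instance",
     "change": {"actions": ["create"], "after": {"storage_encrypted": false}}},
    {"address": "aws_s3_bucket.reports", "type": "aws_s3_bucket",
     "change": {"actions": ["create"], "after": {"bucket": "reports"}}}
  ]
}
""")


def unencrypted_databases(plan):
    """Return addresses of planned database instances without storage encryption."""
    return [
        change["address"]
        for change in plan.get("resource_changes", [])
        if change["type"] == "aws_db_instance"
        # "after" can be null for deletions, so fall back to an empty dict.
        and not (change["change"].get("after") or {}).get("storage_encrypted", False)
    ]


print(unencrypted_databases(SAMPLE_PLAN))
```

A check like this only confirms that the configuration has the expected shape. It says nothing about whether the deployed system behaves correctly, which is why this style of testing felt closer to detective work than to verifying logic.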
As our Terraform codebase expanded, it was clear that segmentation was necessary for manageability. However, this seemingly prudent decision to split configurations across multiple files and directories made tracing dependencies and gaining a holistic understanding even more taxing. In retrospect, our journey into the world of Terraform, while initiated with optimism, gradually became a quagmire of complexities. We needed a change so we could spend more of our invaluable time and resources on our business goals instead of manual DevOps tasks.
Enter: Infrastructure Automation
While we were working hard to offset these complexities, the market was evolving at breakneck speed, and any delays might have meant missed opportunities. In this environment, the weight of managing both development and operations was beginning to strain our team.
I started to look at new approaches to infrastructure, searching for automation to solve the challenges we were facing with Infrastructure as Code tools like CDK and Terraform. The latest innovations with Infrastructure from Code were an ideal fit to help our small team move quickly and efficiently. We used Nitric (check out the open source project on GitHub), which automatically provisions the required infrastructure based on our code, and provides opinionated best practices to help us be productive and confident with cloud deployment.
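The general idea behind Infrastructure from Code can be sketched with a toy model in Python. This is not Nitric's API: the `Registry` class, the `declare` method, `derive_plan` and every resource and service name below are hypothetical, and the "plan" is just a list of strings. The sketch only shows how resource declarations made in application code can be collected and turned into a provisioning plan, including least-privilege permissions, without a separate infrastructure project.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Collects the resources that application code says it needs."""
    resources: dict = field(default_factory=dict)  # resource name -> kind
    grants: list = field(default_factory=list)     # (service, name, permission)

    def declare(self, kind, name, service, permissions):
        # Record the resource and only the permissions the service asked for.
        self.resources[name] = kind
        self.grants.extend((service, name, p) for p in permissions)
        return name


def derive_plan(registry):
    """Turn the collected declarations into an ordered list of provisioning steps."""
    plan = [f"create {kind} '{name}'"
            for name, kind in sorted(registry.resources.items())]
    plan += [f"allow {service} to {permission} '{name}'"
             for service, name, permission in registry.grants]
    return plan


# Application code declares what it uses, next to where it uses it.
registry = Registry()
registry.declare("bucket", "lab-results", service="reports-api",
                 permissions=["read", "write"])
registry.declare("queue", "notifications", service="reports-api",
                 permissions=["send"])

plan = derive_plan(registry)
for step in plan:
    print(step)
```

In this toy version, the application code is the single source of truth, so the infrastructure definition cannot drift away from what the application actually uses. A real framework does far more (packaging, provider-specific deployment, networking), but the direction of the dependency is the same.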
Our resulting infrastructure and deployment processes are significantly less complex, leaving our team free to focus on developing our core business features and deploying on demand. The automation Nitric provides means we can scale DevOps while bypassing common challenges with Terraform.
As part of the implementation of Nitric, we also revisited our application architecture. This led to a number of improvements in scalability, security and cost (a 60% reduction in AWS hosting costs). In the second part, I’ll share the architecture lessons we learned and the outcomes we achieved.
You can also read more about our success with this project in this article.