How to Build a Case Converter Tool Using HTML, CSS, and JavaScript

Bansidhar Kadiya — Fri, 12 Jun 2026 07:51:10 +0000

If you're looking to level up your front-end development skills by building a practical web utility, this is the guide for you.

We'll code a fully functional Case Converter Tool from scratch using only HTML, CSS, and vanilla JavaScript.

This lightweight application allows users to paste their content and immediately transform it into standard formats like UPPERCASE, lowercase, Title Case, and Sentence case.
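
To make these target formats concrete, here is a minimal sketch of what each one produces for a single sample string. It reuses the string methods and regular expressions that appear in Step 4 of this tutorial. The `formatters` object and the `sample` text are illustrative names chosen for this sketch, not part of the tool itself.

```javascript
// Illustrative only: `formatters` and `sample` are hypothetical names for this sketch.
const minorWords = ['a', 'an', 'the', 'and', 'but', 'or', 'for', 'nor', 'on', 'at', 'to', 'from', 'by'];

const formatters = {
  upper: (text) => text.toUpperCase(),
  lower: (text) => text.toLowerCase(),
  // Capitalize every word except minor words that are not the first word.
  title: (text) =>
    text
      .toLowerCase()
      .split(' ')
      .map((word, index) =>
        index !== 0 && minorWords.includes(word)
          ? word
          : word.charAt(0).toUpperCase() + word.slice(1)
      )
      .join(' '),
  // Capitalize the first letter of the text and the first letter after ., ! or ?.
  sentence: (text) =>
    text.toLowerCase().replace(/(^\s*\w|[.!?]\n*\s*\w)/g, (c) => c.toUpperCase()),
};

const sample = 'hello WORLD. this is a test.';

console.log(formatters.upper(sample));    // HELLO WORLD. THIS IS A TEST.
console.log(formatters.lower(sample));    // hello world. this is a test.
console.log(formatters.title(sample));    // Hello World. This Is a Test.
console.log(formatters.sentence(sample)); // Hello world. This is a test.
```

Note that Title Case keeps the minor word "a" in lowercase, while Sentence case only capitalizes the first letter of each sentence.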

Alongside the text formatting, we'll integrate a live character counter and set up functionality to export the final text as a PDF or Word document.

Grab your favorite code editor, and let's dive in.

Prerequisites

Before you begin, you should have a basic familiarity with the following tools and concepts:

Core Web Technologies: A fundamental understanding of HTML structure, basic CSS styling, and JavaScript concepts like functions, array methods, and string manipulation.
Development Environment: A code editor installed on your computer (for example, Visual Studio Code) and a modern web browser to test your application locally.

Table of Contents

Step 1: Set Up Your Project
Step 2: Build the HTML Structure
Step 3: Style the Tool with CSS
Step 4: Add JavaScript Functionality
Step 5: Test Your Tool
Conclusion

Step 1: Set Up Your Project

Before writing any code, you need to establish a clean directory structure for your application files.

First, create a workspace. In your file manager, create a new directory named case-converter-app to keep your work organized.

Then, inside that directory, create the following three empty files:

index.html
styles.css
script.js

Step 2: Build the HTML Structure

Open the index.html file in your code editor. You'll add the structural foundation of the tool here.

Add the following code into your index.html file:




    
    
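[index.html code listing not recoverable from this copy. Surviving text content: the page title "Case Converter Tool"; the tip badge "💡 Tip: Use Download buttons to save results"; and four statistics boxes, each starting at 0, labeled Characters, Words, Paragraphs, and Sentences.]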
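
The markup matters to the script mainly through a handful of hooks: the JavaScript in Step 4 looks up the textarea and the four counters by ID. As a rough sketch, the check below lists any of those IDs that a page is missing. The IDs are taken from the tutorial's script; the helper name `findMissingIds` and the stand-in `fakePage` lookup used to exercise it are assumptions made for this example.

```javascript
// IDs that the tutorial's script.js reads with document.getElementById().
const requiredIds = ['inputText', 'charCount', 'wordCount', 'sentenceCount', 'paragraphCount'];

// Hypothetical helper: returns the required IDs for which `lookup` finds no element.
// In a browser, pass (id) => document.getElementById(id).
function findMissingIds(lookup, ids = requiredIds) {
  return ids.filter((id) => !lookup(id));
}

// Stand-in lookup so the sketch also runs outside a browser:
// it pretends the page only contains the textarea and the character counter.
const fakePage = { inputText: {}, charCount: {} };
const missing = findMissingIds((id) => fakePage[id]);

console.log(missing.join(', ')); // wordCount, sentenceCount, paragraphCount
```

In the browser console, calling `findMissingIds((id) => document.getElementById(id))` on the finished page should return an empty array.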

Understanding this HTML:

<ul> <li><p>The jsPDF <code>&lt;script&gt;</code> tag: This links to an external library that allows JavaScript to generate PDF files directly in the user's browser.</p> </li>
<li><p><code>&lt;textarea id="inputText"&gt;</code>: This creates the main text box where users will paste their content.</p> </li> <li><p><code>&lt;div class="stats-panel"&gt;</code>: This section contains <code>span</code> elements with unique IDs. You'll target these IDs with JavaScript to update the text statistics in real time.</p> </li> </ul> <h2 id="heading-step-3-style-the-tool-with-css">Step 3: Style the Tool with CSS</h2> <p>Next, you'll give the tool a clean, professional design. Open your <code>styles.css</code> file and add the following code:</p> <pre><code class="language-css">* { margin: 0; padding: 0; box-sizing: border-box; font-family: 'Inter', sans-serif; } body { background: linear-gradient(135deg, #e0eafc 0%, #cfdef3 100%); min-height: 100vh; display: flex; justify-content: center; align-items: center; padding: 2rem; color: #1e293b; } .app-container { background: #ffffff; width: 100%; max-width: 900px; border-radius: 24px; box-shadow: 0 20px 40px rgba(0,0,0,0.08); padding: 2.5rem; } .textarea-header { display: flex; justify-content: flex-end; margin-bottom: 0.5rem; } .tip-badge { background: #fef08a; color: #854d0e; padding: 0.35rem 0.85rem; border-radius: 20px; font-size: 0.75rem; font-weight: 600; } textarea { width: 100%; height: 220px; padding: 1.5rem; border: 2px solid #e2e8f0; border-radius: 16px; font-size: 1rem; resize: vertical; outline: none; transition: all 0.3s ease; background: #f8fafc; } textarea:focus { border-color: #007bff; background: #fff; box-shadow: 0 0 0 4px rgba(0, 123, 255, 0.1); } .button-grid { display: flex; flex-wrap: wrap; gap: 0.75rem; margin-top: 1.5rem; } button { padding: 0.75rem 1.25rem; border: none; border-radius: 12px; font-size: 0.875rem; font-weight: 600; cursor: pointer; transition: all 0.2s ease; } .case-btn { background: #f1f5f9; color: #475569; border: 1px solid #e2e8f0; } .case-btn:hover { background: #e2e8f0; } /* The active class highlights the selected button */ .case-btn.active { background: #007bff; color: #fff; border-color: #007bff; 
box-shadow: 0 4px 12px rgba(0, 123, 255, 0.25); } .divider { height: 1px; background: #e2e8f0; margin: 1.5rem 0; } .action-btn { background: #fff; border: 1px solid #cbd5e1; } .action-btn:hover { background: #f8fafc; border-color: #94a3b8; } .primary-action { background: #007bff; color: #fff; border-color: #007bff; } .primary-action:hover { background: #0056b3; border-color: #0056b3; } .danger-action { color: #ef4444; border-color: #fca5a5; background: #fef2f2; } .danger-action:hover { background: #fee2e2; border-color: #f87171; } .stats-panel { display: grid; grid-template-columns: repeat(auto-fit, minmax(130px, 1fr)); gap: 1rem; margin-top: 2rem; background: #f8fafc; padding: 1.5rem; border-radius: 16px; border: 1px solid #e2e8f0; } .stat-box { display: flex; flex-direction: column; align-items: center; } .stat-value { font-size: 1.75rem; font-weight: 700; } .stat-label { font-size: 0.75rem; color: #64748b; text-transform: uppercase; } </code></pre> <p>Understanding this CSS:</p> <ul> <li><p><code>body</code>: You use Flexbox to center the tool perfectly on the screen and apply a soft gradient background.</p> </li> <li><p><code>.app-container</code>: This creates a white, rounded card with a soft shadow to hold the user interface.</p> </li> <li><p><code>.case-btn.active</code>: You define an active state here. You'll use JavaScript to apply this class to the specific button the user clicks.</p> </li> </ul> <p>At this stage, we've completely structured and styled the user interface. The tool will look like this:</p> <p>Right now, the front-end is visible, but the buttons are entirely static. To make the transformations actually work, we have to write the logic in JavaScript.</p> <h2 id="heading-step-4-add-javascript-functionality">Step 4: Add JavaScript Functionality</h2> <p>Now you need to make the tool interactive. 
Open the <code>script.js</code> file and add this code:</p> <pre><code class="language-javascript">const textArea = document.getElementById('inputText'); // Listen for typing to update statistics in real-time textArea.addEventListener('input', updateStats); function updateStats() { const text = textArea.value; document.getElementById('charCount').textContent = text.length; const words = text.trim().split(/\s+/).filter(word => word.length > 0); document.getElementById('wordCount').textContent = words.length; const sentences = text.split(/[.!?]+/).filter(sentence => sentence.trim().length > 0); document.getElementById('sentenceCount').textContent = sentences.length; const paragraphs = text.split(/\n+/).filter(paragraph => paragraph.trim().length > 0); document.getElementById('paragraphCount').textContent = paragraphs.length; } function convertCase(event, type) { let text = textArea.value; if (!text) return; // Highlight the active button const buttons = document.querySelectorAll('.case-btn'); buttons.forEach(btn => btn.classList.remove('active')); if (event) { event.target.classList.add('active'); } // Process the text switch (type) { case 'upper': text = text.toUpperCase(); break; case 'lower': text = text.toLowerCase(); break; case 'capitalized': text = text.toLowerCase().replace(/\b\w/g, c => c.toUpperCase()); break; case 'title': const minorWords = ['a', 'an', 'the', 'and', 'but', 'or', 'for', 'nor', 'on', 'at', 'to', 'from', 'by']; text = text.toLowerCase().split(' ').map((word, index) => { if (index !== 0 && minorWords.includes(word)) return word; return word.charAt(0).toUpperCase() + word.slice(1); }).join(' '); break; case 'sentence': text = text.toLowerCase().replace(/(^\s*\w|[\.\!\?]\n*\s*\w)/g, c => c.toUpperCase()); break; case 'inverse': text = text.split('').map(c => c === c.toUpperCase() ? c.toLowerCase() : c.toUpperCase()).join(''); break; case 'alternate': text = text.toLowerCase().split('').map((c, i) => i % 2 === 0 ? 
c : c.toUpperCase()).join(''); break; } textArea.value = text; updateStats(); } function copyToClipboard() { if (!textArea.value) return; textArea.select(); document.execCommand('copy'); const copyBtn = document.querySelector('.copy-btn'); copyBtn.textContent = 'Copied!'; setTimeout(() => copyBtn.textContent = 'Copy To Clipboard', 1500); } function clearText() { textArea.value = ''; updateStats(); document.querySelectorAll('.case-btn').forEach(btn => btn.classList.remove('active')); } function downloadWord() { if (!textArea.value) return; const blob = new Blob([textArea.value], { type: 'application/msword' }); const url = URL.createObjectURL(blob); const a = document.createElement('a'); a.href = url; a.download = 'converted_text.doc'; a.click(); URL.revokeObjectURL(url); } function downloadPDF() { if (!textArea.value) return; const { jsPDF } = window.jspdf; const doc = new jsPDF(); const splitText = doc.splitTextToSize(textArea.value, 180); doc.text(splitText, 15, 15); doc.save('converted_text.pdf'); } </code></pre> <p>Understanding this JavaScript:</p> <ul> <li><p><code>addEventListener('input', ...)</code>: The <code>input</code> event fires whenever the textarea's content changes, whether by typing, pasting, or deleting. Each time, <code>updateStats</code> recalculates the character, word, sentence, and paragraph counts.</p> </li> <li><p><code>convertCase(event, type)</code>: This function takes the selected style (like <code>upper</code> or <code>sentence</code>) and applies Regular Expressions (Regex) or array mapping to format the string. It also dynamically adds the <code>.active</code> CSS class to the specific button you clicked.</p> </li> <li><p><code>document.execCommand('copy')</code>: This browser command copies the selected text to the user's clipboard. It's deprecated in favor of the asynchronous Clipboard API (<code>navigator.clipboard.writeText()</code>), but most browsers still support it.</p> </li> <li><p><code>new Blob()</code>: You use a Blob (Binary Large Object) to construct a file out of the text on the fly. 
This allows users to download a <code>.doc</code> file without needing a backend server. Note that the file holds plain text saved with Word's MIME type and a <code>.doc</code> extension, not a formatted Word document.</p> </li> </ul> <h2 id="heading-step-5-test-your-tool">Step 5: Test Your Tool</h2> <p>You're now ready to evaluate your code in a real browser environment.</p> <ol> <li><p>Open the <code>case-converter-app</code> folder on your computer.</p> </li> <li><p>Double-click the <code>index.html</code> file to launch the application.</p> </li> <li><p>Paste a long paragraph into the text area to verify that the live statistics update accurately.</p> </li> <li><p>Switch between the formatting options to observe the immediate DOM manipulation, and test the export buttons to ensure files are downloading correctly.</p> </li> </ol> <h2 id="heading-conclusion">Conclusion</h2> <p>In this tutorial, you successfully engineered a browser-based Case Converter Tool using vanilla JavaScript.</p> <p>You learned how to handle continuous user inputs, manipulate string data using Regular Expressions, and trigger local file downloads directly from the front end.</p> <p>Most importantly, you learned that modern web browsers are highly capable of handling complex document modifications locally, removing the strict need for external backend servers. Because the text is processed in the browser and never sent to a server, conversions are fast and the user's content stays on their device.</p> <p>For a live demonstration of these concepts in a production environment, feel free to test out this <a href="https://99tools.net/case-converter/">Case Converter</a> and experience how seamlessly these text transformations operate.</p> </article> <article> <h1> From Flutter to Backend: How to Build Production-Grade REST APIs with Dart and Dart Frog </h1> <p>Oluwaseyi Fatunmole — Fri, 12 Jun 2026 00:39:23 +0000</p> <p>Dart backend frameworks exist on a spectrum. At the <a href="https://www.freecodecamp.org/news/how-to-build-and-ship-production-rest-apis-with-dart-and-shelf/">minimal end sits Shelf,</a> with raw primitives and full control. You wire everything yourself. 
<a href="https://www.freecodecamp.org/news/how-to-build-production-grade-rest-apis-with-dart-and-serverpod/">At the maximal end sits Serverpod</a>. It's a full framework with code generation and opinionated conventions. The framework makes most structural decisions for you.</p> <p>Dart Frog lives in the middle, and for many Flutter engineers, it's the most natural fit.</p> <p>Dart Frog is a fast, minimalistic backend framework built on top of Shelf, originally created by Very Good Ventures and now maintained independently. It takes the file-based routing model popularized by Next.js and Remix, applies it to Dart, and wraps it with a clean CLI that handles the development server, hot reload, production builds, and Docker generation, all out of the box.</p> <p>You write a Dart file in the routes/ directory, export an onRequest function, and Dart Frog handles the routing automatically. No router configuration, no handler registration, no mounting. The file system is the router.</p> <p>In this article, we'll build a User and Profile Management REST API (the same one we built in the linked articles above) using Dart Frog, connect it to PostgreSQL, add JWT authentication, and deploy it to Fly.io.</p> <p>By the end, you'll understand Dart Frog's routing model deeply, and you'll have a clear picture of where it fits compared to Shelf and Serverpod.</p> <h2 id="heading-table-of-contents">Table of Contents</h2> <ul> <li><p><a href="#heading-prerequisites">Prerequisites</a></p> </li> <li><p><a href="#heading-how-dart-frog-differs-from-shelf-and-serverpod">How Dart Frog Differs from Shelf and Serverpod</a></p> </li> <li><p><a href="#heading-installing-dart-frog">Installing Dart Frog</a></p> </li> <li><p><a href="#heading-creating-the-project">Creating the Project</a></p> </li> <li><p><a href="#heading-understanding-the-project-structure">Understanding the Project Structure</a></p> </li> <li><p><a href="#heading-dart-frog-core-concepts">Dart Frog Core Concepts</a></p> <ul> <li><p><a 
href="#heading-file-based-routing">File-Based Routing</a></p> </li> <li><p><a href="#heading-the-requestcontext">The RequestContext</a></p> </li> <li><p><a href="#heading-middleware-and-dependency-injection">Middleware and Dependency Injection</a></p> </li> <li><p><a href="#heading-dynamic-routes">Dynamic Routes</a></p> </li> </ul> </li> <li><p><a href="#heading-setting-up-the-database">Setting Up the Database</a></p> <ul> <li><p><a href="#heading-docker-compose-for-postgresql">Docker Compose for PostgreSQL</a></p> </li> <li><p><a href="#heading-environment-configuration">Environment Configuration</a></p> </li> <li><p><a href="#heading-database-connection-manager">Database Connection Manager</a></p> </li> <li><p><a href="#heading-migrations">Migrations</a></p> </li> </ul> </li> <li><p><a href="#heading-defining-the-models">Defining the Models</a></p> </li> <li><p><a href="#heading-building-the-repositories">Building the Repositories</a></p> <ul> <li><p><a href="#heading-user-repository">User Repository</a></p> </li> <li><p><a href="#heading-profile-repository">Profile Repository</a></p> </li> </ul> </li> <li><p><a href="#heading-authentication-service">Authentication Service</a></p> </li> <li><p><a href="#heading-middleware">Middleware</a></p> <ul> <li><p><a href="#heading-database-middleware">Database Middleware</a></p> </li> <li><p><a href="#heading-auth-middleware">Auth Middleware</a></p> </li> <li><p><a href="#heading-error-middleware">Error Middleware</a></p> </li> </ul> </li> <li><p><a href="#heading-building-the-routes">Building the Routes</a></p> <ul> <li><p><a href="#heading-auth-routes">Auth Routes</a></p> </li> <li><p><a href="#heading-user-routes">User Routes</a></p> </li> <li><p><a href="#heading-profile-routes">Profile Routes</a></p> </li> </ul> </li> <li><p><a href="#heading-wiring-the-middleware-pipeline">Wiring the Middleware Pipeline</a></p> </li> <li><p><a href="#heading-testing-the-api">Testing the API</a></p> </li> <li><p><a 
href="#heading-deployment">Deployment</a></p> <ul> <li><p><a href="#heading-production-build">Production Build</a></p> </li> <li><p><a href="#heading-deploying-to-flyio">Deploying to Fly.io</a></p> </li> </ul> </li> <li><p><a href="#heading-conclusion">Conclusion</a></p> </li> </ul> <h2 id="heading-prerequisites">Prerequisites</h2> <p>Before starting, you should have:</p> <ul> <li><p>Comfortable familiarity with Dart and Flutter development</p> </li> <li><p>Understanding of REST API concepts, endpoints, HTTP methods, status codes</p> </li> <li><p>Docker Desktop installed and running</p> </li> <li><p>A Fly.io account for deployment</p> </li> </ul> <h2 id="heading-how-dart-frog-differs-from-shelf-and-serverpod">How Dart Frog Differs from Shelf and Serverpod</h2> <p>Understanding where Dart Frog sits in relation to the other two frameworks helps you make the right choice for each project.</p> <p>Shelf gives you a Router and you mount handlers manually. Your folder structure has nothing to do with your URL structure. You decide what goes where.</p> <p>Serverpod generates your routes from endpoint class names and method names. You define a class, run a generator, and the URL is derived automatically.</p> <p>Dart Frog maps your file system directly to your URL structure. A file at routes/users/index.dart becomes the /users endpoint. A file at routes/users/[id].dart becomes /users/:id. No configuration, no registration, no generation step. The file is the route.</p> <p>This model will feel immediately intuitive to Flutter engineers who have worked with Next.js or any modern web framework. It's also significantly easier to navigate in a team. You look at the folder structure and you instantly know what endpoints exist.</p> <p>The other key difference is the RequestContext. Where Shelf passes a raw Request to handlers, Dart Frog wraps it in a RequestContext that carries both the request and any values injected by middleware. 
This is Dart Frog's dependency injection mechanism, and it's elegant.</p> <h2 id="heading-installing-dart-frog">Installing Dart Frog</h2> <p>Install the Dart Frog CLI:</p> <pre><code class="language-bash">dart pub global activate dart_frog_cli </code></pre> <p>Verify the installation:</p> <pre><code class="language-bash">dart_frog --version </code></pre> <h2 id="heading-creating-the-project">Creating the Project</h2> <pre><code class="language-bash">dart_frog create user_profile_api cd user_profile_api </code></pre> <p>Start the development server with hot reload:</p> <pre><code class="language-bash">dart_frog dev </code></pre> <p>Visit <a href="http://localhost:8080">http://localhost:8080</a> and you'll see the default welcome response. The dev server watches for file changes and reloads automatically. No restart needed as you build.</p> <h2 id="heading-understanding-the-project-structure">Understanding the Project Structure</h2> <pre><code class="language-plaintext">user_profile_api/ routes/ index.dart ← GET / pubspec.yaml analysis_options.yaml </code></pre> <p>That's the entire starting structure. Clean and minimal. Everything we add will extend from here.</p> <p>After building our API, the full structure will look like this:</p> <pre><code class="language-plaintext">user_profile_api/ routes/ _middleware.dart ← global middleware pipeline index.dart ← GET / auth/ login.dart ← POST /auth/login register.dart ← POST /auth/register users/ index.dart ← GET /users [id].dart ← GET, PUT, DELETE /users/:id [id]/ profile.dart ← GET, POST, PUT /users/:id/profile lib/ config/ database.dart env.dart models/ user.dart profile.dart repositories/ user_repository.dart profile_repository.dart services/ auth_service.dart middleware/ auth_middleware.dart error_middleware.dart pubspec.yaml </code></pre> <p>The routes/ folder is the heart of a Dart Frog project. The lib/ folder holds all shared logic that routes import. 
This separation is clean and deliberate: routing concerns live in routes/, while business logic lives in lib/.</p> <h2 id="heading-dart-frog-core-concepts">Dart Frog Core Concepts</h2> <h3 id="heading-file-based-routing">File-Based Routing</h3> <p>Every .dart file in the routes/ directory is a route. The file path determines the URL path:</p> <table> <thead> <tr> <th>File</th> <th>URL</th> </tr> </thead> <tbody><tr> <td>routes/index.dart</td> <td>/</td> </tr> <tr> <td>routes/users/index.dart</td> <td>/users</td> </tr> <tr> <td>routes/users/[id].dart</td> <td>/users/:id</td> </tr> <tr> <td>routes/auth/login.dart</td> <td>/auth/login</td> </tr> <tr> <td>routes/users/[id]/profile.dart</td> <td>/users/:id/profile</td> </tr> </tbody></table> <p>Every route file must export an onRequest function:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; Future<Response> onRequest(RequestContext context) async { return Response.json(body: {'message': 'Hello from Dart Frog'}); } </code></pre> <p>That's the entire contract. One function, one file, one route. Dart Frog generates the internal routing glue automatically when you run dart_frog dev or dart_frog build.</p> <h3 id="heading-the-requestcontext">The RequestContext</h3> <p>RequestContext is the object passed to every route handler and middleware. It's more than just the HTTP request: it's a container for the request and any values that middleware has injected:</p> <pre><code class="language-dart">Future<Response> onRequest(RequestContext context) async { // The raw HTTP request final request = context.request; // HTTP method print(request.method); // GET, POST, etc. 
// Path parameters (for dynamic routes like [id].dart) final id = context.request.uri.pathSegments.last; // Query parameters final page = request.uri.queryParameters['page']; // Request body final body = await request.json() as Map<String, dynamic>; // Values injected by middleware final db = context.read<DatabaseConnection>(); final currentUser = context.read<AuthenticatedUser>(); return Response.json(body: {'ok': true}); } </code></pre> <p>context.read() is the dependency injection mechanism. Middleware provides values, and routes consume them. This keeps routes clean and testable: a route handler doesn't know how a database connection was created, it just reads it from context.</p> <h3 id="heading-middleware-and-dependency-injection">Middleware and Dependency Injection</h3> <p>A <code>_middleware.dart</code> file in any route folder applies middleware to all routes in that folder and its subfolders. A <code>_middleware.dart</code> at the root routes/ level applies globally.</p> <p>Middleware in Dart Frog uses the provider pattern to inject values into the context:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; Handler middleware(Handler handler) { return handler.use( provider<DatabaseConnection>( (context) => DatabaseConnection.instance, ), ); } </code></pre> <p>Any route in the same folder, or any subfolder, can then call context.read() to get the connection. No global singletons, no manual passing. 
The context carries it.</p> <p>Middleware functions can also intercept requests before they reach the route handler, making them perfect for authentication:</p> <pre><code class="language-dart">Handler middleware(Handler handler) { return (context) async { final authHeader = context.request.headers['authorization']; if (authHeader == null) { return Response.json( statusCode: 401, body: {'error': 'Authorization required'}, ); } // Verify token and inject user final user = verifyToken(authHeader); return handler(context.provide<AuthenticatedUser>(() => user)); }; } </code></pre> <h3 id="heading-dynamic-routes">Dynamic Routes</h3> <p>A file named [id].dart matches any single path segment. Dart Frog hands the matched segment to the handler as a parameter:</p> <pre><code class="language-dart">Future<Response> onRequest(RequestContext context, String id) async { // id is automatically passed as a parameter for dynamic routes return Response.json(body: {'userId': id}); } </code></pre> <p>Dart Frog passes dynamic route parameters as additional arguments to onRequest. 
This is cleaner than parsing them manually from the URL.</p> <h2 id="heading-setting-up-the-database">Setting Up the Database</h2> <h3 id="heading-docker-compose-for-postgresql">Docker Compose for PostgreSQL</h3> <p>Create docker-compose.yml in the project root:</p> <pre><code class="language-yaml">version: '3.8' services: postgres: image: postgres:16-alpine container_name: user_profile_db environment: POSTGRES_DB: user_profile_api POSTGRES_USER: dart_user POSTGRES_PASSWORD: dart_password ports: - "5432:5432" volumes: - postgres_data:/var/lib/postgresql/data healthcheck: test: ["CMD-SHELL", "pg_isready -U dart_user -d user_profile_api"] interval: 5s timeout: 5s retries: 5 volumes: postgres_data: </code></pre> <p>Start the database:</p> <pre><code class="language-bash">docker compose up -d </code></pre> <h3 id="heading-environment-configuration">Environment Configuration</h3> <p>Add dependencies to pubspec.yaml:</p> <pre><code class="language-yaml">dependencies: dart_frog: ^1.4.0 dart_frog_auth: ^0.1.0 postgres: ^3.3.0 dart_jsonwebtoken: ^2.12.0 bcrypt: ^1.1.3 dotenv: ^4.1.0 dev_dependencies: dart_frog_cli: ^1.2.0 test: ^1.24.0 dart_frog_test: ^0.1.0 </code></pre> <p>Run dart pub get.</p> <p>Create .env:</p> <pre><code class="language-plaintext">DB_HOST=localhost DB_PORT=5432 DB_NAME=user_profile_api DB_USER=dart_user DB_PASSWORD=dart_password JWT_SECRET=your_super_secret_key_change_this_in_production JWT_EXPIRY_HOURS=24 PORT=8080 </code></pre> <p>Create lib/config/env.dart:</p> <pre><code class="language-dart">import 'package:dotenv/dotenv.dart'; class Env { static late final DotEnv _env; static void load() { _env = DotEnv(includePlatformEnvironment: true)..load(); } static String get dbHost => _env['DB_HOST'] ?? 'localhost'; static int get dbPort => int.parse(_env['DB_PORT'] ?? '5432'); static String get dbName => _env['DB_NAME'] ?? 'user_profile_api'; static String get dbUser => _env['DB_USER'] ?? 
'dart_user'; static String get dbPassword => _env['DB_PASSWORD'] ?? ''; static String get jwtSecret => _env['JWT_SECRET'] ?? ''; static int get jwtExpiryHours => int.parse(_env['JWT_EXPIRY_HOURS'] ?? '24'); } </code></pre> <h3 id="heading-database-connection-manager">Database Connection Manager</h3> <p>Create lib/config/database.dart:</p> <pre><code class="language-dart">import 'package:postgres/postgres.dart'; import 'env.dart'; class Database { static Connection? _connection; static Future<Connection> get connection async { if (_connection != null) return _connection!; _connection = await Connection.open( Endpoint( host: Env.dbHost, port: Env.dbPort, database: Env.dbName, username: Env.dbUser, password: Env.dbPassword, ), settings: const ConnectionSettings(sslMode: SslMode.disable), ); print('Database connected'); return _connection!; } static Future<void> runMigrations() async { final conn = await connection; await conn.execute(''' CREATE TABLE IF NOT EXISTS users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), email VARCHAR(255) UNIQUE NOT NULL, password_hash VARCHAR(255) NOT NULL, first_name VARCHAR(100) NOT NULL, last_name VARCHAR(100) NOT NULL, is_active BOOLEAN DEFAULT TRUE, created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() ); CREATE INDEX IF NOT EXISTS idx_users_email ON users(email); CREATE TABLE IF NOT EXISTS profiles ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, bio TEXT, avatar_url VARCHAR(500), phone VARCHAR(20), location VARCHAR(255), website VARCHAR(500), created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), UNIQUE(user_id) ); CREATE INDEX IF NOT EXISTS idx_profiles_user_id ON profiles(user_id); '''); print('Migrations applied'); } } </code></pre> <h3 id="heading-migrations">Migrations</h3> <p>Dart Frog projects have a main.dart entry point generated during dart_frog build. 
For the development server, migrations are best run from the project entrypoint. Create main.dart in the project root:</p> <pre><code class="language-dart">import 'dart:io'; import 'package:dart_frog/dart_frog.dart'; import 'lib/config/database.dart'; import 'lib/config/env.dart'; Future<HttpServer> run(Handler handler, InternetAddress ip, int port) async { Env.load(); await Database.runMigrations(); return serve(handler, ip, port); } </code></pre> <p>This run function is Dart Frog's server lifecycle hook. It runs before the server starts accepting requests, giving us the right place to load environment variables and run migrations.</p> <h2 id="heading-defining-the-models">Defining the Models</h2> <p>With the database layer in place, we need Dart classes to represent the data coming in and out of it.</p> <p>The User model maps to the users table and handles conversion between database rows and Dart objects. The Profile model does the same for the profiles table. Both models follow the same pattern: a factory constructor for reading from the database and a <code>toJson</code> method for sending data back to the client.</p> <p>Note that <code>toJson</code> on the User model deliberately excludes the password hash. 
You should never return credential data in an API response.</p> <p>Create lib/models/user.dart:</p> <pre><code class="language-dart">class User { const User({ required this.id, required this.email, required this.passwordHash, required this.firstName, required this.lastName, required this.isActive, required this.createdAt, required this.updatedAt, }); final String id; final String email; final String passwordHash; final String firstName; final String lastName; final bool isActive; final DateTime createdAt; final DateTime updatedAt; factory User.fromRow(Map<String, dynamic> row) => User( id: row['id'] as String, email: row['email'] as String, passwordHash: row['password_hash'] as String, firstName: row['first_name'] as String, lastName: row['last_name'] as String, isActive: row['is_active'] as bool, createdAt: row['created_at'] as DateTime, updatedAt: row['updated_at'] as DateTime, ); Map<String, dynamic> toJson() => { 'id': id, 'email': email, 'firstName': firstName, 'lastName': lastName, 'isActive': isActive, 'createdAt': createdAt.toIso8601String(), 'updatedAt': updatedAt.toIso8601String(), }; } </code></pre> <p>Create lib/models/profile.dart:</p> <pre><code class="language-dart">class Profile { const Profile({ required this.id, required this.userId, this.bio, this.avatarUrl, this.phone, this.location, this.website, required this.createdAt, required this.updatedAt, }); final String id; final String userId; final String? bio; final String? avatarUrl; final String? phone; final String? location; final String? 
website; final DateTime createdAt; final DateTime updatedAt; factory Profile.fromRow(Map<String, dynamic> row) => Profile( id: row['id'] as String, userId: row['user_id'] as String, bio: row['bio'] as String?, avatarUrl: row['avatar_url'] as String?, phone: row['phone'] as String?, location: row['location'] as String?, website: row['website'] as String?, createdAt: row['created_at'] as DateTime, updatedAt: row['updated_at'] as DateTime, ); Map<String, dynamic> toJson() => { 'id': id, 'userId': userId, 'bio': bio, 'avatarUrl': avatarUrl, 'phone': phone, 'location': location, 'website': website, 'createdAt': createdAt.toIso8601String(), 'updatedAt': updatedAt.toIso8601String(), }; } </code></pre> <h2 id="heading-building-the-repositories">Building the Repositories</h2> <p>Repositories are the single point of contact between the application and the database. Rather than writing SQL directly inside route handlers, we'll centralise all database operations here. This keeps the handlers clean and makes the data access logic easy to find, maintain, and test independently.</p> <p>The UserRepository handles every operation on the users table. 
The ProfileRepository does the same for profiles, using userId as its primary lookup key since profiles are always accessed in the context of a specific user.</p> <h3 id="heading-user-repository">User Repository</h3> <p>Create lib/repositories/user_repository.dart:</p> <pre><code class="language-dart">import 'package:postgres/postgres.dart'; import '../config/database.dart'; import '../models/user.dart'; class UserRepository { Future<Connection> get _conn => Database.connection; Future<List<User>> findAll() async { final conn = await _conn; final results = await conn.execute( 'SELECT * FROM users WHERE is_active = TRUE ORDER BY created_at DESC', ); return results.map((r) => User.fromRow(r.toColumnMap())).toList(); } Future<User?> findById(String id) async { final conn = await _conn; final results = await conn.execute( Sql.named('SELECT * FROM users WHERE id = @id AND is_active = TRUE'), parameters: {'id': id}, ); if (results.isEmpty) return null; return User.fromRow(results.first.toColumnMap()); } Future<User?> findByEmail(String email) async { final conn = await _conn; final results = await conn.execute( Sql.named('SELECT * FROM users WHERE email = @email'), parameters: {'email': email}, ); if (results.isEmpty) return null; return User.fromRow(results.first.toColumnMap()); } Future<User> create({ required String email, required String passwordHash, required String firstName, required String lastName, }) async { final conn = await _conn; final results = await conn.execute( Sql.named(''' INSERT INTO users (email, password_hash, first_name, last_name) VALUES (@email, @passwordHash, @firstName, @lastName) RETURNING * '''), parameters: { 'email': email, 'passwordHash': passwordHash, 'firstName': firstName, 'lastName': lastName, }, ); return User.fromRow(results.first.toColumnMap()); } Future<User?> update({ required String id, String? firstName, String? 
lastName, }) async { final conn = await _conn; final results = await conn.execute( Sql.named(''' UPDATE users SET first_name = COALESCE(@firstName, first_name), last_name = COALESCE(@lastName, last_name), updated_at = NOW() WHERE id = @id AND is_active = TRUE RETURNING * '''), parameters: {'id': id, 'firstName': firstName, 'lastName': lastName}, ); if (results.isEmpty) return null; return User.fromRow(results.first.toColumnMap()); } Future<bool> delete(String id) async { final conn = await _conn; final results = await conn.execute( Sql.named(''' UPDATE users SET is_active = FALSE, updated_at = NOW() WHERE id = @id AND is_active = TRUE RETURNING id '''), parameters: {'id': id}, ); return results.isNotEmpty; } } </code></pre> <h3 id="heading-profile-repository">Profile Repository</h3> <p>Create lib/repositories/profile_repository.dart:</p> <pre><code class="language-dart">import 'package:postgres/postgres.dart'; import '../config/database.dart'; import '../models/profile.dart'; class ProfileRepository { Future<Connection> get _conn => Database.connection; Future<Profile?> findByUserId(String userId) async { final conn = await _conn; final results = await conn.execute( Sql.named('SELECT * FROM profiles WHERE user_id = @userId'), parameters: {'userId': userId}, ); if (results.isEmpty) return null; return Profile.fromRow(results.first.toColumnMap()); } Future<Profile> create({ required String userId, String? bio, String? avatarUrl, String? phone, String? location, String? website, }) async { final conn = await _conn; final results = await conn.execute( Sql.named(''' INSERT INTO profiles (user_id, bio, avatar_url, phone, location, website) VALUES (@userId, @bio, @avatarUrl, @phone, @location, @website) RETURNING * '''), parameters: { 'userId': userId, 'bio': bio, 'avatarUrl': avatarUrl, 'phone': phone, 'location': location, 'website': website, }, ); return Profile.fromRow(results.first.toColumnMap()); } Future<Profile?> update({ required String userId, String? 
bio,
    String? avatarUrl,
    String? phone,
    String? location,
    String? website,
  }) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('''
        UPDATE profiles
        SET bio = COALESCE(@bio, bio),
            avatar_url = COALESCE(@avatarUrl, avatar_url),
            phone = COALESCE(@phone, phone),
            location = COALESCE(@location, location),
            website = COALESCE(@website, website),
            updated_at = NOW()
        WHERE user_id = @userId
        RETURNING *
      '''),
      parameters: {
        'userId': userId,
        'bio': bio,
        'avatarUrl': avatarUrl,
        'phone': phone,
        'location': location,
        'website': website,
      },
    );
    if (results.isEmpty) return null;
    return Profile.fromRow(results.first.toColumnMap());
  }
}
</code></pre>
<h2 id="heading-authentication-service">Authentication Service</h2>
<p>Authentication in this project is handled by a dedicated AuthService that lives in lib/services/. It has one clear responsibility, the cryptographic operations that power auth: hashing passwords before storing them, verifying passwords at login, generating signed JWTs on success, and verifying those tokens on protected requests.</p>
<p>Keeping this logic in a service rather than spreading it across route handlers means it can be injected via middleware and consumed cleanly anywhere in the app.</p>
<p>Create lib/services/auth_service.dart:</p>
<pre><code class="language-dart">import 'package:bcrypt/bcrypt.dart';
import 'package:dart_jsonwebtoken/dart_jsonwebtoken.dart';
import '../config/env.dart';
import '../models/user.dart';

class AuthService {
  // Hash with a fresh random salt; the salt is embedded in the resulting hash.
  String hashPassword(String password) =>
      BCrypt.hashpw(password, BCrypt.gensalt());

  bool verifyPassword(String password, String hash) =>
      BCrypt.checkpw(password, hash);

  String generateToken(User user) {
    final jwt = JWT({
      'sub': user.id,
      'email': user.email,
      'iat': DateTime.now().millisecondsSinceEpoch ~/ 1000,
    });
    return jwt.sign(
      SecretKey(Env.jwtSecret),
      expiresIn: Duration(hours: Env.jwtExpiryHours),
    );
  }

  // Returns null for any invalid, tampered, or expired token.
  JWT? 
verifyToken(String token) { try { return JWT.verify(token, SecretKey(Env.jwtSecret)); } catch (_) { return null; } } } </code></pre> <h2 id="heading-middleware">Middleware</h2> <p>Middleware is where Dart Frog's dependency injection model does its most important work. Rather than instantiating repositories and services inside each route handler, we create them once in middleware and make them available to every handler downstream via the RequestContext.</p> <p>This section defines three pieces of middleware: the database middleware that injects the repositories and auth service, the auth middleware that validates JWT tokens and protects routes, and the error middleware that catches unhandled exceptions and returns consistent error responses across the entire API.</p> <h3 id="heading-database-middleware">Database Middleware</h3> <p>Create lib/middleware/database_middleware.dart:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../repositories/user_repository.dart'; import '../repositories/profile_repository.dart'; import '../services/auth_service.dart'; Middleware databaseMiddleware() { return (handler) { return handler .use(provider<UserRepository>((_) => UserRepository())) .use(provider<ProfileRepository>((_) => ProfileRepository())) .use(provider<AuthService>((_) => AuthService())); }; } </code></pre> <p>This middleware injects the repositories and auth service into every request context. 
Routes read them with <code>context.read()</code> without caring how they were created.</p> <h3 id="heading-auth-middleware">Auth Middleware</h3> <p>Create lib/middleware/auth_middleware.dart:</p> <pre><code class="language-dart">import 'dart:convert'; import 'package:dart_frog/dart_frog.dart'; import '../services/auth_service.dart'; Middleware authMiddleware() { return (handler) { return (context) async { final authHeader = context.request.headers['authorization']; if (authHeader == null || !authHeader.startsWith('Bearer ')) { return Response.json( statusCode: 401, body: {'error': 'Authorization header missing or malformed'}, ); } final token = authHeader.substring(7); final authService = context.read<AuthService>(); final jwt = authService.verifyToken(token); if (jwt == null) { return Response.json( statusCode: 401, body: {'error': 'Invalid or expired token'}, ); } final userId = jwt.payload['sub'] as String; final userEmail = jwt.payload['email'] as String; return handler( context.provide<Map<String, String>>( () => {'userId': userId, 'userEmail': userEmail}, ), ); }; }; } </code></pre> <h3 id="heading-error-middleware">Error Middleware</h3> <p>Create lib/middleware/error_middleware.dart:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; Middleware errorMiddleware() { return (handler) { return (context) async { try { return await handler(context); } on FormatException catch (e) { return Response.json( statusCode: 400, body: {'error': 'Invalid request body: ${e.message}'}, ); } catch (e, stackTrace) { print('Unhandled error: $e\n$stackTrace'); return Response.json( statusCode: 500, body: {'error': 'An internal server error occurred'}, ); } }; }; } </code></pre> <h2 id="heading-building-the-routes">Building the Routes</h2> <p>With the models, repositories, auth service, and middleware all in place, we can now build the route handlers.</p> <p>In Dart Frog, each file in the routes/ folder is a self-contained endpoint. 
Routes don't manage dependencies directly. Instead, they read what middleware has already injected into the context and call the appropriate repository or service method.</p> <p>This section covers three groups of routes: the auth routes for registration and login, the user routes for CRUD operations, and the profile routes nested under a user's ID.</p> <h3 id="heading-auth-routes">Auth Routes</h3> <p>Create routes/auth/register.dart:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../../lib/repositories/user_repository.dart'; import '../../lib/services/auth_service.dart'; Future<Response> onRequest(RequestContext context) async { if (context.request.method != HttpMethod.post) { return Response.json(statusCode: 405, body: {'error': 'Method not allowed'}); } final body = await context.request.json() as Map<String, dynamic>; final email = body['email'] as String?; final password = body['password'] as String?; final firstName = body['firstName'] as String?; final lastName = body['lastName'] as String?; if (email == null || password == null || firstName == null || lastName == null) { return Response.json( statusCode: 400, body: {'error': 'email, password, firstName, and lastName are required'}, ); } if (password.length < 8) { return Response.json( statusCode: 400, body: {'error': 'Password must be at least 8 characters'}, ); } final userRepo = context.read<UserRepository>(); final authService = context.read<AuthService>(); final existing = await userRepo.findByEmail(email); if (existing != null) { return Response.json( statusCode: 409, body: {'error': 'An account with this email already exists'}, ); } final user = await userRepo.create( email: email, passwordHash: authService.hashPassword(password), firstName: firstName, lastName: lastName, ); return Response.json( statusCode: 201, body: { 'user': user.toJson(), 'token': authService.generateToken(user), }, ); } </code></pre> <p>Create routes/auth/login.dart:</p> <pre><code 
class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../../lib/repositories/user_repository.dart'; import '../../lib/services/auth_service.dart'; Future<Response> onRequest(RequestContext context) async { if (context.request.method != HttpMethod.post) { return Response.json(statusCode: 405, body: {'error': 'Method not allowed'}); } final body = await context.request.json() as Map<String, dynamic>; final email = body['email'] as String?; final password = body['password'] as String?; if (email == null || password == null) { return Response.json( statusCode: 400, body: {'error': 'email and password are required'}, ); } final userRepo = context.read<UserRepository>(); final authService = context.read<AuthService>(); final user = await userRepo.findByEmail(email); if (user == null || !authService.verifyPassword(password, user.passwordHash)) { return Response.json( statusCode: 401, body: {'error': 'Invalid email or password'}, ); } return Response.json( body: { 'user': user.toJson(), 'token': authService.generateToken(user), }, ); } </code></pre> <h3 id="heading-user-routes">User Routes</h3> <p>Create routes/users/index.dart:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../../lib/repositories/user_repository.dart'; Future<Response> onRequest(RequestContext context) async { if (context.request.method != HttpMethod.get) { return Response.json(statusCode: 405, body: {'error': 'Method not allowed'}); } final userRepo = context.read<UserRepository>(); final users = await userRepo.findAll(); return Response.json( body: users.map((u) => u.toJson()).toList(), ); } </code></pre> <p>Create routes/users/[id].dart:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../../lib/repositories/user_repository.dart'; Future<Response> onRequest(RequestContext context, String id) async { final userRepo = context.read<UserRepository>(); switch (context.request.method) { case HttpMethod.get: 
      return _getUser(userRepo, id);
    case HttpMethod.put:
      return _updateUser(context, userRepo, id);
    case HttpMethod.delete:
      return _deleteUser(userRepo, id);
    default:
      return Response.json(
        statusCode: 405,
        body: {'error': 'Method not allowed'},
      );
  }
}

Future<Response> _getUser(UserRepository repo, String id) async {
  final user = await repo.findById(id);
  if (user == null) {
    return Response.json(statusCode: 404, body: {'error': 'User not found'});
  }
  return Response.json(body: user.toJson());
}

Future<Response> _updateUser(
  RequestContext context,
  UserRepository repo,
  String id,
) async {
  final body = await context.request.json() as Map<String, dynamic>;
  final user = await repo.update(
    id: id,
    firstName: body['firstName'] as String?,
    lastName: body['lastName'] as String?,
  );
  if (user == null) {
    return Response.json(statusCode: 404, body: {'error': 'User not found'});
  }
  return Response.json(body: user.toJson());
}

Future<Response> _deleteUser(UserRepository repo, String id) async {
  final deleted = await repo.delete(id);
  if (!deleted) {
    return Response.json(statusCode: 404, body: {'error': 'User not found'});
  }
  // 204 No Content must not carry a body, so use the plain Response
  // constructor instead of Response.json (which would serialize "null").
  return Response(statusCode: 204);
}
</code></pre>
<p>Notice how onRequest receives String id as a second parameter: Dart Frog automatically passes the dynamic path segment to the handler. 
The switch on context.request.method handles all HTTP methods in a single file, which is the idiomatic Dart Frog pattern for CRUD endpoints.</p>
<h3 id="heading-profile-routes">Profile Routes</h3>
<p>Create routes/users/[id]/profile.dart:</p>
<pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart';
import '../../../lib/repositories/user_repository.dart';
import '../../../lib/repositories/profile_repository.dart';

Future<Response> onRequest(RequestContext context, String id) async {
  final userRepo = context.read<UserRepository>();
  final profileRepo = context.read<ProfileRepository>();

  // A profile only makes sense for an existing, active user.
  final user = await userRepo.findById(id);
  if (user == null) {
    return Response.json(statusCode: 404, body: {'error': 'User not found'});
  }

  switch (context.request.method) {
    case HttpMethod.get:
      return _getProfile(profileRepo, id);
    case HttpMethod.post:
      return _createProfile(context, profileRepo, id);
    case HttpMethod.put:
      return _updateProfile(context, profileRepo, id);
    default:
      return Response.json(
        statusCode: 405,
        body: {'error': 'Method not allowed'},
      );
  }
}

Future<Response> _getProfile(ProfileRepository repo, String userId) async {
  final profile = await repo.findByUserId(userId);
  if (profile == null) {
    return Response.json(statusCode: 404, body: {'error': 'Profile not found'});
  }
  return Response.json(body: profile.toJson());
}

Future<Response> _createProfile(
  RequestContext context,
  ProfileRepository repo,
  String userId,
) async {
  // Each user has at most one profile.
  final existing = await repo.findByUserId(userId);
  if (existing != null) {
    return Response.json(
      statusCode: 409,
      body: {'error': 'Profile already exists for this user'},
    );
  }
  final body = await context.request.json() as Map<String, dynamic>;
  final profile = await repo.create(
    userId: userId,
    bio: body['bio'] as String?,
    avatarUrl: body['avatarUrl'] as String?,
    phone: body['phone'] as String?,
    location: body['location'] as String?,
    website: body['website'] as String?,
  );
  return Response.json(statusCode: 201, body: profile.toJson());
}

Future<Response> _updateProfile( RequestContext context, ProfileRepository repo, String userId, ) async { final body = await context.request.json() as Map<String, dynamic>; final profile = await repo.update( userId: userId, bio: body['bio'] as String?, avatarUrl: body['avatarUrl'] as String?, phone: body['phone'] as String?, location: body['location'] as String?, website: body['website'] as String?, ); if (profile == null) { return Response.json(statusCode: 404, body: {'error': 'Profile not found'}); } return Response.json(body: profile.toJson()); } </code></pre> <h2 id="heading-wiring-the-middleware-pipeline">Wiring the Middleware Pipeline</h2> <p>The routes and middleware are all written, but they aren't connected yet. In Dart Frog, the connection happens through <code>_middleware.dart</code> files placed strategically in the routes/ folder.</p> <p>To review, a <code>_middleware.dart</code> file at the root level applies to every route in the project. A <code>_middleware.dart</code> inside a subfolder applies only to routes in that folder and below. This gives us precise, folder-scoped control over which middleware runs where without any manual registration or mounting.</p> <p>Create <code>routes/_middleware.dart</code> for global middleware applied to every route:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../lib/middleware/database_middleware.dart'; import '../lib/middleware/error_middleware.dart'; Handler middleware(Handler handler) { return handler .use(databaseMiddleware()) .use(errorMiddleware()); } </code></pre> <p>Create <code>routes/users/_middleware.dart</code> to protect all user routes with authentication:</p> <pre><code class="language-dart">import 'package:dart_frog/dart_frog.dart'; import '../../lib/middleware/auth_middleware.dart'; Handler middleware(Handler handler) { return handler.use(authMiddleware()); } </code></pre> <p>This is one of the most elegant parts of Dart Frog's model. 
The routes/users/_middleware.dart file automatically applies auth to every route under routes/users/, including routes/users/index.dart, routes/users/[id].dart, and routes/users/[id]/profile.dart. The auth routes under routes/auth/ are untouched because they live outside the users/ folder.</p>
<p>There's no manual middleware mounting, no array of protected routes, and no route group configuration. The folder structure does the work.</p>
<h2 id="heading-testing-the-api">Testing the API</h2>
<p>With the server running and all routes wired up, we can verify the full flow end to end. Start the development server and run through each endpoint in order: register a user first to get a token, then use that token on the protected routes. Replace {userId} in the commands below with the actual ID returned from the register response.</p>
<p>Start the development server:</p>
<pre><code class="language-bash">dart_frog dev
# Server is now running at: http://localhost:8080
</code></pre>
<p>Register a user:</p>
<pre><code class="language-bash">curl http://localhost:8080/auth/register \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "email": "seyi@example.com",
    "password": "securepassword",
    "firstName": "Seyi",
    "lastName": "Dev"
  }'
</code></pre>
<p>Response:</p>
<pre><code class="language-json">{
  "user": {
    "id": "uuid-here",
    "email": "seyi@example.com",
    "firstName": "Seyi",
    "lastName": "Dev",
    "isActive": true,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  },
  "token": "eyJhbGci..."
}
</code></pre>
<p>Login:</p>
<pre><code class="language-bash">curl http://localhost:8080/auth/login \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"email": "seyi@example.com", "password": "securepassword"}'
</code></pre>
<p>Get all users:</p>
<pre><code class="language-bash">curl http://localhost:8080/users \
  -H "Authorization: Bearer eyJhbGci..."
</code></pre>
<p>Get a specific user:</p>
<pre><code class="language-bash">curl http://localhost:8080/users/{userId} \
  -H "Authorization: Bearer eyJhbGci..."
</code></pre>
<p>Create a profile:</p>
<pre><code class="language-bash">curl http://localhost:8080/users/{userId}/profile \
  -X POST \
  -H "Authorization: Bearer eyJhbGci..." \
  -H "Content-Type: application/json" \
  -d '{
    "bio": "Flutter engineer turned backend developer",
    "location": "Lagos, Nigeria",
    "website": "https://example.com"
  }'
</code></pre>
<p>Update a user:</p>
<pre><code class="language-bash">curl http://localhost:8080/users/{userId} \
  -X PUT \
  -H "Authorization: Bearer eyJhbGci..." \
  -H "Content-Type: application/json" \
  -d '{"firstName": "Oluwaseyi"}'
</code></pre>
<p>Delete a user:</p>
<pre><code class="language-bash">curl http://localhost:8080/users/{userId} \
  -X DELETE \
  -H "Authorization: Bearer eyJhbGci..."
</code></pre>
<h2 id="heading-deployment">Deployment</h2>
<p>With everything tested locally, the final step is getting the API live. Dart Frog makes this straightforward: a single CLI command generates a production-ready Dockerfile, and from there we deploy to Fly.io where the app will run as a containerized service alongside a managed PostgreSQL database.</p>
<h3 id="heading-production-build">Production Build</h3>
<p>Dart Frog generates a production-ready Docker setup with a single command:</p>
<pre><code class="language-bash">dart_frog build
</code></pre>
<p>This creates a build/ directory containing:</p>
<pre><code class="language-plaintext">build/
  bin/
    server.dart    ← compiled entry point
  Dockerfile       ← production Dockerfile
  pubspec.yaml
  pubspec.lock
</code></pre>
<p>The generated Dockerfile is a multi-stage build: it compiles to a native binary in the first stage and runs from a minimal Debian image in the second. 
You do not need to write this yourself.</p>
<h3 id="heading-deploying-to-flyio">Deploying to Fly.io</h3>
<p><strong>Step 1 — Authenticate:</strong></p>
<pre><code class="language-bash">fly auth login
</code></pre>
<p><strong>Step 2 — Launch from the build directory:</strong></p>
<pre><code class="language-bash">cd build
fly launch
</code></pre>
<p>Fly detects the Dockerfile and prompts for configuration. Create a PostgreSQL database when asked.</p>
<p><strong>Step 3 — Set secrets:</strong></p>
<pre><code class="language-bash">fly secrets set JWT_SECRET="your_production_jwt_secret"
fly secrets set JWT_EXPIRY_HOURS="24"
</code></pre>
<p><strong>Step 4 — Deploy:</strong></p>
<pre><code class="language-bash">fly deploy
</code></pre>
<p><strong>Step 5 — Verify:</strong></p>
<pre><code class="language-bash">curl https://your-app-name.fly.dev/auth/register \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"password123","firstName":"Seyi","lastName":"Dev"}'
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Dart Frog sits exactly where it positions itself: between the raw control of Shelf and the full opinions of Serverpod. It takes the file-based routing model that has proven itself in the JavaScript ecosystem and brings it to Dart cleanly, without compromising on the language's strengths.</p>
<p>The routing model is its strongest feature. Looking at the routes/ folder tells you everything about your API: what endpoints exist, how they are grouped, and which middleware applies to which sections. That transparency makes codebases easier to navigate, easier to onboard into, and easier to reason about as they grow.</p>
<p>The RequestContext and the provider pattern for dependency injection are well thought out. Middleware injects, routes consume, and nothing bleeds between the two. 
The folder-scoped middleware is particularly clean: protecting an entire section of your API is as simple as dropping a _middleware.dart file in the right folder.</p> <p>For Flutter engineers building APIs that need to serve multiple client types, conform to standard REST conventions, or integrate cleanly with existing frontend infrastructure, Dart Frog hits a practical sweet spot that neither Shelf nor Serverpod reaches as naturally.</p> <p>Dart is now a full-stack language in the truest sense. The same team, the same language, the same conventions – from the Flutter app to the server that powers it.</p> <p>Happy Coding!</p> </article> <article> <h1> How to Scale Laravel Applications for High-Traffic Production Systems </h1> <p>Olamilekan Lamidi — Thu, 11 Jun 2026 23:45:39 +0000</p> <p>Your first scaling problem rarely arrives with a bang. For a while, everything is fine: pages load fast, the database barely breaks a sweat, and the team ships features without thinking much about infrastructure.</p> <p>Then traffic climbs. A campaign over-performs. A marketplace onboards a popular seller. A SaaS product signs a couple of enterprise accounts.</p> <p>Suddenly, <code>/dashboard</code> takes two seconds instead of 300 milliseconds. Queue jobs that used to clear in seconds sit waiting for minutes. Database CPU spikes every afternoon.</p> <p>So you add another app server, and response time barely moves because the real culprit was a slow query on a large table all along.</p> <p>If you have run Laravel in production, you've probably lived some version of this. The good news is that scaling Laravel almost never means abandoning the framework. 
It means learning where pressure builds and making the application behave predictably under load.</p> <p>In this guide, you'll learn how to find common bottlenecks, tune the database, use Redis effectively, move slow work onto queues, optimize APIs, and monitor a Laravel application in production.</p> <p>None of this requires a single heroic rewrite. The biggest wins usually come from practical work: removing inefficient queries, pushing slow tasks onto queues, adding the right indexes, caching carefully chosen data, and measuring whether each change actually helped.</p> <h2 id="heading-prerequisites">Prerequisites</h2> <p>You'll get the most out of this guide if you're already comfortable with:</p> <ul> <li><p>Building applications with Laravel and PHP</p> </li> <li><p>Writing Eloquent queries and database migrations</p> </li> <li><p>Using queues, jobs, and scheduled commands</p> </li> <li><p>Reading a basic database query plan</p> </li> <li><p>Deploying Laravel to a production server or platform</p> </li> <li><p>Working with Redis and either MySQL or PostgreSQL in a production-like setup</p> </li> </ul> <h2 id="heading-table-of-contents">Table of Contents</h2> <ul> <li><p><a href="#heading-what-happens-when-laravel-apps-start-growing">What Happens When Laravel Apps Start Growing</a></p> </li> <li><p><a href="#heading-common-laravel-bottlenecks">Common Laravel Bottlenecks</a></p> </li> <li><p><a href="#heading-how-to-optimize-the-database">How to Optimize the Database</a></p> </li> <li><p><a href="#heading-how-to-scale-with-redis">How to Scale with Redis</a></p> </li> <li><p><a href="#heading-how-to-use-queue-driven-architectures">How to Use Queue-Driven Architectures</a></p> </li> <li><p><a href="#heading-how-to-optimize-api-performance">How to Optimize API Performance</a></p> </li> <li><p><a href="#heading-how-to-monitor-laravel-in-production">How to Monitor Laravel in Production</a></p> </li> <li><p><a 
href="#heading-an-example-high-traffic-laravel-architecture">An Example High-Traffic Laravel Architecture</a></p> </li> <li><p><a href="#heading-lessons-learned-the-hard-way">Lessons Learned the Hard Way</a></p> </li> <li><p><a href="#heading-a-pre-launch-scaling-checklist">A Pre-Launch Scaling Checklist</a></p> </li> <li><p><a href="#heading-conclusion">Conclusion</a></p> </li> <li><p><a href="#heading-references">References</a></p> </li> </ul> <h2 id="heading-what-happens-when-laravel-apps-start-growing">What Happens When Laravel Apps Start Growing</h2> <p>Traffic changes a system's behavior because it turns small inefficiencies into permanent costs. A query that takes 80 milliseconds is harmless when it runs a few hundred times an hour. Run it 30 times per page view on a page that gets thousands of hits a minute, and that same query becomes a capacity problem.</p> <p>The pressure tends to show up in predictable places. More requests mean more PHP workers, more database connections, more queue volume, and more Redis operations.</p> <p>The database, whether MySQL or PostgreSQL, is usually the first thing to buckle. Queues back up when work is created faster than workers can drain it. Caches only help when hit rates stay high and misses stay controlled. And scaling everything horizontally can turn sloppy code into an expensive cloud bill.</p> <p>That's why scaling work has to start with measurement, not guesswork. Before you change anything, you want to know what is actually saturated: request CPU, database I/O, lock contention, Redis latency, queue depth, an external API, or oversized payloads.</p> <p>A typical request in a growing Laravel app travels through several layers. The user sends a request, a load balancer routes it to an app server, and Laravel checks Redis for a cached result. On a miss, it queries the database, stores the computed result back in Redis, and hands any slow follow-up work to a queue. 
A worker picks up that job later while Laravel returns the response right away.</p> <p>Here's the important part: adding more app servers does nothing for a slow query, a missing index, or an overloaded queue. Horizontal scaling only pays off once the shared dependencies behind those servers can keep up.</p> <h2 id="heading-common-laravel-bottlenecks">Common Laravel Bottlenecks</h2> <p>Laravel itself causes very few scaling problems. Most issues come from how application code talks to the database, the network, and background workers.</p> <h3 id="heading-n1-queries">N+1 Queries</h3> <p>The classic offender is the N+1 query. You load a list of models, then lazily touch a relationship on each one:</p> <pre><code class="language-php">use App\Models\Post; $posts = Post::latest()->take(50)->get(); foreach ($posts as $post) { echo $post->author->name; } </code></pre> <p>That's one query for the posts plus one query per author: 51 queries for a single page. Eager load the relationship instead:</p> <pre><code class="language-php">use App\Models\Post; $posts = Post::with('author') ->latest() ->take(50) ->get(); foreach ($posts as $post) { echo $post->author->name; } </code></pre> <p>In production, these are sneaky. They often hide inside API Resources, Blade components, and authorization checks, where the relationship access isn't obvious from the controller.</p> <h3 id="heading-missing-indexes">Missing Indexes</h3> <p>Adding an index is one of the highest-return fixes you can make. Take a query like this:</p> <pre><code class="language-php">$orders = Order::where('account_id', $accountId) ->where('status', 'paid') ->whereBetween('created_at', [$start, $end]) ->latest() ->paginate(50); </code></pre> <p>If <code>orders</code> has millions of rows and no useful compound index, the database scans far more rows than it needs to. 
Add an index that matches how you actually query:</p> <pre><code class="language-php">use Illuminate\Database\Migrations\Migration; use Illuminate\Database\Schema\Blueprint; use Illuminate\Support\Facades\Schema; return new class extends Migration { public function up(): void { Schema::table('orders', function (Blueprint $table) { $table->index(['account_id', 'status', 'created_at']); }); } public function down(): void { Schema::table('orders', function (Blueprint $table) { $table->dropIndex(['account_id', 'status', 'created_at']); }); } }; </code></pre> <p>Indexes aren't free, though. They take up space and slow down writes. Add them for real, repeated query patterns, not for every column that ever appears in a <code>where</code> clause.</p> <h3 id="heading-inefficient-eager-loading">Inefficient Eager Loading</h3> <p>You can also swing too far the other way. Loading every relationship "just in case" burns memory and ships data the request never uses:</p> <pre><code class="language-php">$users = User::with([ 'profile', 'teams', 'roles.permissions', 'invoices.lineItems.product', ])->get(); </code></pre> <p>That might be fine for an admin detail page showing one user. On a list page, it's a liability. Constrain the eager loads and select only the columns you need:</p> <pre><code class="language-php">$users = User::query() ->select(['id', 'name', 'email']) ->with([ 'profile:id,user_id,avatar_url', 'teams:id,name', ]) ->latest() ->paginate(25); </code></pre> <p>One caveat: tightly scoped select lists can break later code that expects a column you didn't load. Keep this technique close to read-heavy endpoints where the payoff is obvious.</p> <h3 id="heading-synchronous-processing">Synchronous Processing</h3> <p>High-traffic apps need short web requests. Sending email, generating PDFs, calling third-party APIs, resizing images, and building exports usually belong outside the request cycle. 
This version can hurt you:</p> <pre><code class="language-php">public function store(StoreOrderRequest $request) { $order = Order::create($request->validated()); Mail::to($order->user)->send(new OrderReceipt($order)); return response()->json($order, 201); } </code></pre> <p>Push the work onto a queue instead:</p> <pre><code class="language-php">public function store(StoreOrderRequest $request) { $order = Order::create($request->validated()); SendOrderReceipt::dispatch($order->id); return response()->json([ 'id' => $order->id, 'status' => 'accepted', ], 202); } </code></pre> <p>Now your response time no longer depends on your mail provider. If the provider has a slow afternoon, the queue absorbs it and your users don't have to wait.</p> <h3 id="heading-large-payloads">Large Payloads</h3> <p>Oversized JSON responses hurt everyone in the chain: the app server serializing them, the network carrying them, and the client parsing them. A frequent mistake is returning whole models when you meant to return a summary:</p> <pre><code class="language-php">return User::with('orders', 'invoices', 'teams')->findOrFail($id); </code></pre> <p>Define an explicit API Resource instead:</p> <pre><code class="language-php">use Illuminate\Http\Resources\Json\JsonResource; class UserSummaryResource extends JsonResource { public function toArray($request): array { return [ 'id' => $this->id, 'name' => $this->name, 'avatar_url' => $this->profile?->avatar_url, 'plan' => $this->subscription_plan, ]; } } </code></pre> <p>A small, deliberate response contract keeps endpoint cost easy to reason about and prevents accidental coupling.</p> <h3 id="heading-expensive-joins">Expensive Joins</h3> <p>Joins are useful, but expensive joins across large tables can dominate your database time, especially when they sort or filter on columns that aren't indexed:</p> <pre><code class="language-php">$rows = DB::table('orders') ->join('users', 'users.id', '=', 'orders.user_id') ->join('accounts', 'accounts.id', '=', 
'users.account_id') ->where('accounts.region', 'us-east') ->where('orders.status', 'paid') ->orderByDesc('orders.created_at') ->limit(100) ->get(); </code></pre> <p>At scale, you may need to denormalize a small field, precompute a reporting table, or move analytics off the primary transactional database entirely. Do not treat denormalization as an admission of defeat. Copying a stable field like <code>account_id</code> onto <code>orders</code> can remove a costly join from a hot path. The price you pay is keeping that duplicated data consistent, which can be a worthwhile trade-off.</p> <h2 id="heading-how-to-optimize-the-database">How to Optimize the Database</h2> <p>When a Laravel app slows down, the database is usually the first place to look.</p> <h3 id="heading-add-indexes-around-real-query-patterns">Add Indexes Around Real Query Patterns</h3> <p>Start with your slow query log, database metrics, and traces rather than intuition. If the app constantly looks up active subscriptions by account, build a compound index that matches that access pattern:</p> <pre><code class="language-php">Schema::table('subscriptions', function (Blueprint $table) { $table->index(['account_id', 'status', 'renews_at']); }); </code></pre> <p>Then write the query so it can actually use the index:</p> <pre><code class="language-php">$subscription = Subscription::where('account_id', $accountId) ->where('status', 'active') ->where('renews_at', '>=', now()) ->orderBy('renews_at') ->first(); </code></pre> <p>Get in the habit of running <code>EXPLAIN</code> after you add an index to confirm that the plan changed. An index the optimizer ignores is just write overhead.</p> <h3 id="heading-use-eager-loading-deliberately">Use Eager Loading Deliberately</h3> <p>Match eager loading to what the endpoint actually returns. 
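</p> <p>To notice when an endpoint touches a relationship it never eager loaded, one option is to make lazy loading fail loudly outside production. This is a sketch that assumes a standard <code>App\Providers\AppServiceProvider</code>. Whether to throw or only log is a decision for your team.</p> <pre><code class="language-php">namespace App\Providers;

use Illuminate\Database\Eloquent\Model;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        // Throw on lazy loading everywhere except production,
        // so a missing eager load surfaces in development and tests.
        Model::preventLazyLoading(! $this->app->isProduction());
    }
}
</code></pre> <p>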
For list endpoints, keep relationships shallow and constrained:</p> <pre><code class="language-php">$projects = Project::query() ->select(['id', 'account_id', 'name', 'updated_at']) ->withCount('openTasks') ->with([ 'owner:id,name', ]) ->where('account_id', $accountId) ->latest('updated_at') ->paginate(30); </code></pre> <p>When you only need a number, <code>withCount</code> beats loading a whole relationship to count it:</p> <pre><code class="language-php">$teams = Team::query() ->withCount([ 'members', 'invitations as pending_invitations_count' => fn ($query) => $query->whereNull('accepted_at'), ]) ->paginate(25); </code></pre> <p>Your memory footprint stays flat, which matters much more on a list page than on a detail page.</p> <h3 id="heading-optimize-queries-before-adding-hardware">Optimize Queries Before Adding Hardware</h3> <p>A bigger database instance buys you time. It also hides the inefficient queries that put you there until the next traffic jump exposes them again. Before you reach for a larger machine, find your highest-cost queries. In local or staging environments, logging slow ones is easy:</p> <pre><code class="language-php">use Illuminate\Database\Events\QueryExecuted; use Illuminate\Support\Facades\DB; use Illuminate\Support\Facades\Log; DB::listen(function (QueryExecuted $query) { if ($query->time > 100) { Log::warning('Slow query detected', [ 'sql' => $query->toRawSql(), 'time_ms' => $query->time, ]); } }); </code></pre> <p>Be careful doing this in production. 
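</p> <p>If you do need a version of this in production, a more conservative sketch logs only the parameterized SQL (placeholders, not values) and raises the threshold. The 500 ms threshold here is an arbitrary assumption, not a recommendation:</p> <pre><code class="language-php">use Illuminate\Database\Events\QueryExecuted;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;

DB::listen(function (QueryExecuted $query) {
    // Assumed threshold: only log queries slower than 500 ms.
    if ($query->time > 500) {
        Log::warning('Slow query detected', [
            // $query->sql keeps the ? placeholders, so bound values are not written to the log.
            'sql' => $query->sql,
            'connection' => $query->connectionName,
            'time_ms' => $query->time,
        ]);
    }
});
</code></pre> <p>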
Bindings can contain sensitive data, and verbose logging at high volume can become its own performance problem.</p> <h3 id="heading-process-large-tables-with-chunking">Process Large Tables with Chunking</h3> <p>Never pull an entire large table into memory for a batch job:</p> <pre><code class="language-php">User::where('is_active', true) ->chunkById(1000, function ($users) { foreach ($users as $user) { RefreshUserSearchIndex::dispatch($user->id); } }); </code></pre> <p><code>chunkById</code> is safer than offset-based chunking when rows can change while the job runs, because it tracks the last seen ID instead of a numeric offset. For very large exports, stream the records or write them out in batches.</p> <h3 id="heading-use-cursor-pagination-for-high-volume-feeds">Use Cursor Pagination for High-Volume Feeds</h3> <p>Offset pagination gets slower the deeper a user scrolls, because the database still has to skip every row it's not returning. For feeds, audit logs, messages, and timelines, cursor pagination is usually the better fit:</p> <pre><code class="language-php">$events = AuditEvent::query() ->where('account_id', $accountId) ->orderByDesc('id') ->cursorPaginate(50); return AuditEventResource::collection($events); </code></pre> <p>It relies on a stable, indexed ordering column and uses next/previous cursors rather than arbitrary page numbers, which is what an infinite-scroll feed usually needs.</p> <h3 id="heading-split-reads-with-read-replicas">Split Reads with Read Replicas</h3> <p>As read traffic grows, replicas can take load off the primary:</p> <pre><code class="language-php">'mysql' => [ 'driver' => 'mysql', 'read' => [ 'host' => [ env('DB_READ_HOST', '127.0.0.1'), ], ], 'write' => [ 'host' => [ env('DB_WRITE_HOST', '127.0.0.1'), ], ], 'sticky' => true, 'database' => env('DB_DATABASE', 'laravel'), 'username' => env('DB_USERNAME', 'root'), 'password' => env('DB_PASSWORD', ''), ], </code></pre> <p>The <code>sticky</code> option keeps reads on the write 
connection after a write within the same request, which helps avoid some read-after-write surprises.</p> <p>Replicas come with replication lag, and that lag matters. Don't route payment confirmations, password changes, permission checks, or anything else consistency-sensitive to a replica that might be a few seconds stale unless the business flow can genuinely tolerate seeing old data.</p> <h2 id="heading-how-to-scale-with-redis">How to Scale with Redis</h2> <p>Redis often does a lot in a Laravel production stack: caching, sessions, rate limiting, queues, locks, and Horizon metrics. It's fast, but it still needs thought: sensible key design, expiration policies, memory monitoring, and a real plan for invalidation.</p> <h3 id="heading-caching">Caching</h3> <p>Cache expensive reads that get requested often and can tolerate being slightly out of date:</p> <pre><code class="language-php">use Illuminate\Support\Facades\Cache; $stats = Cache::remember( "accounts:{$account->id}:dashboard-stats", now()->addMinutes(5), fn () => DashboardStats::forAccount($account)->calculate() ); </code></pre> <p>Short time-to-live values go a surprisingly long way. A five-minute cache can wipe out thousands of duplicate queries while keeping the data fresh enough for most dashboards.</p> <p>When the data changes after a known event, invalidate it explicitly:</p> <pre><code class="language-php">Order::created(function (Order $order) { Cache::forget("accounts:{$order->account_id}:dashboard-stats"); }); </code></pre> <p>Caching works best when your keys are predictable and your invalidation is tied to domain events rather than guesswork.</p> <h3 id="heading-sessions">Sessions</h3> <p>For horizontally scaled app servers, file-based sessions are a trap: the next request can land on a different server that has never seen the session. 
Store sessions in Redis or a database so any server can handle any request:</p> <pre><code class="language-env">SESSION_DRIVER=redis
CACHE_STORE=redis
QUEUE_CONNECTION=redis
</code></pre> <h3 id="heading-rate-limiting">Rate Limiting</h3> <p>Rate limits protect you from abusive clients, runaway loops, and endpoints that get hammered:</p> <pre><code class="language-php">use Illuminate\Cache\RateLimiting\Limit; use Illuminate\Http\Request; use Illuminate\Support\Facades\RateLimiter; RateLimiter::for('api', function (Request $request) { return Limit::perMinute(120)->by( optional($request->user())->id ?: $request->ip() ); }); </code></pre> <p>Expensive endpoints deserve stricter limits:</p> <pre><code class="language-php">RateLimiter::for('exports', function (Request $request) { return Limit::perHour(10)->by($request->user()->id); }); </code></pre> <p>Let business cost drive the numbers. Login, search, export, and webhook endpoints rarely need the same limit.</p> <h3 id="heading-queues">Queues</h3> <p>Redis is a common queue backend because it's quick and Horizon supports it well:</p> <pre><code class="language-env">QUEUE_CONNECTION=redis </code></pre> <p>Dispatch work onto named queues from the request:</p> <pre><code class="language-php">GenerateInvoicePdf::dispatch($invoice->id) ->onQueue('documents'); </code></pre> <p>Split work by profile, such as <code>default</code>, <code>emails</code>, <code>webhooks</code>, <code>documents</code>, and <code>imports</code>, because each workload can need different worker counts and retry rules. Keep the names meaningful. During an incident, "the documents queue is 20 minutes behind" tells you far more than "default is slow."</p> <h2 id="heading-how-to-use-queue-driven-architectures">How to Use Queue-Driven Architectures</h2> <p>Queues are one of Laravel's best scaling tools. They let the app accept work quickly and process it asynchronously with controlled concurrency. 
They also make the system more resilient: when a third-party API goes down, jobs retry on their own instead of tying up your PHP-FPM request workers.</p> <h3 id="heading-laravel-queues">Laravel Queues</h3> <p>A good job is small, idempotent, and safe to retry:</p> <pre><code class="language-php">use App\Mail\OrderReceiptMail; use App\Models\Order; use Illuminate\Contracts\Queue\ShouldQueue; use Illuminate\Foundation\Queue\Queueable; use Illuminate\Support\Facades\Mail; class SendOrderReceipt implements ShouldQueue { use Queueable; public int $tries = 3; public int $backoff = 60; public function __construct(public int $orderId) { } public function handle(): void { $order = Order::with('user')->findOrFail($this->orderId); Mail::to($order->user)->send(new OrderReceiptMail($order)); } } </code></pre> <p>Pass IDs into jobs rather than full Eloquent models. The model might change before the job runs, and serializing a whole model bloats the payload. For external APIs, add timeouts and guard against duplicate work:</p> <pre><code class="language-php">use App\Models\Order; use App\Services\CrmClient; use Illuminate\Contracts\Queue\ShouldQueue; use Illuminate\Foundation\Queue\Queueable; class SyncOrderToCrm implements ShouldQueue { use Queueable; public int $tries = 3; public int $backoff = 60; public function __construct(public int $orderId) { } public function handle(CrmClient $crm): void { $order = Order::findOrFail($this->orderId); if ($order->crm_synced_at) { return; } $crm->upsertOrder($order->external_reference, [ 'total' => $order->total, 'status' => $order->status, ]); $order->forceFill(['crm_synced_at' => now()])->save(); } } </code></pre> <p>The <code>crm_synced_at</code> check is the whole point. Jobs run more than once in real life, and idempotency is what keeps a retry from double-charging or double-syncing.</p> <h3 id="heading-horizon">Horizon</h3> <p>Horizon gives you visibility and control over Redis queues. 
A typical setup runs different supervisors for different workloads:</p> <pre><code class="language-php">'production' => [ 'supervisor-default' => [ 'connection' => 'redis', 'queue' => ['default', 'emails'], 'balance' => 'auto', 'maxProcesses' => 20, 'tries' => 3, ], 'supervisor-documents' => [ 'connection' => 'redis', 'queue' => ['documents'], 'balance' => 'simple', 'maxProcesses' => 5, 'tries' => 2, 'timeout' => 300, ], ], </code></pre> <p>The separation matters: a long-running document job shouldn't starve a quick password-reset email.</p> <h3 id="heading-failed-jobs-and-retries">Failed Jobs and Retries</h3> <p>Retries only help when failures are temporary. Retrying a job that's permanently broken just burns capacity. For jobs with a business deadline, use <code>retryUntil</code>:</p> <pre><code class="language-php">use DateTime; use Throwable; public function retryUntil(): DateTime { return now()->addMinutes(30); } public function failed(Throwable $exception): void { ImportBatch::whereKey($this->batchId)->update([ 'status' => 'failed', 'failed_reason' => $exception->getMessage(), ]); } </code></pre> <p>Use <code>failed</code> to flag the problem somewhere a human will see it. Whatever you do, don't set unlimited retries on jobs that hit a third-party service.</p> <h3 id="heading-queue-monitoring">Queue Monitoring</h3> <p>Track queue depth, wait time, failure rate, and processing time together. Depth alone can mislead you. When depth starts climbing, walk through it methodically: are workers keeping pace with incoming jobs? If the queue keeps growing, check how long individual jobs take. If the slow part is the database, fix the query or dial back worker concurrency. If it's an external API, add backoff or a circuit breaker. If the work is CPU-bound, scale workers or break the jobs into smaller pieces.</p> <p>Be careful with the "scale workers" instinct, though. Adding more workers without checking the database first can make an incident worse. 
More workers mean more concurrent queries, more locks, and more pressure on the primary exactly when it's already struggling.</p> <h2 id="heading-how-to-optimize-api-performance">How to Optimize API Performance</h2> <p>APIs earn special attention because clients call them repeatedly and payloads tend to grow quietly over months.</p> <h3 id="heading-api-resources">API Resources</h3> <p>Resources keep your response shape intentional:</p> <pre><code class="language-php">class OrderResource extends JsonResource { public function toArray($request): array { return [ 'id' => $this->id, 'status' => $this->status, 'total' => $this->total, 'placed_at' => $this->created_at->toIso8601String(), 'customer' => new CustomerSummaryResource($this->whenLoaded('customer')), ]; } } </code></pre> <p><code>whenLoaded</code> is doing real work here. It stops the resource from quietly triggering a lazy query when the relationship wasn't eager loaded:</p> <pre><code class="language-php">$orders = Order::query() ->with('customer:id,name') ->where('account_id', $accountId) ->latest() ->paginate(50); return OrderResource::collection($orders); </code></pre> <h3 id="heading-pagination">Pagination</h3> <p>Returning unbounded collections is an easy way to create an API performance problem you won't notice until a client has a lot of data:</p> <pre><code class="language-php">$perPage = min((int) request('per_page', 50), 100); $orders = Order::where('account_id', $accountId) ->latest() ->paginate($perPage); </code></pre> <p>Cap the page size. If a client genuinely needs every record for an export, make that an async job rather than a giant synchronous response.</p> <h3 id="heading-response-optimization">Response Optimization</h3> <p>Stop returning fields nobody reads. 
On read-heavy endpoints, selecting only the columns you need cuts both database I/O and serialization cost:</p> <pre><code class="language-php">$products = Product::query() ->select(['id', 'name', 'slug', 'price', 'thumbnail_url']) ->where('is_visible', true) ->orderBy('name') ->paginate(40); </code></pre> <p>It's also worth turning on compression at the web server or load balancer. JSON compresses extremely well, and that's often a small config change with a real bandwidth payoff.</p> <h3 id="heading-rate-limiting">Rate Limiting</h3> <p>Design API rate limits around identity and endpoint cost:</p> <pre><code class="language-php">Route::middleware(['auth:sanctum', 'throttle:api']) ->group(function () { Route::get('/orders', [OrderController::class, 'index']); Route::post('/exports/orders', [OrderExportController::class, 'store']) ->middleware('throttle:exports'); }); </code></pre> <p>This keeps casual browsing and expensive exports under separate policies, so one heavy user can't squeeze out everyone else.</p> <h3 id="heading-caching-api-responses">Caching API Responses</h3> <p>Cache responses that are expensive to compute and can tolerate being a little stale:</p> <pre><code class="language-php">public function index(Request $request) { $accountId = $request->user()->account_id; $page = $request->integer('page', 1); $cacheKey = "api:accounts:{$accountId}:orders:v1:page:{$page}"; return Cache::remember($cacheKey, now()->addSeconds(60), function () use ($accountId) { return OrderResource::collection( Order::with('customer:id,name') ->where('account_id', $accountId) ->latest() ->paginate(50) )->response()->getData(true); }); } </code></pre> <p>Notice the <code>v1</code> in the key. Bumping that version number lets you invalidate an entire response format at once when the shape changes. 
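</p> <p>To keep the version and the tenant scope from drifting between endpoints, you could build keys in one place. The helper below is hypothetical (the function name and its parameters are not part of Laravel) and is written as plain PHP so it runs on its own:</p> <pre><code class="language-php">// Hypothetical helper: builds a tenant-scoped, versioned cache key.
function apiCacheKey(int $accountId, string $resource, int $version, array $params = []): string
{
    // Sort parameters so the same filters always produce the same key.
    ksort($params);

    $key = "api:accounts:{$accountId}:{$resource}:v{$version}";

    foreach ($params as $name => $value) {
        $key .= ":{$name}:{$value}";
    }

    return $key;
}

echo apiCacheKey(42, 'orders', 1, ['page' => 3]);
// api:accounts:42:orders:v1:page:3
</code></pre> <p>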
Always scope the key to the tenant or user for anything that's not truly global.</p> <h2 id="heading-how-to-monitor-laravel-in-production">How to Monitor Laravel in Production</h2> <p>The teams that catch problems before customers do are the ones collecting signals from everywhere: Laravel, queues, the database, Redis, the infrastructure, and external services.</p> <p>Laravel gives you several good starting points. Horizon shows queue throughput, failed jobs, wait times, and worker balancing. Telescope surfaces request details, queries, exceptions, jobs, mail, and cache events. Your logs capture slow operations, unexpected retries, and external failures. Your metrics track latency, error rate, queue depth, job runtime, database CPU, lock waits, cache hit ratio, and Redis memory. Your alerting ties all of it back to something a customer would actually feel.</p> <p>That last part is where teams often make mistakes. The best alerts are about symptoms, not machines being busy: p95 API latency over 800ms for 10 minutes, checkout error rate above 1%, the emails queue waiting more than 5 minutes, database CPU over 85% with slow queries rising, Redis memory over 80%, or failed payment webhooks crossing a threshold.</p> <p>A useful mental model is this: logs tell you what happened, metrics tell you whether the system is healthy, and traces tell you where the time went. 
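</p> <p>Since the alert examples above lean on percentiles, it helps to be precise about what p95 means. As a rough sketch in plain PHP (the function name is hypothetical, and it uses the simple nearest-rank method rather than whatever your metrics backend implements), p95 is the value that 95% of samples fall at or below:</p> <pre><code class="language-php">// Hypothetical helper: nearest-rank percentile over a list of samples.
function percentile(array $samples, float $p): float
{
    if ($samples === []) {
        throw new InvalidArgumentException('No samples given.');
    }

    sort($samples);

    // Nearest rank: the smallest position that covers at least $p percent of the samples.
    $rank = (int) ceil($p * count($samples) / 100);

    return (float) $samples[max($rank, 1) - 1];
}

$latenciesMs = [120, 95, 310, 880, 140, 105, 99, 130, 115, 101];

echo percentile($latenciesMs, 95), PHP_EOL; // 880
echo percentile($latenciesMs, 50), PHP_EOL; // 115
</code></pre> <p>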
In practice, wrapping your expensive business operations in a bit of instrumentation pays off quickly:</p> <pre><code class="language-php">use Illuminate\Support\Facades\Log; $startedAt = microtime(true); $report = $builder->forAccount($account)->build(); Log::info('Billing report generated', [ 'account_id' => $account->id, 'duration_ms' => (int) ((microtime(true) - $startedAt) * 1000), 'invoice_count' => $report->invoiceCount(), ]); </code></pre> <p>When something is failing at 2am, a log line like that can tell you which account, import, or report is causing the pressure.</p> <p>One more thing worth internalizing: monitor wait time, not just throughput. A queue can process thousands of jobs a minute and still be unhealthy if important jobs sit waiting too long before they start. Users feel the wait, not the throughput.</p> <h2 id="heading-an-example-high-traffic-laravel-architecture">An Example High-Traffic Laravel Architecture</h2> <p>A high-traffic Laravel setup generally separates four things: stateless web requests, shared cache and session storage, asynchronous workers, and database roles.</p> <p>Users hit a load balancer, which spreads traffic across a fleet of stateless Laravel app servers. Those servers use Redis for cache, sessions, rate limits, queues, and Horizon data. Queue workers handle slow or unreliable work off to the side. A MySQL primary takes all writes and any consistency-sensitive reads, while a read replica absorbs read-heavy endpoints that can tolerate some replication lag.</p> <p>The flow looks like this:</p> <pre><code class="language-text">Users
  -> Load balancer
  -> Stateless Laravel app servers
  -> Redis for cache, sessions, rate limits, queues, and Horizon data
  -> Primary database for writes and consistency-sensitive reads
  -> Read replica for safe read-heavy endpoints

Redis queue
  -> Queue workers
  -> Database, external APIs, mail providers, object storage, and other services
</code></pre> <p>This isn't the only valid shape. 
PostgreSQL can stand in for MySQL, Amazon SQS can replace Redis queues, a CDN can serve static assets and cache public responses, and object storage should hold user uploads. The principle that matters is that each layer has one clear job and can be scaled or tuned on its own.</p> <p>The flip side of stateless app servers is that anything a user needs after the request ends has to live in shared storage. Uploads, generated files, and session state shouldn't sit on a single server's local disk, or they may disappear from the user's point of view when the load balancer sends the next request somewhere else.</p> <h2 id="heading-lessons-learned-the-hard-way">Lessons Learned the Hard Way</h2> <h3 id="heading-1-premature-optimization">1. Premature Optimization</h3> <p>This usually shows up as elaborate infrastructure built before the app has any real visibility into itself.</p> <p>The practical path works better: measure, rank the bottlenecks, fix the biggest one, repeat. For most Laravel apps, the first round of scaling is mostly indexes, N+1 fixes, queue separation, and trimming payloads.</p> <h3 id="heading-2-over-caching">2. Over-caching</h3> <p>Caching can make a system faster and harder to reason about at the same time. One team cached an account-settings response for 30 minutes, then later folded role changes into that same response. The result was that users who had just lost access could still see features until the cache expired.</p> <p>The fix was splitting stable account metadata away from permission-sensitive state. The lesson is to avoid caching authorization data unless you have thought carefully about invalidation.</p> <h3 id="heading-3-missing-indexes">3. Missing Indexes</h3> <p>These hide until a table crosses a size threshold. A query that scanned 20,000 rows in development can scan 20 million in production. 
Bake index review into feature work, and plan big index migrations carefully so they don't lock a hot table at the worst possible time.</p> <h3 id="heading-4-queue-overload">4. Queue Overload</h3> <p>Queues don't remove work, they move it. The classic failure is letting one noisy workload block everything else. A big CSV import floods the default queue, and password-reset emails get stuck behind it. Separate queues are cheap insurance against that entire class of incident.</p> <h3 id="heading-5-large-transactions">5. Large Transactions</h3> <p>Long transactions hold locks longer and make failures more expensive. Dispatching a job inside a transaction is especially risky because a worker can grab it before the transaction commits:</p> <pre><code class="language-php">DB::transaction(function () use ($request) { $order = Order::create([...]); $order->items()->createMany($request->items); GenerateInvoicePdf::dispatch($order->id); SyncOrderToCrm::dispatch($order->id); }); </code></pre> <p>Use after-commit dispatching for any job that depends on committed data:</p> <pre><code class="language-php">GenerateInvoicePdf::dispatch($order->id)->afterCommit(); SyncOrderToCrm::dispatch($order->id)->afterCommit(); </code></pre> <p>Keep transactions scoped to the data that genuinely has to change atomically, and nothing more.</p> <h3 id="heading-6-treating-symptoms-as-causes">6. Treating Symptoms as Causes</h3> <p>This is the expensive one. If latency is high because an endpoint runs 300 queries, adding app servers adds database pressure. If jobs are slow because an external API is rate-limiting you, adding workers multiplies the failures.</p> <p>Good scaling work keeps asking the same questions: What resource is saturated? Which endpoint, job, tenant, or query is causing it? Is this work necessary during the request? Can I reduce it, defer it, cache it, or isolate it? 
How will I know whether the change helped?</p> <h2 id="heading-a-pre-launch-scaling-checklist">A Pre-Launch Scaling Checklist</h2> <p>Run through this before a big launch, a traffic campaign, or an enterprise rollout.</p> <p><strong>Application and runtime:</strong> Cache config, routes, and views during deploy. Set <code>APP_DEBUG=false</code>. Turn on OPcache. Keep web requests short and move slow work to queues. Store uploads in object storage, not on app-server disk. Keep servers stateless. Set timeouts on every external HTTP call.</p> <p><strong>Database:</strong> Review slow query logs first. Add indexes for your high-volume filters, joins, and ordering. Hunt for N+1 queries in controllers, resources, policies, and views. Paginate every list endpoint. Use <code>chunkById</code> or cursors for batch work. Avoid long transactions and external calls inside transactions. Confirm your backup and restore process works. Test stale-read behavior if you use replicas.</p> <p><strong>Redis and cache:</strong> Use Redis for cache, sessions, rate limiting, and queues where it fits. Set TTLs unless you have a clear reason not to. Include tenant, user, locale, and version in keys when relevant. Watch memory and the eviction policy. Avoid caching permission-sensitive responses without careful invalidation. Guard against cache stampedes on expensive recomputation.</p> <p><strong>Queues:</strong> Separate queues by workload. Configure Horizon supervisors per queue. Set timeouts, retries, and backoff on purpose. Make jobs idempotent where you can. Use <code>afterCommit</code> for jobs that depend on committed data. Monitor wait time, runtime, failures, and retries. Review failed jobs instead of ignoring them.</p> <p><strong>APIs:</strong> Use Resources to control response shape. Cap <code>per_page</code>. Use cursor pagination for big feeds and logs. Cache expensive reads with safe, versioned keys and short TTLs. Apply rate limits by endpoint cost. 
Don't return raw Eloquent models. Compress responses at the edge.</p> <p><strong>Observability:</strong> Track p50, p95, and p99 latency on the endpoints that matter. Track error rates by route and job class. Alert on queue wait time, not just size. Watch database CPU, connections, slow queries, and lock waits. Watch Redis memory, latency, and evictions. Log important business operations with durations and identifiers. Test your alerts before launch night because a silent alert is worse than no alert.</p> <h2 id="heading-conclusion">Conclusion</h2> <p>Laravel runs high-traffic production systems well when you design around the real costs of data, concurrency, and external dependencies. Just make sure you measure before you optimize, because guessing wastes time and tends to complicate the wrong layer.</p> <p>Fix the database first: indexes, query shape, pagination, and eager loading usually deliver the biggest early wins. Lean on queues to keep requests fast and push slow work into controlled background workers. Cache deliberately, with clear keys, sane TTLs, and a plan for invalidation. Keep watching latency, errors, queue wait time, database health, Redis memory, and your external dependencies.</p> <p>The best scaling work is practical and repeatable. You study the system you actually have, remove waste, isolate slow parts, and give yourself enough visibility to make the next change with confidence. 
Do that on a loop, and you rarely need the big rewrite.</p> <h2 id="heading-references">References</h2> <ul> <li><p><a href="https://laravel.com/docs/eloquent-relationships">Laravel documentation: Eloquent relationships</a></p> </li> <li><p><a href="https://laravel.com/docs/queries">Laravel documentation: Database queries</a></p> </li> <li><p><a href="https://laravel.com/docs/cache">Laravel documentation: Cache</a></p> </li> <li><p><a href="https://laravel.com/docs/queues">Laravel documentation: Queues</a></p> </li> <li><p><a href="https://laravel.com/docs/redis">Laravel documentation: Redis</a></p> </li> <li><p><a href="https://laravel.com/docs/routing#rate-limiting">Laravel documentation: Rate limiting</a></p> </li> <li><p><a href="https://laravel.com/docs/eloquent-resources">Laravel documentation: Eloquent API resources</a></p> </li> <li><p><a href="https://laravel.com/docs/horizon">Laravel Horizon documentation</a></p> </li> <li><p><a href="https://laravel.com/docs/telescope">Laravel Telescope documentation</a></p> </li> <li><p><a href="https://dev.mysql.com/doc/refman/8.4/en/optimization.html">MySQL documentation: Optimization</a></p> </li> <li><p><a href="https://redis.io/docs/latest/">Redis documentation</a></p> </li> </ul> </article> <article> <h1> How to Start your Career in Tech with freeCodeCamp - Full Talk in Spanish </h1> <p>Estefania Cassingena Navone — Thu, 11 Jun 2026 15:03:17 +0000</p> <p>Technology is rapidly reshaping the world. Understanding how to use free learning resources and contribute to open source projects can be very helpful to start your career in this field.</p> <p>We just published a talk on the freeCodeCamp Spanish YouTube channel about how to leverage freeCodeCamp's free learning resources and community to start your career in technology. 
You’ll learn how to find these resources and the core concepts that you need to know to start contributing to open source.</p> <p>If you have Spanish-speaking friends, you're welcome to share the <a href="https://www.freecodecamp.org/espanol/news/como-iniciar-tu-carrera-en-tecnologia-con-freecodecamp">Spanish version of this article</a> with them.</p> <p>This talk was presented by Estefania (me!). I'm part of the freeCodeCamp team. I develop educational content and manage the freeCodeCamp Spanish YouTube channel. I love helping others learn and grow professionally. I gave this virtual talk for the Escuela Superior Politécnica del Litoral, located in Guayaquil, Ecuador.</p> <h2 id="heading-why-consider-a-career-in-tech">Why Consider a Career in Tech?</h2> <p>Before we dive into the content of the talk, let's see what careers in technology involve, and why you should start now if having a career in this field is your goal.</p> <p>Technology careers can be very interesting because they give you the opportunity to solve real-world problems. Being part of the technology field means that you'll have the chance to help shape the future of humanity. By joining the freeCodeCamp community, you'll find the support and the practical tools you need to get started.</p> <p>Learning to code and contributing to open source projects, like freeCodeCamp, are two fundamental steps for reaching your goal. Contributing to open source projects develops your technical skills and connects you with a global network of developers who share your interests and goals. By contributing to open source, you’ll gain hands-on experience, improve your portfolio, and increase your chances of landing your first job.</p> <h2 id="heading-what-youll-learn-during-the-talk">What You'll Learn During the Talk</h2> <p>Great. 
Let's see what you’ll learn during the talk:</p> <ul> <li><p>The story and mission of freeCodeCamp.org.</p> </li> <li><p>How to use freeCodeCamp’s free learning resources to learn programming.</p> </li> <li><p>freeCodeCamp's certifications.</p> </li> <li><p>freeCodeCamp's daily coding challenges.</p> </li> <li><p>freeCodeCamp's catalog and forums.</p> </li> <li><p>freeCodeCamp's YouTube channels with full courses.</p> </li> <li><p>freeCodeCamp's publication.</p> </li> <li><p>Additional free learning resources provided by freeCodeCamp.</p> </li> <li><p>How to contribute to open source projects and why this is important.</p> </li> <li><p>Common terminology used in open source projects.</p> </li> <li><p>Personal tips for getting started in your first job.</p> </li> <li><p>How to join the freeCodeCamp community.</p> </li> </ul> <p>By the end of the talk, you’ll know how to find and leverage freeCodeCamp's free learning resources to learn programming and start your career.</p> <h2 id="heading-talk-on-youtube">Talk on YouTube</h2> <p>Check out the talk on the freeCodeCamp Spanish YouTube channel:</p> <div class="embed-wrapper"></div> <p>✍️ Talk presented by Estefania Cassingena Navone.</p> <ul> <li>YouTube: <a href="https://www.youtube.com/@freecodecampes">https://www.youtube.com/@freecodecampes</a></li> </ul> <p>Collaborating with:</p> <ul> <li><p>Canal del Capítulo (IEEE Computer Society ESPOL): <a href="https://www.youtube.com/@ieeeespolcomputersociety7497">https://www.youtube.com/@ieeeespolcomputersociety7497</a></p> </li> <li><p>Canal de la Rama Estudiantil (IEEE ESPOL): <a href="https://www.youtube.com/@ramaestudiantilieee-espol84">https://www.youtube.com/@ramaestudiantilieee-espol84</a></p> </li> </ul> </article> <article> <h1> Web Scraping for Beginners 2026 </h1> <p>Beau Carnes — Wed, 10 Jun 2026 02:16:49 +0000</p> <p>If you have ever wanted to collect product data, monitor competitors, track SEO rankings, or build AI tools that pull information from the 
internet, you have likely run into the common frustrations of web scraping: broken scripts, rate limits, bot detection, and tedious CAPTCHAs.</p> <p>We just published a new tutorial on the freeCodeCamp.org YouTube channel, featuring software developer and course creator Ania Kubow.</p> <p>In this comprehensive, beginner-friendly course, Ania teaches you a much simpler, more efficient approach. Instead of building scrapers from scratch, you will learn how to leverage an API to handle the heavy lifting for you.</p> <p>Throughout this tutorial, you will master the following:</p> <ul> <li><p>How to bypass web scraping obstacles like bot protection and rate limits using a powerful API.</p> </li> <li><p>How to extract structured JSON data directly from search engines like Google, Amazon, YouTube, and more.</p> </li> <li><p>How to use the Google Lens API to scrape images and visual matches.</p> </li> <li><p>How to build your own functional web application that searches for and downloads content locally to your computer.</p> </li> </ul> <p>By the end of this video, you will have the knowledge and the basic code necessary to turn internet data into actionable insights for your own projects.</p> <p>Watch the full tutorial on <a href="https://youtu.be/j6hnjNhx_MM">the freeCodeCamp.org YouTube channel</a> (1-hour watch).</p> <div class="embed-wrapper"></div> </article> <article> <h1> How to Build a PostgreSQL-Backed Job Queue in Go </h1> <p>timothy ogbemudia — Tue, 09 Jun 2026 23:21:55 +0000</p> <p>When you build a web application, not every task should happen inside a user's request.</p> <p>Some work is slow. Some work can fail. Some work should happen later. 
Sending emails, resizing images, processing webhooks, generating reports, and retrying third-party API calls are all good examples.</p> <p>These tasks are usually handled by a background job system.</p> <p>In this article, you'll use an open source Go project called <a href="https://github.com/glamboyosa/swig">Swig</a> as a practical example of how a PostgreSQL-backed job queue works.</p> <p>By the end, you'll understand how to build a background job queue with Go and PostgreSQL, and why PostgreSQL is more capable than most developers realize.</p> <h2 id="heading-table-of-contents">Table of Contents</h2> <ol> <li><p><a href="#heading-prerequisites">Prerequisites</a></p> </li> <li><p><a href="#heading-what-you-will-learn">What You Will Learn</a></p> </li> <li><p><a href="#heading-what-is-a-job-queue">What Is a Job Queue?</a></p> </li> <li><p><a href="#heading-why-use-postgresql-for-a-queue">Why Use PostgreSQL for a Queue?</a></p> </li> <li><p><a href="#heading-swigs-architecture">Swig's Architecture</a></p> </li> <li><p><a href="#heading-how-to-represent-jobs-in-postgresql">How to Represent Jobs in PostgreSQL</a></p> </li> <li><p><a href="#heading-how-to-define-a-worker-in-go">How to Define a Worker in Go</a></p> </li> <li><p><a href="#heading-how-to-register-workers-without-sharing-state">How to Register Workers Without Sharing State</a></p> </li> <li><p><a href="#heading-how-to-add-a-job">How to Add a Job</a></p> </li> <li><p><a href="#heading-how-to-handle-multiple-workers-safely">How to Handle Multiple Workers Safely</a></p> </li> <li><p><a href="#heading-how-to-use-goroutines-for-concurrent-workers">How to Use Goroutines for Concurrent Workers</a></p> </li> <li><p><a href="#heading-how-to-wake-workers-with-listennotify">How to Wake Workers with LISTEN/NOTIFY</a></p> </li> <li><p><a href="#heading-how-to-elect-a-leader-with-advisory-locks">How to Elect a Leader with Advisory Locks</a></p> </li> <li><p><a href="#heading-how-to-handle-failed-jobs">How to 
Handle Failed Jobs</a></p> </li> <li><p><a href="#heading-how-to-abstract-the-database-driver">How to Abstract the Database Driver</a></p> </li> <li><p><a href="#heading-conclusion">Conclusion</a></p> </li> </ol> <h2 id="heading-prerequisites">Prerequisites</h2> <p>To follow along, you should have:</p> <ul> <li><p>Basic familiarity with Go (structs, interfaces, goroutines)</p> </li> <li><p>A working understanding of PostgreSQL and SQL</p> </li> <li><p>Go installed (1.21 or later)</p> </li> <li><p>A PostgreSQL instance available locally or remotely</p> </li> </ul> <h2 id="heading-what-you-will-learn">What You Will Learn</h2> <ul> <li><p>How to represent and store jobs in PostgreSQL</p> </li> <li><p>How to claim jobs safely across concurrent workers using <code>FOR UPDATE SKIP LOCKED</code></p> </li> <li><p>How to wake workers efficiently using <code>LISTEN/NOTIFY</code></p> </li> <li><p>How to elect a leader across instances using advisory locks</p> </li> <li><p>How Go interfaces, goroutines, contexts, and transactions fit together in a real system</p> </li> </ul> <h2 id="heading-what-is-a-job-queue">What Is a Job Queue?</h2> <p>A job queue is a system that stores work to be done later.</p> <p>Your application adds a job to the queue. A worker takes a job from the queue and runs it.</p> <p>For example, when a user signs up, your application might create the user immediately and then add a job like this:</p> <pre><code class="language-json">{ "kind": "send_welcome_email", "payload": { "to": "user@example.com", "subject": "Welcome!" } } </code></pre> <p>A background worker later picks up that job and sends the email. This keeps the user request fast. 
The signup route doesn't need to wait for the email provider before returning a response.</p> <p>A job queue usually needs to answer a few important questions:</p> <ul> <li><p>Where are jobs stored?</p> </li> <li><p>How do workers find jobs?</p> </li> <li><p>How do you stop two workers from processing the same job?</p> </li> <li><p>How do you retry failed jobs?</p> </li> <li><p>How do you shut workers down safely?</p> </li> <li><p>How do you keep job creation consistent with application data?</p> </li> </ul> <p>Swig answers those questions with Go and PostgreSQL.</p> <h2 id="heading-why-use-postgresql-for-a-queue">Why Use PostgreSQL for a Queue?</h2> <p>Many job queues use Redis, RabbitMQ, SQS, or Kafka. Those are all useful tools. But many applications already depend on PostgreSQL. If your app already has Postgres, you may not want to operate another service just to run background jobs.</p> <p>PostgreSQL gives you several features that are surprisingly useful for queues:</p> <ul> <li><p>Tables for durable job storage</p> </li> <li><p>Transactions for atomic writes</p> </li> <li><p>Row locks for safe concurrent processing</p> </li> <li><p><code>SKIP LOCKED</code> for letting workers claim different jobs</p> </li> <li><p><code>LISTEN/NOTIFY</code> for waking workers when new jobs arrive</p> </li> <li><p>Advisory locks for leader election</p> </li> <li><p>JSONB for flexible job payloads</p> </li> </ul> <p>The tradeoff is important. A PostgreSQL-backed queue isn't trying to replace Kafka for event streaming or RabbitMQ for complex routing. 
It makes common application background jobs simple, reliable, and easy to operate without adding infrastructure.</p> <h2 id="heading-swigs-architecture">Swig's Architecture</h2> <p>At a high level, Swig has five parts:</p> <ol> <li><p>A <code>swig_jobs</code> table in PostgreSQL</p> </li> <li><p>Go workers that process jobs</p> </li> <li><p>A worker registry that maps job names to worker types</p> </li> <li><p>A driver layer that supports both <code>pgx</code> and <code>database/sql</code></p> </li> <li><p>A leader loop for shared maintenance work</p> </li> </ol> <p>The basic flow looks like this:</p> <ol> <li><p>Your app calls <code>AddJob</code></p> </li> <li><p>Swig serializes the job payload to JSON</p> </li> <li><p>Swig inserts a row into <code>swig_jobs</code></p> </li> <li><p>PostgreSQL sends a notification that a job was created</p> </li> <li><p>A Go worker wakes up and tries to claim one pending job</p> </li> <li><p>PostgreSQL row locks ensure only one worker claims that row</p> </li> <li><p>The worker runs the job</p> </li> <li><p>Swig marks the job as completed or failed</p> </li> </ol> <p>The hard parts are concurrency, failure, connection lifecycle, and shutdown. That's where Go and PostgreSQL work together.</p> <h2 id="heading-how-to-represent-jobs-in-postgresql">How to Represent Jobs in PostgreSQL</h2> <p>A simplified version of Swig's job table looks like this:</p> <pre><code class="language-sql">CREATE TABLE swig_jobs ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), kind TEXT NOT NULL, queue TEXT NOT NULL, payload JSONB NOT NULL, status TEXT NOT NULL DEFAULT 'pending', priority INTEGER NOT NULL DEFAULT 0, attempts INTEGER NOT NULL DEFAULT 0, max_attempts INTEGER NOT NULL DEFAULT 3, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), scheduled_for TIMESTAMPTZ NOT NULL DEFAULT NOW(), instance_id UUID, worker_id UUID, locked_at TIMESTAMPTZ, last_error TEXT, last_error_at TIMESTAMPTZ ); </code></pre> <p>Each row is one job. 
The important columns are:</p> <ul> <li><p><code>kind</code>: the type of job, such as <code>send_email</code></p> </li> <li><p><code>payload</code>: the JSON data needed to run the job</p> </li> <li><p><code>status</code>: whether the job is pending, processing, completed, or failed</p> </li> <li><p><code>attempts</code>: how many times the job has been tried</p> </li> <li><p><code>scheduled_for</code>: when the job is allowed to run</p> </li> <li><p><code>locked_at</code>: when the job was claimed</p> </li> </ul> <p>The table is the source of truth. PostgreSQL notifications can wake workers, but notifications aren't the durable queue. The rows in <code>swig_jobs</code> are.</p> <h2 id="heading-how-to-define-a-worker-in-go">How to Define a Worker in Go</h2> <p>In Swig, a worker is a Go type that knows how to process one kind of job.</p> <p>Here's a simple email worker:</p> <pre><code class="language-go">type EmailWorker struct { To string `json:"to"` Subject string `json:"subject"` Body string `json:"body"` } func (w *EmailWorker) JobName() string { return "send_email" } func (w *EmailWorker) Process(ctx context.Context) error { fmt.Printf("Sending email to %s with subject %s\n", w.To, w.Subject) return nil } </code></pre> <p>There are two important methods:</p> <ul> <li><p><code>JobName</code> tells Swig what kind of job this worker handles</p> </li> <li><p><code>Process</code> contains the actual work</p> </li> </ul> <p>The struct fields are also the job arguments. When you enqueue an <code>EmailWorker</code>, Swig serializes the struct into JSON and stores it in PostgreSQL. Later, a worker claims the row, unmarshals the JSON back into a fresh <code>EmailWorker</code>, and calls <code>Process</code>.</p> <h3 id="heading-go-interfaces">Go Interfaces</h3> <p>Go interfaces describe behavior. Swig doesn't need to know the exact concrete type of every worker. 
It only needs to know that a worker can provide a job name and process a job:</p> <pre><code class="language-go">type Worker interface { JobName() string Process(context.Context) error } </code></pre> <p>If a type has those methods, it satisfies the interface with no explicit declaration required. This is one of the reasons interfaces are so useful in Go. They let you design around behavior instead of inheritance.</p> <h2 id="heading-how-to-register-workers-without-sharing-state">How to Register Workers Without Sharing State</h2> <p>Swig has a worker registry that maps a job name to a worker type:</p> <pre><code class="language-go">registry := workers.NewWorkerRegistry() registry.RegisterWorker(&EmailWorker{}) </code></pre> <p>Later, when a job row says <code>kind = 'send_email'</code>, Swig looks up the registered worker and runs it.</p> <p>There's a subtle concurrency issue here. If the registry stored the exact <code>&EmailWorker{}</code> pointer and reused it for every job, multiple goroutines could unmarshal payloads into the same Go value at the same time.</p> <p>Swig avoids this with a factory approach internally. Registration captures the worker type, and each claimed job gets a fresh worker instance before JSON is unmarshaled. The API stays simple, but internally Swig creates a new <code>EmailWorker</code> for each job. 
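</p> <p>The factory idea can be sketched with reflection. This is a minimal illustration under stated assumptions, not Swig's actual internals: the <code>Registry</code> type, its <code>Register</code> and <code>New</code> methods, and the trimmed-down <code>EmailWorker</code> are hypothetical, and the sketch assumes workers are always registered as pointers to structs:</p>

<pre><code class="language-go">package main

import (
	"context"
	"encoding/json"
	"fmt"
	"reflect"
)

// Worker mirrors the interface shown earlier in the article.
type Worker interface {
	JobName() string
	Process(context.Context) error
}

// EmailWorker is a trimmed-down version of the article's example worker.
type EmailWorker struct {
	To      string `json:"to"`
	Subject string `json:"subject"`
}

func (w *EmailWorker) JobName() string { return "send_email" }

func (w *EmailWorker) Process(ctx context.Context) error {
	fmt.Printf("Sending email to %s with subject %s\n", w.To, w.Subject)
	return nil
}

// Registry is a hypothetical registry that remembers each worker's type,
// not the pointer that was passed to Register.
type Registry struct {
	types map[string]reflect.Type
}

func NewRegistry() *Registry {
	return &Registry{types: make(map[string]reflect.Type)}
}

// Register stores the struct type behind the pointer, keyed by job name.
// It assumes w is a pointer to a struct; Elem panics otherwise.
func (r *Registry) Register(w Worker) {
	r.types[w.JobName()] = reflect.TypeOf(w).Elem()
}

// New allocates a fresh worker for one job and unmarshals the payload into
// it, so concurrent jobs never share a Go value.
func (r *Registry) New(kind string, payload []byte) (Worker, error) {
	t, ok := r.types[kind]
	if !ok {
		return nil, fmt.Errorf("no worker registered for %q", kind)
	}
	w, ok := reflect.New(t).Interface().(Worker)
	if !ok {
		return nil, fmt.Errorf("type %s does not implement Worker", t)
	}
	if err := json.Unmarshal(payload, w); err != nil {
		return nil, err
	}
	return w, nil
}

func main() {
	reg := NewRegistry()
	reg.Register(&EmailWorker{})

	a, _ := reg.New("send_email", []byte(`{"to":"a@example.com","subject":"Hi"}`))
	b, _ := reg.New("send_email", []byte(`{"to":"b@example.com","subject":"Hello"}`))

	// a and b are distinct instances, so neither payload overwrites the other.
	_ = a.Process(context.Background())
	_ = b.Process(context.Background())
}
</code></pre>

<p>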
This is a useful Go pattern: keep the public API simple while making the internal lifecycle safer.</p> <h2 id="heading-how-to-add-a-job">How to Add a Job</h2> <p>Here's what adding a job looks like from the user side:</p> <pre><code class="language-go">err := swigClient.AddJob(ctx, &EmailWorker{ To: "user@example.com", Subject: "Welcome!", Body: "Thanks for signing up.", }) </code></pre> <p>Inside Swig, the process is roughly:</p> <pre><code class="language-go">argsJSON, err := json.Marshal(workerWithArgs) if err != nil { return err } _, err = db.ExecContext(ctx, ` INSERT INTO swig_jobs (kind, queue, payload, priority, scheduled_for, status) VALUES ($1, $2, $3, $4, $5, 'pending') `, jobName, queue, argsJSON, priority, runAt) </code></pre> <h3 id="heading-how-to-enqueue-jobs-inside-transactions">How to Enqueue Jobs Inside Transactions</h3> <p>One of the best reasons to use PostgreSQL for jobs is transactional enqueueing.</p> <p>Imagine a user signs up. You want to insert the user and queue a welcome email. If those happen separately, you can get inconsistent states. With a transaction, both succeed or both fail:</p> <pre><code class="language-go">tx, err := pool.Begin(ctx) if err != nil { return err } defer tx.Rollback(ctx) _, err = tx.Exec(ctx, `INSERT INTO users (email) VALUES ($1)`, email) if err != nil { return err } err = swigClient.AddJobWithTx(ctx, tx, &EmailWorker{ To: email, Subject: "Welcome!", Body: "Thanks for joining.", }) if err != nil { return err } return tx.Commit(ctx) </code></pre> <p>If the transaction rolls back, the user isn't created and the job isn't queued. This is much harder to guarantee when your database and queue are separate systems.</p> <h2 id="heading-how-to-handle-multiple-workers-safely">How to Handle Multiple Workers Safely</h2> <p>A queue gets interesting when many workers run at the same time. Imagine three workers all asking PostgreSQL for the next pending job. 
You don't want all three to process the same job.</p> <p>A naïve approach has a race condition. Two workers can select the same job before either one updates it.</p> <h3 id="heading-postgresql-for-update-skip-locked">PostgreSQL FOR UPDATE SKIP LOCKED</h3> <p>PostgreSQL can lock rows selected inside a transaction. <code>FOR UPDATE</code> means "lock this row because I plan to update it." <code>SKIP LOCKED</code> means "if another worker already locked a row, skip it and find another one."</p> <p>This is perfect for a queue:</p> <ul> <li><p>Worker A locks job 1</p> </li> <li><p>Worker B skips job 1 and locks job 2</p> </li> <li><p>Worker C skips jobs 1 and 2 and locks job 3</p> </li> </ul> <p>No central coordinator is needed. Swig uses an atomic update pattern:</p> <pre><code class="language-sql">UPDATE swig_jobs SET status = 'processing', instance_id = $1, worker_id = $2, locked_at = NOW(), attempts = attempts + 1 WHERE id = ( SELECT id FROM swig_jobs WHERE status = 'pending' AND scheduled_for <= NOW() ORDER BY priority DESC, created_at FOR UPDATE SKIP LOCKED LIMIT 1 ) RETURNING id, kind, payload; </code></pre> <p>This query finds a pending job, skips already-locked jobs, marks it as processing, records which worker claimed it, and returns the job data. All of this happens atomically. Workers never do a separate <code>SELECT</code> and hope the later <code>UPDATE</code> is still safe.</p> <h2 id="heading-how-to-use-goroutines-for-concurrent-workers">How to Use Goroutines for Concurrent Workers</h2> <p>Swig starts worker loops as goroutines:</p> <pre><code class="language-go">for i := 0; i < maxWorkers; i++ { go s.startWorker(ctx, queueType) } </code></pre> <p>Each worker runs independently. PostgreSQL coordinates which job each worker gets. 
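</p> <p>This division of labor can be sketched without a database. In the example below, a hypothetical in-memory <code>jobSource</code> stands in for the <code>swig_jobs</code> table, and a mutex plays the role that <code>FOR UPDATE SKIP LOCKED</code> plays in PostgreSQL. The names <code>jobSource</code>, <code>claim</code>, and <code>runWorkers</code> are assumptions for illustration and are not part of Swig:</p>

<pre><code class="language-go">package main

import (
	"context"
	"fmt"
	"sync"
)

// jobSource is a hypothetical in-memory stand-in for the swig_jobs table.
// In Swig the claim is done by PostgreSQL; here a mutex plays that role so
// the example runs without a database.
type jobSource struct {
	mu      sync.Mutex
	pending []int
}

// claim hands out each job ID at most once and reports false when empty.
func (s *jobSource) claim() (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.pending) == 0 {
		return 0, false
	}
	id := s.pending[0]
	s.pending = s.pending[1:]
	return id, true
}

// runWorkers starts n worker goroutines, waits for them to finish, and
// returns the IDs they processed. Workers stop when the context is
// cancelled or when no jobs are left.
func runWorkers(ctx context.Context, src *jobSource, n int) []int {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done []int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				if ctx.Err() != nil {
					return
				}
				id, ok := src.claim()
				if !ok {
					return
				}
				// "Process" the job by recording its ID.
				mu.Lock()
				done = append(done, id)
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return done
}

func main() {
	src := &jobSource{pending: []int{1, 2, 3, 4, 5}}
	done := runWorkers(context.Background(), src, 3)
	fmt.Println("processed", len(done), "jobs")
}
</code></pre>

<p>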
Go handles concurrency with goroutines, while PostgreSQL handles safe job claiming with locks.</p> <h3 id="heading-how-to-handle-graceful-shutdown">How to Handle Graceful Shutdown</h3> <p>When a service shuts down, it should wait for workers to finish cleanly. Go's <code>sync.WaitGroup</code> helps:</p> <pre><code class="language-go">var wg sync.WaitGroup wg.Add(1) go func() { defer wg.Done() processJobs() }() wg.Wait() </code></pre> <p>Swig also uses <code>sync.Once</code> to make shutdown idempotent. Calling <code>Stop</code> more than once shouldn't panic because of a double channel close. Shutdown paths are often where production systems behave differently from happy-path demos.</p> <h2 id="heading-how-to-wake-workers-with-listennotify">How to Wake Workers with LISTEN/NOTIFY</h2> <p>If workers constantly poll the database for jobs, they waste resources when the queue is empty. PostgreSQL has <code>LISTEN/NOTIFY</code> to solve this.</p> <p>A connection can listen on a channel:</p> <pre><code class="language-sql">LISTEN swig_jobs; </code></pre> <p>Another session can send a notification:</p> <pre><code class="language-sql">NOTIFY swig_jobs, '{"id":"job-id"}'; </code></pre> <p>Swig creates a trigger so PostgreSQL sends a notification after a job is inserted. Workers sleep when there's no work and wake when a new job arrives.</p> <p>There's an important PostgreSQL detail here: <code>LISTEN</code> is session-scoped. A worker must wait for notifications on the same database session that executed <code>LISTEN</code>. 
Swig handles this by creating a dedicated listener for each worker that owns one database session throughout its lifecycle.</p> <p>This is a common backend engineering lesson: abstractions like connection pools are useful, but some database features depend on the lifecycle of a specific connection.</p> <h2 id="heading-how-to-elect-a-leader-with-advisory-locks">How to Elect a Leader with Advisory Locks</h2> <p>Some queue maintenance tasks should only run on one instance at a time, including retrying failed jobs, recovering stale jobs, and cleaning old history.</p> <p>Swig uses PostgreSQL advisory locks for this:</p> <pre><code class="language-sql">SELECT pg_try_advisory_lock($1); </code></pre> <p>If the result is true, that Swig instance becomes the leader. Advisory locks are also session-scoped, so Swig uses a dedicated advisory-lock connection for leadership. If that session ends, PostgreSQL releases the lock and another instance can take over. Simple failover without ZooKeeper or etcd.</p> <h2 id="heading-how-to-handle-failed-jobs">How to Handle Failed Jobs</h2> <p>When a worker returns an error, Swig records the error and either retries the job or marks it as failed:</p> <pre><code class="language-sql">UPDATE swig_jobs SET status = CASE WHEN attempts >= max_attempts THEN 'failed' ELSE 'pending' END, last_error = $2, last_error_at = NOW() WHERE id = $1; </code></pre> <h3 id="heading-a-note-on-delivery-semantics">A Note on Delivery Semantics</h3> <p>It's tempting to say a job queue processes jobs exactly once. In distributed systems, that's a dangerous claim.</p> <p>Consider this scenario:</p> <ol> <li><p>A worker sends an email</p> </li> <li><p>The worker crashes before marking the job completed</p> </li> <li><p>The job is retried</p> </li> <li><p>The email might be sent again</p> </li> </ol> <p>The accurate description is that Swig provides atomic claiming and at-least-once processing. Because jobs can be retried, workers should be idempotent. 
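</p> <p>One way to approach this is to record an idempotency key before performing the side effect. The sketch below assumes a hypothetical in-memory <code>sentLog</code> and <code>emailSender</code>, neither of which is part of Swig. A real worker would more likely rely on a durable store, such as a PostgreSQL table with a unique constraint on the key:</p>

<pre><code class="language-go">package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// sentLog is a hypothetical in-memory record of side effects already done.
type sentLog struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newSentLog() *sentLog {
	return &sentLog{seen: make(map[string]bool)}
}

// markOnce reports true only the first time a key is recorded.
func (l *sentLog) markOnce(key string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.seen[key] {
		return false
	}
	l.seen[key] = true
	return true
}

// emailSender counts how many emails were actually sent.
type emailSender struct {
	log   *sentLog
	sends atomic.Int64
}

// Process sends the email unless the key was already handled. Recording the
// key before sending trades a possible duplicate for a possible missed email
// if the process dies in between; which is acceptable depends on the job.
func (s *emailSender) Process(ctx context.Context, key, to string) error {
	if !s.log.markOnce(key) {
		return nil // already handled on an earlier attempt
	}
	s.sends.Add(1)
	fmt.Printf("Sending welcome email to %s\n", to)
	return nil
}

func main() {
	s := &emailSender{log: newSentLog()}
	ctx := context.Background()

	// Simulate the same job being delivered twice after a crash and retry.
	_ = s.Process(ctx, "signup-42", "user@example.com")
	_ = s.Process(ctx, "signup-42", "user@example.com")

	fmt.Println("emails sent:", s.sends.Load())
}
</code></pre>

<p>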
Running the same operation more than once should produce the same result as running it once.</p> <h2 id="heading-how-to-abstract-the-database-driver">How to Abstract the Database Driver</h2> <p>Swig supports both <code>pgx</code> and <code>database/sql</code> through a driver interface:</p> <pre><code class="language-go">type Driver interface { Exec(ctx context.Context, sql string, args ...interface{}) error Query(ctx context.Context, sql string, args ...interface{}) (Rows, error) QueryRow(ctx context.Context, sql string, args ...interface{}) Row WithTx(ctx context.Context, fn func(tx Transaction) error) error NewListener(ctx context.Context, channel string) (Listener, error) TryAdvisoryLock(ctx context.Context, lockID int64) (AdvisoryLock, bool, error) } </code></pre> <p>The core queue code only depends on behavior, not a specific library. This is a common Go design: define the behavior your core package needs, write small adapters for concrete dependencies, and keep the core logic independent.</p> <h2 id="heading-conclusion">Conclusion</h2> <p>A PostgreSQL-backed queue isn't the right answer for every system. If you need massive event streaming, Kafka may be a better fit. If you need complex routing, RabbitMQ may be better.</p> <p>But for many Go applications, PostgreSQL is already there. Swig shows how far you can get with a small Go API and a few PostgreSQL features:</p> <ul> <li><p>Store jobs in a table</p> </li> <li><p>Claim jobs atomically with <code>FOR UPDATE SKIP LOCKED</code></p> </li> <li><p>Wake workers with dedicated <code>LISTEN/NOTIFY</code> sessions</p> </li> <li><p>Coordinate leadership with advisory locks</p> </li> <li><p>Keep app data and jobs consistent with transactions</p> </li> <li><p>Manage worker lifecycles with goroutines and contexts</p> </li> </ul> <p>That combination makes a solid foundation for background processing and a great project for learning how Go and PostgreSQL work together in production systems. 
You can explore the full source code at <a href="https://github.com/glamboyosa/swig">github.com/glamboyosa/swig</a>.</p> </article> <article> <h1> Open Source Tools Every STEM Student Should Know About </h1> <p>Manish Shivanandhan — Tue, 09 Jun 2026 06:14:32 +0000</p> <p>Technology has changed the way students learn science, mathematics, engineering, and computer science.</p> <p>A decade ago, most STEM students depended on textbooks, calculators, and expensive licensed software. Today, open source tools have made advanced learning resources available to anyone with an internet connection.</p> <p>Many of these tools are powerful enough for professional researchers and software engineers, yet simple enough for students who are just getting started. They help with coding, data analysis, mathematics, technical writing, visualization, collaboration, and project management.</p> <p>In this article, we'll look at seven open source tools that can help STEM students study more effectively, build projects faster, and develop industry-ready technical skills.</p> <h3 id="heading-what-well-cover">What We'll Cover:</h3> <ul> <li><p><a href="#heading-why-open-source-tools-matter-for-stem-students">Why Open Source Tools Matter for STEM Students</a></p> </li> <li><p><a href="#heading-jupyter-notebook-for-interactive-learning">Jupyter Notebook for Interactive Learning</a></p> </li> <li><p><a href="#heading-vs-code-for-programming-and-technical-projects">VS Code for Programming and Technical Projects</a></p> </li> <li><p><a href="#heading-geogebra-for-mathematics-visualization">GeoGebra for Mathematics Visualization</a></p> </li> <li><p><a href="#heading-git-and-github-for-collaboration">Git and GitHub for Collaboration</a></p> </li> <li><p><a href="#heading-blender-for-scientific-and-engineering-visualization">Blender for Scientific and Engineering Visualization</a></p> </li> <li><p><a href="#heading-obs-studio-for-recording-and-presentations">OBS Studio for Recording and 
Presentations</a></p> </li> <li><p><a href="#heading-how-open-source-tools-build-career-skills">How Open Source Tools Build Career Skills</a></p> </li> <li><p><a href="#heading-the-future-of-stem-education">The Future of STEM Education</a></p> </li> <li><p><a href="#heading-final-thoughts">Final Thoughts</a></p> </li> </ul> <h2 id="heading-why-open-source-tools-matter-for-stem-students"><strong>Why Open Source Tools Matter for STEM Students</strong></h2> <p>Open source software is more than just free software. It gives students access to the underlying code, community support, and the freedom to experiment without restrictions.</p> <p>This matters because STEM education is becoming increasingly hands-on. Employers expect students to understand practical workflows, not just theory. Learning how to use modern tools early can make the transition into internships and engineering roles much easier.</p> <p>Open source ecosystems also evolve quickly. Students can explore real-world technologies used in research labs, startups, and large engineering organizations. Many of these environments also rely on <a href="https://www.pulseofstrategy.com/best-n8n-alternatives/">open-source automation</a> tools to simplify development workflows and improve collaboration across technical teams.</p> <h2 id="heading-jupyter-notebook-for-interactive-learning"><strong>Jupyter Notebook for Interactive Learning</strong></h2> <p>One of the most important tools for STEM students is <a href="https://jupyter.org/">Jupyter Notebook</a>.</p> <p>Jupyter Notebook allows users to combine code, mathematical equations, visualizations, and notes inside a single interactive document. This makes it extremely useful for subjects like data science, physics, statistics, and machine learning.</p> <p>A student can write Python code, run calculations, and immediately visualize the output using graphs or tables. 
Instead of switching between multiple applications, everything exists in one place.</p> <p>For example, a physics student can simulate motion equations, while a statistics student can analyze datasets directly inside the notebook.</p> <p>Jupyter is widely used in universities and research institutions because it supports experimentation and iterative learning.</p> <h2 id="heading-vs-code-for-programming-and-technical-projects"><strong>VS Code for Programming and Technical Projects</strong></h2> <p><a href="https://code.visualstudio.com/">Visual Studio Code</a> has become one of the most popular development environments in the world. Although it is developed by Microsoft, it's built on open source technologies and supports a massive extension ecosystem.</p> <p>For STEM students, VS Code is valuable because it supports nearly every major programming language. Whether you're learning Python, JavaScript, C++, or Rust, the editor provides debugging, syntax highlighting, terminal integration, and Git support in one interface.</p> <p>Engineering students often work across multiple disciplines. A robotics student might write Python scripts, configure embedded systems, and document experiments all in the same environment.</p> <p>VS Code also integrates well with Jupyter Notebook, making it an excellent all-in-one workspace for technical learning.</p> <h2 id="heading-geogebra-for-mathematics-visualization"><strong>GeoGebra for Mathematics Visualization</strong></h2> <p>Mathematics becomes easier when students can visualize concepts instead of memorizing formulas.</p> <p><a href="https://www.geogebra.org/">GeoGebra</a> is an open source mathematics platform that helps students explore algebra, geometry, calculus, and statistics through interactive graphs and simulations.</p> <p>Students can manipulate equations dynamically and observe how graphs change in real time. 
This creates a much deeper understanding of mathematical relationships.</p> <p>Interactive visualization tools are especially useful for students preparing for advanced mathematics courses. Teaching platforms like <a href="https://brighterly.com/">Brighterly</a>, which is known for precalculus tutoring, use graphing tools like GeoGebra to help students understand trigonometric functions, transformations, and polynomial behavior. GeoGebra is also useful for individual teachers who want to create interactive lessons instead of relying entirely on static diagrams.</p> <h2 id="heading-git-and-github-for-collaboration"><strong>Git and GitHub for Collaboration</strong></h2> <p>Version control is one of the most important technical skills students can learn.</p> <p><a href="https://git-scm.com/">Git</a> is an open source version control system that helps developers track changes in code and collaborate efficiently. It is widely used across software engineering, data science, and research projects.</p> <p>Students often lose work because they overwrite files or create confusing project versions. Git solves this problem by maintaining a complete history of changes.</p> <p>When Git is paired with <a href="https://github.com/">GitHub</a>, students can collaborate on projects, contribute to open source repositories, and build a public portfolio of technical work.</p> <p>This is especially valuable for computer science students applying for internships or engineering roles. Recruiters frequently review GitHub profiles to evaluate coding ability and project experience.</p> <p>Even students outside traditional software engineering fields benefit from Git. 
Researchers use it for reproducible experiments, while engineering teams use it to manage technical documentation and simulation code.</p> <h2 id="heading-blender-for-scientific-and-engineering-visualization"><strong>Blender for Scientific and Engineering Visualization</strong></h2> <p>Most people associate Blender with animation and game design, but it's also a powerful tool for STEM applications.</p> <p><a href="https://www.blender.org/">Blender</a> is an open source 3D modeling and rendering platform used in industries ranging from architecture to scientific visualization.</p> <p>Engineering students can use Blender to create product prototypes, mechanical visualizations, and simulation renders. Biology students can build anatomical models, while physics students can visualize complex systems in three dimensions.</p> <p>Visualization plays a major role in technical understanding. A well-designed 3D model can explain concepts that are difficult to communicate through text alone.</p> <p>Blender also teaches valuable spatial reasoning and design skills that are increasingly useful in fields like robotics, manufacturing, and augmented reality.</p> <h2 id="heading-obs-studio-for-recording-and-presentations"><strong>OBS Studio for Recording and Presentations</strong></h2> <p>Modern STEM learning is becoming more collaborative and content-driven.</p> <p>Students now create tutorials, record presentations, explain coding projects, and participate in online learning communities. 
<a href="https://obsproject.com/">OBS Studio</a> is an open source tool that allows users to record screens, stream presentations, and create technical demonstrations.</p> <p>This is particularly useful for students building portfolios or preparing project walkthroughs.</p> <p>For example, a software engineering student can record a demo of a web application, while a mathematics student can create video explanations of problem-solving methods.</p> <p>OBS Studio is lightweight, flexible, and widely used by educators, developers, and technical creators.</p> <h2 id="heading-how-open-source-tools-build-career-skills"><strong>How Open Source Tools Build Career Skills</strong></h2> <p>One of the biggest advantages of open source tools is that they mirror real industry workflows.</p> <p>Students aren't just learning academic concepts. They're learning systems used in professional engineering environments.</p> <p>A student who understands Git, VS Code, Jupyter, and collaborative development practices already has exposure to modern software engineering workflows. Similarly, students using Blender or GeoGebra are developing visualization and analytical skills that transfer into technical careers.</p> <p>Open source communities also encourage experimentation. Students can inspect source code, contribute fixes, participate in discussions, and learn directly from experienced developers around the world.</p> <p>This creates a more active learning process than simply consuming tutorials.</p> <h2 id="heading-the-future-of-stem-education"><strong>The Future of STEM Education</strong></h2> <p>STEM education is shifting toward project-based and interdisciplinary learning.</p> <p>Students are expected to solve problems, communicate ideas clearly, and adapt to rapidly evolving technologies. 
Open source tools make this possible by lowering financial barriers and giving students access to professional-grade software.</p> <p>The rise of artificial intelligence, data science, and remote collaboration has also increased the importance of technical self-learning. Students who can independently explore tools and build projects will have a significant advantage in both academics and industry.</p> <p>The good news is that modern open source ecosystems make this easier than ever before. A student with a laptop and internet connection can now access tools that were once available only to large universities or research organizations.</p> <h2 id="heading-final-thoughts"><strong>Final Thoughts</strong></h2> <p>The best STEM students aren't always the ones with the most expensive hardware or software. Often, they're the ones who learn how to use accessible tools creatively and consistently.</p> <p>Platforms like Jupyter Notebook, VS Code, GeoGebra, LibreOffice, Git, Blender, and OBS Studio provide a strong foundation for technical learning across many disciplines.</p> <p>More importantly, these tools encourage curiosity, experimentation, and practical problem-solving. Those skills matter far beyond the classroom.</p> <p>As STEM education continues to evolve, students who embrace open source technology will be better prepared for research, engineering, software development, and the increasingly interdisciplinary future of technical work.</p> </article> <article> <h1> Database Version Control with Liquibase and Spring Boot </h1> <p>Ashutosh Krishna — Tue, 09 Jun 2026 02:29:12 +0000</p> <p>Picture this familiar scenario: you're working on a new feature that requires a new database column. You open your local database client, write an <code>ALTER TABLE</code> statement, and execute it. Your code works perfectly. 
You commit the Java code, push it to the repository, and go grab a coffee.</p> <p>A few hours later, a teammate pulls your branch, runs the application, and everything crashes.</p> <p>"Hey," they ask across the room (or in a Slack channel), "did you change the database?"</p> <p>You quickly realize you forgot to share the SQL script. You paste it into the chat. They run it. Everything works. Then, a week later, the deployment to the staging environment fails for the exact same reason. By the time this code reaches production, everyone is asking a variation of the same terrified question: "Which SQL script should I run?"</p> <p>This situation is called schema drift. It happens when the state of your database diverges across different environments. Staging has one schema, production has another, and every developer's local machine is a unique snowflake of untested database modifications.</p> <p>Managing database changes manually is a recipe for deployment headaches and team collaboration challenges. Application code is stateless and easy to replace. Databases are stateful: they have surprisingly good memories, and they rarely forget a bad migration.</p> <p>Liquibase solves this problem by bringing version-control discipline to your database changes. Instead of passing around SQL files and hoping people remember to run them, you define your database changes in code. These changes travel with your application repository and execute automatically.</p> <p>To see how this architecture works at a high level, think about the journey of a single database change. A developer commits their database migration alongside their Java code into Git. When the CI/CD pipeline (or a teammate) pulls that code, the Spring Boot application starts. But before the app fully boots up and accepts web traffic, Liquibase intercepts the process. It acts as a gatekeeper, connecting to the database and applying the required schema changes. 
This ensures the database exactly matches the code's expectations before a single user makes a request.</p> <h2 id="heading-why-database-version-control-matters">Why Database Version Control Matters</h2> <p>If you've spent any time working on team-based applications, you've probably seen a folder structure that looks exactly like this:</p> <pre><code class="language-plaintext">project-sql-scripts/
├── create_employee_table.sql
├── create_employee_table_final.sql
├── create_employee_table_final_v2.sql
├── add_email_column.sql
├── latest.sql
└── definitely_latest_use_this_one.sql
</code></pre> <p>The phrase "just run this SQL script manually" has launched many memorable incidents.</p> <p>When you rely on manual database updates, you guarantee failure at scale. Onboarding a new developer becomes an archeological expedition to figure out how to build the local schema. Deployments become stressful events requiring a checklist of manual queries that must be run in a highly specific order.</p> <p>Version-controlled database changes treat your schema as code. When your database changes live alongside your application logic, you gain several immediate benefits:</p> <ul> <li><p><strong>Consistency:</strong> Every environment (local, staging, production) applies the exact same changes in the exact same order.</p> </li> <li><p><strong>Safety:</strong> You eliminate the human error of skipping a script or running an outdated query.</p> </li> <li><p><strong>Visibility:</strong> You can look at a Git commit and see exactly how the Java code and the database schema changed together to support a new feature.</p> </li> </ul> <p>Git solved version control for code. 
Liquibase helps prevent databases from becoming the rebellious sibling.</p> <h2 id="heading-what-is-liquibase">What is Liquibase?</h2> <p>At its core, Liquibase is a database migration tool that tracks and applies schema changes in a predictable and repeatable way.</p> <p>Instead of writing loose SQL scripts, you write "migrations" (also called changeSets). Liquibase reads these files, compares them against a tracking table inside your actual database, and figures out exactly what needs to be executed to bring the database up to date.</p> <p>To use Liquibase effectively, you only need to understand a few conceptual terms:</p> <ul> <li><p><strong>changeLog:</strong> The master file. This is essentially a list that tells Liquibase which migration files to execute and in what order.</p> </li> <li><p><strong>changeSet:</strong> A single, atomic change to your database. Creating a table is one changeSet. Adding a column is another.</p> </li> <li><p><strong>Migration History:</strong> A table Liquibase automatically creates in your database (called <code>DATABASECHANGELOG</code>) to remember which changeSets have already been executed.</p> </li> <li><p><strong>Checksums:</strong> A unique hash generated for every changeSet. Liquibase uses this to detect if someone secretly modified a file after it was already executed.</p> </li> </ul> <p>When you integrate Liquibase with Spring Boot, the migration process happens completely automatically during the application startup phase.</p> <p>During startup, Liquibase takes control before your web server is allowed to receive HTTP traffic. It reaches into the database and checks the tracking table to see which migrations have already run. If it finds new migrations in your local files, it locks the database to prevent concurrent updates, executes the changes, records the new history, and finally releases the lock. 
Only after this entire process completes does Spring Boot finish booting up.</p> <p>Because Liquibase runs before Spring Boot fully initializes the web server, your application will never serve traffic with an outdated database schema. If a migration fails, the application fails to start, protecting your system from entering a broken state.</p> <h2 id="heading-project-setup">Project Setup</h2> <p>Now that you understand the theory, let's build something real. We're going to build the database layer for an Employee Management API.</p> <p>For this project we'll use:</p> <ul> <li><p>Java 17+</p> </li> <li><p>Spring Boot 3.x</p> </li> <li><p>Maven</p> </li> <li><p>Liquibase</p> </li> <li><p>H2 Database</p> </li> </ul> <p>We're using H2 because it's an embedded database that requires zero installation. You can run this project immediately without configuring Docker containers or installing database servers. But everything you learn here applies exactly the same way to PostgreSQL, MySQL, SQL Server, or Oracle.</p> <p>If you're generating this project via <a href="https://start.spring.io/">Spring Initializr</a>, select the following dependencies: Spring Web, Spring Data JPA, Liquibase Migration, and H2 Database.</p> <p>In your <code>pom.xml</code>, you'll see the critical dependencies that make this work:</p> <pre><code class="language-xml"><dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <dependency>
        <groupId>com.h2database</groupId>
        <artifactId>h2</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.liquibase</groupId>
        <artifactId>liquibase-core</artifactId>
    </dependency>
</dependencies>
</code></pre> <p>Next, configure Spring Boot to talk to H2 and find your Liquibase files. 
Open your <code>src/main/resources/application.properties</code> file and add the following:</p> <pre><code class="language-plaintext"># H2 Database Configuration
spring.datasource.url=jdbc:h2:file:./data/employeedb;DB_CLOSE_DELAY=-1
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=

# Enable H2 Console to inspect the database in your browser
spring.h2.console.enabled=true
spring.h2.console.path=/h2-console

# Liquibase Configuration
spring.liquibase.change-log=classpath:db/changelog/db.changelog-master.xml
</code></pre> <p>That last line is the most important. It tells Spring Boot exactly where to find the "master list" of your database changes.</p> <p>Note: We're using a file-based H2 database instead of an in-memory database. The problem with an in-memory database is that it completely wipes itself clean every time you restart Spring Boot.</p> <p>While Liquibase will happily rebuild the schema from scratch on every boot, a <strong>file-based</strong> database is much better for this tutorial (and for real-world local development). With a file-based database, your data, and more importantly, your Liquibase history, will actually persist between application restarts.</p> <h2 id="heading-understanding-core-liquibase-concepts">Understanding Core Liquibase Concepts</h2> <p>Before we write our first table, we need to understand how Liquibase organizes files. Liquibase uses a hierarchical structure.</p> <p>Think of it like a book. The <code>changeLog</code> is the table of contents, and the <code>changeSets</code> are the actual chapters.</p> <ol> <li><p><strong>The Master ChangeLog:</strong> This is the entry point. It rarely contains actual database changes. 
Instead, its only job is to include other files in a specific order.</p> </li> <li><p><strong>Child ChangeLogs:</strong> These group related changes together.</p> </li> <li><p><strong>ChangeSets:</strong> These are the actual, atomic database commands (like creating a table or adding a column).</p> </li> </ol> <p>Here's how this hierarchy plays out in a real Spring Boot project. You maintain a single master file that acts as a table of contents. This master file rarely holds actual SQL commands. Instead, it explicitly includes child XML files in a strict execution order. Each of those child files (like <code>01-create-employees.xml</code>) contains one or more individual database commands, which Liquibase calls changeSets.</p> <p>A <code>changeSet</code> is uniquely identified by three things:</p> <ul> <li><p><strong>id:</strong> A unique string (often a number or a Jira ticket ID).</p> </li> <li><p><strong>author:</strong> The person who wrote the migration.</p> </li> <li><p><strong>file path:</strong> Where the file is located.</p> </li> </ul> <p>When Liquibase runs, it looks at a <code>changeSet</code>, calculates a hash of its contents (a checksum), and records the id, author, file path, and checksum in the database. If it sees that exact combination of id, author, and file path in the database again on the next startup, it skips it.</p> <h2 id="heading-create-the-initial-employee-schema-version-1">Create the Initial Employee Schema (Version 1)</h2> <p>Let's write our first version. 
We need a table to store employees.</p> <p>First, create the master file at <code>src/main/resources/db/changelog/db.changelog-master.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">

    <include file="db/changelog/changes/01-create-employees.xml"/>

</databaseChangeLog>
</code></pre> <p>Next, create the actual migration file at <code>src/main/resources/db/changelog/changes/01-create-employees.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">

    <changeSet id="1" author="ashutoshkrris">
        <createTable tableName="employees">
            <column name="id" type="BIGINT" autoIncrement="true">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="first_name" type="VARCHAR(50)">
                <constraints nullable="false"/>
            </column>
            <column name="last_name" type="VARCHAR(50)">
                <constraints nullable="false"/>
            </column>
        </createTable>
    </changeSet>

</databaseChangeLog>
</code></pre> <p>Let's look at what we just did. We defined a <code>changeSet</code> with an <code>id</code> of "1" and an <code>author</code> of "ashutoshkrris". Inside, we used Liquibase's XML syntax to define a table.</p> <p>Why use XML instead of plain SQL? Because Liquibase is database-agnostic. The same XML generates the appropriate auto-increment syntax for each target, such as <code>SERIAL</code> or an identity column on PostgreSQL, <code>AUTO_INCREMENT</code> on MySQL, and an identity column on Oracle. 
You define the structure, and Liquibase translates it to the specific database dialect.</p> <p>Now, run your Spring Boot application and watch your terminal output for the Liquibase log lines.</p> <p>Liquibase realized the database was empty. It automatically created its tracking table (<code>DATABASECHANGELOG</code>), read our <code>changeSet</code>, executed the table creation, and recorded the event.</p> <p>If you restart the application right now, Liquibase will run again. But this time, it'll check the <code>DATABASECHANGELOG</code> table, see that the changeSet with <code>id="1"</code> and <code>author="ashutoshkrris"</code> has already been executed, and silently skip it. Your database is now safely version-controlled.</p> <h2 id="heading-what-just-happened">What Just Happened?</h2> <p>Up to this point, Liquibase might feel a bit like magic. You dropped an XML file into a folder, started Spring Boot, and your database schema transformed.</p> <p>But understanding how Liquibase actually works under the hood is critical. If you understand the startup sequence, you'll know exactly how to debug deployments when things eventually go wrong.</p> <p>When your Spring Boot application starts, it doesn't immediately begin accepting web requests. First, it initializes its internal components. When it creates the Liquibase component, the migration process begins.</p> <p>Let's trace exactly what happens during that startup phase. When Spring Boot initializes Liquibase, the very first thing the tool does is query the lock table to ensure no other application instance is currently migrating the database. If the coast is clear, it claims the lock. It then calculates checksums for the changeSets in your local XML files, compares them against the database history, executes any missing changes, and logs them. 
Finally, it releases the lock so the Tomcat web server can safely start.</p> <p>This sequence guarantees that your application will never serve a user request before the database schema is completely ready to handle it.</p> <h2 id="heading-inspecting-the-database-liquibase-metadata-tables">Inspecting the Database: Liquibase Metadata Tables</h2> <p>Let's look at what this history and locking actually looks like inside the database itself. Since we configured the H2 Console earlier, we can inspect the raw tables.</p> <p>While your Spring Boot application is running, open your browser and navigate to <code>http://localhost:8080/h2-console</code>. Connect using the JDBC URL <code>jdbc:h2:file:./data/employeedb</code> with the username <code>sa</code> and a blank password.</p> <p>Inside, you'll see your <code>employees</code> table. You'll also see two extra tables created automatically by Liquibase: <code>DATABASECHANGELOG</code> and <code>DATABASECHANGELOGLOCK</code>.</p> <h3 id="heading-the-databasechangelog-table">The <code>DATABASECHANGELOG</code> Table</h3> <p>This table is the brain of your migration strategy. 
It acts as the permanent ledger of every database change ever applied to this environment.</p> <p>If you run <code>SELECT * FROM DATABASECHANGELOG;</code>, you'll see output that looks like this:</p> <table> <thead> <tr> <th>ID</th> <th>AUTHOR</th> <th>FILENAME</th> <th>DATEEXECUTED</th> <th>ORDEREXECUTED</th> <th>EXECTYPE</th> <th>MD5SUM</th> <th>DESCRIPTION</th> <th>COMMENTS</th> <th>TAG</th> <th>LIQUIBASE</th> <th>CONTEXTS</th> <th>LABELS</th> <th>DEPLOYMENT_ID</th> </tr> </thead> <tbody><tr> <td>1</td> <td>ashutoshkrris</td> <td>db/changelog/changes/01-create-employees.xml</td> <td>2026-05-30 13:11:35.937919</td> <td>1</td> <td>EXECUTED</td> <td>9:66e7dcffb2b1902a4e9f01670cb5f192</td> <td>createTable tableName=employees</td> <td></td> <td><em>null</em></td> <td>4.31.1</td> <td><em>null</em></td> <td><em>null</em></td> <td>0126894849</td> </tr> </tbody></table> <p>Let's break down the most important columns:</p> <ul> <li><p><strong>ID, AUTHOR, FILENAME:</strong> These three columns form a composite key. Together, they uniquely identify a single migration.</p> </li> <li><p><strong>DATEEXECUTED & ORDEREXECUTED:</strong> These tell you exactly when a changeSet ran and in what sequence.</p> </li> <li><p><strong>MD5SUM:</strong> This is the checksum of the changeSet. When Liquibase starts, it recomputes the checksum from your local changelog file and compares it to this column. If you edit a changeSet after it's been executed, the checksums won't match, and Liquibase will abort the startup to protect your database.</p> </li> <li><p><strong>EXECTYPE:</strong> Most of the time, this simply says <code>EXECUTED</code>. But it provides a crucial audit trail: if you use Liquibase commands to intentionally skip a migration but record it as finished, you'll see <code>MARK_RAN</code>. If a migration was skipped because its preconditions failed, you'll see <code>SKIPPED</code>.</p> </li> <li><p><strong>TAG:</strong> Think of this as a Git tag for your database schema. 
Before a major, high-risk deployment, you can configure Liquibase to "tag" the current state of the database (for example, <code>v1.4.0</code>). If the deployment fails catastrophically, you can trigger a rollback command telling Liquibase to undo every change applied after the <code>v1.4.0</code> tag.</p> </li> <li><p><strong>CONTEXTS:</strong> This is how you manage environment-specific changes. By adding a context attribute to your changeSet (for example, <code><changeSet id="7" author="ashutoshkrris" context="dev, qa"></code>), that migration will only execute if Spring Boot passes "dev" or "qa" to Liquibase on startup. Production will safely ignore it.</p> </li> <li><p><strong>LABELS:</strong> While Contexts target environments, Labels target categories of work. You can label a changeSet with a Jira ticket number (<code>issue-842</code>) or a release train (<code>Q3-release</code>). This allows advanced teams to selectively execute or roll back specific subsets of features without affecting the rest of the database.</p> </li> </ul> <h3 id="heading-the-databasechangeloglock-table">The <code>DATABASECHANGELOGLOCK</code> Table</h3> <p>This table is tiny, but it plays a massive role in modern deployments.</p> <p>If you run <code>SELECT * FROM DATABASECHANGELOGLOCK;</code>, you'll see a single row:</p> <table> <thead> <tr> <th>ID</th> <th>LOCKED</th> <th>LOCKGRANTED</th> <th>LOCKEDBY</th> </tr> </thead> <tbody><tr> <td>1</td> <td>FALSE</td> <td><em>null</em></td> <td><em>null</em></td> </tr> </tbody></table> <p>Imagine you're deploying your Spring Boot application to a Kubernetes cluster. You tell Kubernetes to spin up three identical instances simultaneously. All three instances connect to the exact same database.</p> <p>If all three instances try to run the <code>CREATE TABLE</code> migration at the exact same millisecond, your database will throw concurrency errors. The lock table prevents this. 
The very first instance to reach the database sets <code>LOCKED</code> to <code>TRUE</code>. The other two instances check the table, see the lock, and politely wait.</p> <p><strong>Practical Troubleshooting Tip:</strong> Sometimes, a deployment fails catastrophically mid-migration (perhaps the server lost power). When this happens, Liquibase might die before it can set <code>LOCKED</code> back to <code>FALSE</code>.</p> <p>The next time you start the application, the logs will hang indefinitely, repeating: <code>Waiting for changelog lock....</code></p> <p>If you're absolutely certain no other applications are currently running migrations, you can manually fix this by running a simple SQL command in your database client:</p> <pre><code class="language-sql">UPDATE DATABASECHANGELOGLOCK SET LOCKED = FALSE; </code></pre> <p>This forces the lock open, allowing your application to resume.</p> <h2 id="heading-evolving-the-employee-api">Evolving the Employee API</h2> <p>Software is never finished. Two weeks after your successful Version 1 deployment, the business team comes back with new requirements.</p> <p>Because you now understand how Liquibase tracks history, evolving the database is simple. You just append new files to your master list.</p> <h3 id="heading-version-2-adding-an-email-field">Version 2: Adding an Email Field</h3> <p>The HR team needs to contact employees. 
You need an email column.</p> <p>Create a new file at <code>src/main/resources/db/changelog/changes/02-add-employee-email.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">

    <changeSet id="2" author="ashutoshkrris">
        <addColumn tableName="employees">
            <column name="email" type="VARCHAR(100)">
                <constraints nullable="false" unique="true"/>
            </column>
        </addColumn>
    </changeSet>

</databaseChangeLog>
</code></pre> <p>Add this to your <code>db.changelog-master.xml</code> file immediately below your first include:</p> <pre><code class="language-xml"><include file="db/changelog/changes/02-add-employee-email.xml"/>
</code></pre> <p>When you restart the application, Liquibase checks the <code>DATABASECHANGELOG</code> table. It sees that <code>id="1"</code> is already there, so it skips it. It sees <code>id="2"</code> is missing, so it executes it and adds a new row to the tracking table.</p> <h3 id="heading-version-3-adding-departments-support">Version 3: Adding Departments Support</h3> <p>The company is growing. Employees now belong to departments. 
You need a <code>departments</code> table and a foreign key constraint linking the two.</p> <p>Create <code>03-add-departments.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">

    <changeSet id="3" author="ashutoshkrris">
        <createTable tableName="departments">
            <column name="id" type="BIGINT" autoIncrement="true">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="name" type="VARCHAR(50)">
                <constraints nullable="false" unique="true"/>
            </column>
        </createTable>
    </changeSet>

    <changeSet id="4" author="ashutoshkrris">
        <addColumn tableName="employees">
            <column name="department_id" type="BIGINT"/>
        </addColumn>
        <addForeignKeyConstraint
            baseTableName="employees"
            baseColumnNames="department_id"
            constraintName="fk_employee_department"
            referencedTableName="departments"
            referencedColumnNames="id"/>
    </changeSet>

</databaseChangeLog>
</code></pre> <p>Notice that we used two separate changeSets in one file. This is a best practice. Each changeSet represents one logical operation. 
If the foreign key creation (id="4") fails, the department table creation (id="3") will still be recorded as successful, and only id="4" will be rolled back (fully so only on databases with transactional DDL, such as PostgreSQL).</p> <h3 id="heading-version-4-amp-5-employee-status-and-performance-indexes">Version 4 & 5: Employee Status and Performance Indexes</h3> <p>Finally, HR wants to track active versus inactive staff, and the database team noticed that searching by last name is getting slow.</p> <p>Create <code>04-status-and-indexes.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd">

    <!-- Column name, type, and default value are illustrative assumptions -->
    <changeSet id="5" author="ashutoshkrris">
        <addColumn tableName="employees">
            <column name="status" type="VARCHAR(20)" defaultValue="ACTIVE">
                <constraints nullable="false"/>
            </column>
        </addColumn>
    </changeSet>

    <!-- Index name is an illustrative assumption -->
    <changeSet id="6" author="ashutoshkrris">
        <createIndex indexName="idx_employees_last_name" tableName="employees">
            <column name="last_name"/>
        </createIndex>
    </changeSet>

</databaseChangeLog>
</code></pre> <p>Here, changeSet <code>id="5"</code> adds a <code>status</code> column that defaults to <code>ACTIVE</code>, and changeSet <code>id="6"</code> indexes <code>last_name</code>. The exact column type, default value, and index name are illustrative choices.</p> <p>Remember to add all new files to your <code>db.changelog-master.xml</code>. The order of your include statements is the exact order Liquibase will execute them.</p> <h2 id="heading-the-golden-rule-never-modify-executed-changesets">The Golden Rule: Never Modify Executed ChangeSets</h2> <p>Eventually, a developer on your team will look at your <code>01-create-employees.xml</code> file and notice a mistake. 
Perhaps they spot a typo in a column name, or perhaps they realize a column is missing a strict non-null constraint.</p> <p>Their instinct, based on years of writing standard Java code, will be to open that XML file, fix the mistake, save the file, and restart the application.</p> <p>Let's actually do this and see what happens.</p> <p>Open your <code>src/main/resources/db/changelog/changes/01-create-employees.xml</code> file. Change the <code>first_name</code> column to <code>given_name</code>:</p> <pre><code class="language-xml"><column name="given_name" type="VARCHAR(50)">
    <constraints nullable="false"/>
</column>
</code></pre> <p>Save the file and restart your Spring Boot application.</p> <p>Instead of a smooth startup, your application will instantly crash, and your terminal will vomit a massive stack trace. Look closely at the error logs. You should see a message like this:</p> <pre><code class="language-shell">Caused by: liquibase.exception.ValidationFailedException: Validation Failed:
     1 changesets check sum
          db/changelog/changes/01-create-employees.xml::1::ashutoshkrris was: 9:66e7dcffb2b1902a4e9f01670cb5f192 but is now: 9:2bd3ef21343d3b5c9448cc50bc35deef
</code></pre> <p>Here's why this happens. Once a changeSet runs against an environment, it becomes immutable history. You can't change the past.</p> <p>When Liquibase starts up, it calculates an MD5 checksum for each changeSet in your local XML files. It then queries the <code>DATABASECHANGELOG</code> table and compares each freshly calculated checksum against the one that was recorded when the changeSet originally executed.</p> <p>If you change the definition of a changeSet that has already been executed, even by a single character in a column name, the checksum changes. Liquibase detects the tampering and refuses to start. It does this to protect your data. 
If your XML code says a column is named <code>first_name</code> but the database was originally built using <code>fist_name</code>, your Spring Data JPA repositories are going to fail anyway.</p> <h3 id="heading-how-to-fix-it-the-right-way">How to Fix It (The Right Way)</h3> <p>If you made this mistake locally, you might be tempted to go into your database, delete the row from the <code>DATABASECHANGELOG</code> table, and try again. Don't do this. Once the same edit reaches staging or production, hand-deleting tracking rows on those servers isn't a realistic option.</p> <p>The correct way to fix a schema mistake is to <strong>roll forward</strong>.</p> <p>First, undo your change in <code>01-create-employees.xml</code> so the hash matches the database again. Then, write a brand new changeSet to apply the fix:</p> <pre><code class="language-xml"><changeSet id="7" author="ashutoshkrris">
    <renameColumn tableName="employees"
                  oldColumnName="first_name"
                  newColumnName="given_name"
                  columnDataType="VARCHAR(50)"/>
</changeSet>
</code></pre> <p>Save it in a new changelog file, include that file in your master changelog, restart the application, and the database will safely evolve to the correct state.</p> <h2 id="heading-working-with-seed-data">Working with Seed Data</h2> <p>Sometimes, a schema change requires initial data to be useful.</p> <p>For example, in Version 3, we created a <code>departments</code> table. Right now, that table is completely empty. 
When a new developer clones the repository and spins up the project locally, they have to manually write SQL <code>INSERT</code> statements just to test the API.</p> <p>We can automate this by making baseline data insertion part of our migration strategy.</p> <p>Create a new file at <code>src/main/resources/db/changelog/changes/05-seed-departments.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?> <databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd"> <changeSet id="8" author="ashutoshkrris"> <insert tableName="departments"> <column name="name" value="Engineering"/> </insert> <insert tableName="departments"> <column name="name" value="Human Resources"/> </insert> <insert tableName="departments"> <column name="name" value="Finance"/> </insert> </changeSet> </databaseChangeLog> </code></pre> <p>Add the include statement to your <code>db.changelog-master.xml</code> file. When you restart the application, Liquibase will insert these rows. Your API is now instantly usable out of the box.</p> <h3 id="heading-the-danger-of-data-migrations">The Danger of Data Migrations</h3> <p>While seeding data is powerful, it requires discipline. Here is a practical engineering rule of thumb:</p> <p><strong>Do use Liquibase for:</strong></p> <ul> <li><p>Static lookup tables (status codes, country lists, default departments).</p> </li> <li><p>System configuration flags required for the application to boot.</p> </li> </ul> <p><strong>Do NOT use Liquibase for:</strong></p> <ul> <li><p>Generating thousands of fake users for testing.</p> </li> <li><p>Migrating massive amounts of transactional data (for example, moving 5 million records from one table to another).</p> </li> </ul> <p>Large data migrations can lock up database tables for hours. 
If you lock a core table during a deployment, your application will experience a massive outage. Keep your changeSets focused on schema structure and essential baseline data. Use dedicated scripts or background jobs for heavy data manipulation.</p> <h2 id="heading-rollbacks">Rollbacks</h2> <p>In a perfect world, code always works. In reality, you'll eventually deploy a database change that breaks a critical production query or corrupts data. When this happens, you need a way to hit the undo button.</p> <p>Liquibase supports rollbacks, but you have to understand how it interprets them.</p> <h3 id="heading-automatic-vs-explicit-rollbacks">Automatic vs. Explicit Rollbacks</h3> <p>Many Liquibase commands are automatically reversible. For example, if you write a changeSet to <code><createTable></code> or <code><addColumn></code>, Liquibase implicitly knows that the opposite of adding a column is dropping a column. You don't have to tell it how to undo these actions.</p> <p>But some operations are inherently destructive or ambiguous. If you use custom <code><sql></code> tags, or if you use <code><dropTable></code>, Liquibase has no idea how to put the data back. 
In these cases, you must provide explicit rollback instructions.</p> <p>Let's simulate a scenario where we add a temporary access code column, but we want to ensure we know exactly how to remove it safely.</p> <p>Create <code>06-temporary-access.xml</code>:</p> <pre><code class="language-xml"><?xml version="1.0" encoding="UTF-8"?> <databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.20.xsd"> <changeSet id="9" author="ashutosh"> <addColumn tableName="employees"> <column name="temp_access_code" type="VARCHAR(10)"/> </addColumn> <rollback> <dropColumn tableName="employees" columnName="temp_access_code"/> </rollback> </changeSet> </databaseChangeLog> </code></pre> <p>Add this to your master file and run the application. The column is added.</p> <p>If you were deploying this via a CI/CD pipeline and the deployment failed, you could trigger a Liquibase Maven command to roll back by a specific number of steps (for example, <code>mvn liquibase:rollback -Dliquibase.rollbackCount=1</code>), or roll back to a specific tag we discussed earlier.</p> <h3 id="heading-the-reality-check-on-rollbacks">The Reality Check on Rollbacks</h3> <p>While it's important to know how rollbacks work, here's a practical reality from the trenches of backend engineering: <strong>Rollbacks are often discussed but rarely executed cleanly in production.</strong></p> <p>Dropping a column is mathematically easy. Recovering the customer data that was written to that column during the 15 minutes the bad code was live is incredibly difficult.</p> <p>Because of this, modern engineering teams often prefer a "roll forward" strategy. 
If a migration causes an issue, instead of running a scary database rollback command, they quickly write a new changeSet that fixes the issue (for example, adding a missing index or relaxing a constraint) and deploy the application again.</p> <p>It's highly recommended to design your database changes to be additive and non-destructive to avoid needing complex rollbacks in the first place.</p> <h2 id="heading-common-beginner-mistakes">Common Beginner Mistakes</h2> <p>Adopting database version control is a massive step forward for any engineering team, but it comes with a learning curve. When developers transition from writing loose SQL scripts to using Liquibase, they tend to fall into a few predictable traps.</p> <p>Here are the most common beginner mistakes and exactly how to avoid them.</p> <h3 id="heading-1-the-mega-changeset">1. The "Mega" ChangeSet</h3> <p>When starting out, it's tempting to dump your entire initial schema into a single XML file under a single <code>changeSet</code>. You might put 15 <code>createTable</code> statements and 20 <code>addForeignKeyConstraint</code> statements into <code>id="1"</code>.</p> <p>This is a terrible idea for one simple reason: transaction failure.</p> <p>If your database engine fails on table number 14 (perhaps due to a syntax error), what happens to the first 13 tables? Some database engines support transactional DDL (Data Definition Language), meaning they will roll back the first 13 tables automatically. But many databases do not.</p> <p>If the changeSet fails halfway through on one of those databases, your schema is now in a fractured state. Liquibase didn't record <code>id="1"</code> as successful, so the next time you start the app, it will try to create all 15 tables again. It will immediately crash because table 1 already exists.</p> <p><strong>The Fix:</strong> Stick to the rule of "one logical operation per changeSet." If you're creating three tables, write three separate changeSets. 
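</p> <p>As a rough sketch of that rule, the fragment below splits three table creations into three changeSets. The table names, columns, changeSet ids, and author value are hypothetical placeholders rather than part of this tutorial's schema, and the fragment assumes it sits inside the usual <code><databaseChangeLog></code> root element:</p> <pre><code class="language-xml"><!-- Hypothetical tables, ids, and author: one createTable per changeSet -->
<changeSet id="example-1" author="your-name">
    <createTable tableName="offices">
        <column name="id" type="BIGINT" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <column name="city" type="VARCHAR(100)"/>
    </createTable>
</changeSet>

<changeSet id="example-2" author="your-name">
    <createTable tableName="projects">
        <column name="id" type="BIGINT" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <column name="title" type="VARCHAR(100)"/>
    </createTable>
</changeSet>

<changeSet id="example-3" author="your-name">
    <createTable tableName="skills">
        <column name="id" type="BIGINT" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <column name="label" type="VARCHAR(100)"/>
    </createTable>
</changeSet>
</code></pre> <p>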
If one fails, the successful ones are permanently recorded, and you only have to fix the broken one.</p> <h3 id="heading-2-manual-database-tweaking-the-phantom-menace">2. Manual Database Tweaking (The Phantom Menace)</h3> <p>This is the most dangerous habit to break. A developer spots a missing index in production. Instead of writing a Liquibase migration, going through code review, and deploying, they log directly into the production database and run <code>CREATE INDEX</code> manually to save time.</p> <p>A week later, another developer writes a proper Liquibase migration to create that exact same index and deploys it. The application crashes on startup. Liquibase tries to execute the <code>CREATE INDEX</code> command, but the database throws an error saying the index already exists.</p> <p>When you adopt Liquibase, you must accept a fundamental rule: <strong>Liquibase is the absolute source of truth for your schema.</strong> Human hands should never touch the database structure directly.</p> <p><strong>The Fix:</strong> If someone accidentally does this, you have two options to fix the deployment pipeline. You can manually drop the index from the database so Liquibase can recreate it properly, or you can use the <code><preConditions></code> tag in Liquibase to check if the index exists before trying to create it.</p> <h3 id="heading-3-ignoring-the-from-scratch-build">3. Ignoring the "From Scratch" Build</h3> <p>When you work on a project for months, your local database accumulates a lot of history. You write migrations assuming certain tables or test data already exist.</p> <p>Then, a new developer joins the team. 
They pull the code, spin up an empty database, start Spring Boot, and the migrations crash halfway through.</p> <p>This happens because the migrations rely on an assumed state (like expecting a specific row to exist before creating a foreign key) rather than a guaranteed state.</p> <p><strong>The Fix:</strong> You should regularly test your migrations against a completely blank database. If you're using Docker, tear down your database container and rebuild it. If you're using a file-based H2 database like we set up earlier, simply delete the <code>./data/employeedb.mv.db</code> file from your project folder and restart Spring Boot. If the application can't boot successfully from a completely empty state, your migration history is broken.</p> <h3 id="heading-4-hardcoding-environment-details">4. Hardcoding Environment Details</h3> <p>Beginners sometimes hardcode environment-specific details directly into their XML files. For example, they might hardcode a specific schema name (<code>schemaName="dev_schema"</code>) or grant permissions to a specific local user (<code>GRANT ALL ON employees TO my_local_user</code>).</p> <p>When this code goes to staging, the staging database uses a different schema name, and the deployment fails.</p> <p><strong>The Fix:</strong> Keep your migrations abstract. Let Spring Boot handle the connection details via <code>application.properties</code>. If you absolutely must use dynamic values inside your Liquibase files, use property substitution. You can define variables in Liquibase and pass them in from Spring Boot during startup.</p> <h3 id="heading-5-messing-up-migration-ordering">5. Messing Up Migration Ordering</h3> <p>Liquibase executes files in the exact order they're listed in your <code>db.changelog-master.xml</code> file.</p> <p>If developer A creates the <code>departments</code> table in a branch, and developer B creates a foreign key linking to <code>departments</code> in another branch, whoever merges their code first dictates the order. 
If developer B's code gets included in the master file <em>before</em> developer A's code, Liquibase will try to create the foreign key before the target table exists.</p> <p><strong>The Fix:</strong> The master changelog is the ultimate chokepoint for database changes. During code reviews, always verify that the <code><include></code> statements are ordered chronologically and that dependencies make sense.</p> <h2 id="heading-liquibase-vs-flyway-vs-manual-sql-scripts">Liquibase vs Flyway vs Manual SQL Scripts</h2> <p>When you decide to implement database version control, you'll immediately face a choice. Liquibase isn't the only tool in the Java ecosystem. The three most common approaches to managing schema evolution are Liquibase, Flyway, and manual SQL scripts.</p> <p>You should understand the practical tradeoffs of each so you can choose the right tool for your specific team and project.</p> <h3 id="heading-1-manual-sql-scripts-the-baseline">1. Manual SQL Scripts (The Baseline)</h3> <p>This is the default approach for most beginners. You write a <code>script.sql</code> file and execute it directly against the database using a tool like DBeaver, pgAdmin, or DataGrip.</p> <ul> <li><p><strong>Strengths:</strong> There is zero setup required. You have total control over the exact syntax, and every backend developer already knows how to write SQL.</p> </li> <li><p><strong>Weaknesses:</strong> There's absolutely no execution tracking. This approach practically guarantees schema drift across environments. Deployments become stressful because they rely on humans remembering to execute the right scripts in the exact right order.</p> </li> <li><p><strong>The Verdict:</strong> Manual scripts are perfectly fine for solo weekend projects or rapid prototyping where you don't care if the database gets destroyed. But they become a massive liability the moment a second developer joins the team or a staging environment is created.</p> </li> </ul> <h3 id="heading-2-flyway-the-sql-purist">2. 
Flyway (The SQL Purist)</h3> <p>Flyway is the most popular alternative to Liquibase. Instead of using XML or YAML abstractions, Flyway embraces raw SQL. You write pure SQL files with a strict naming convention (for example, <code>V1__Create_employee_table.sql</code>).</p> <ul> <li><p><strong>Strengths:</strong> There's no new syntax to learn. If you know SQL, you already know how to use Flyway. It's incredibly fast to set up, highly opinionated, and integrates flawlessly with Spring Boot.</p> </li> <li><p><strong>Weaknesses:</strong> Because you write raw SQL, your migrations are intimately tied to your specific database dialect. If you write Flyway scripts for MySQL and later decide to migrate the project to PostgreSQL, you have to manually rewrite your migration history. Furthermore, seamless automated rollbacks are a paid feature in Flyway's commercial tier.</p> </li> <li><p><strong>The Verdict:</strong> Flyway is excellent for teams that are highly skilled in SQL, are permanently committed to a single database vendor, and prefer strict conventions over flexible configurations.</p> </li> </ul> <h3 id="heading-3-liquibase-the-abstraction-layer">3. Liquibase (The Abstraction Layer)</h3> <p>As we have seen throughout this tutorial, Liquibase takes a different approach by abstracting database changes into XML, YAML, or JSON.</p> <ul> <li><p><strong>Strengths:</strong> It's truly database-agnostic. You define the logical structure, and Liquibase automatically translates that into the correct SQL dialect for H2, PostgreSQL, or Oracle. It supports powerful automatic rollbacks, preconditions, contexts, and deployment labels out of the box for free.</p> </li> <li><p><strong>Weaknesses:</strong> It has a steeper learning curve than Flyway. 
The XML syntax is undeniably verbose and can feel heavy for very simple, single-table applications.</p> </li> <li><p><strong>The Verdict:</strong> Liquibase shines in complex applications, multi-tenant systems, projects that support multiple database vendors, and enterprise environments that require fine-grained control over CI/CD deployment pipelines.</p> </li> </ul> <h2 id="heading-liquibase-best-practices">Liquibase Best Practices</h2> <p>Now that you understand the mechanics of Liquibase, you need to know how to use it in a professional environment. Writing a migration that works on your local machine is only half the battle. Writing a migration that your entire team can safely deploy to production requires discipline.</p> <p>Here are the engineering best practices you should adopt when managing database changes.</p> <h3 id="heading-1-one-logical-change-per-changeset-the-atomic-rule">1. One Logical Change Per ChangeSet (The Atomic Rule)</h3> <p>We discussed this in the common mistakes section, but it's important enough to repeat. Never bundle a table creation, an index creation, and a data insertion into a single changeSet.</p> <p>If you're adding a <code>salary</code> column and an <code>idx_employee_salary</code> index, put them in two separate changeSets within the same file. This ensures that if the index creation fails, the column creation is still safely recorded, and you don't end up in a fractured database state.</p> <h3 id="heading-2-meaningful-file-organization-and-naming">2. Meaningful File Organization and Naming</h3> <p>Don't name your files <code>update1.xml</code> or <code>new_changes.xml</code>. Your file names should tell a story about how your database evolved.</p> <p>Adopt a strict prefix system. In our project, we used <code>01-create-employees.xml</code> and <code>02-add-employee-email.xml</code>. In a real team, you might use Jira ticket numbers or release versions (for example, <code>v1.2.0_ticket-482_add_email.xml</code>). 
Whatever convention you choose, enforce it rigorously during code reviews.</p> <h3 id="heading-3-treat-database-changes-like-application-code">3. Treat Database Changes Like Application Code</h3> <p>Database migrations belong in source control right next to your Java code. They should be reviewed with the exact same level of scrutiny.</p> <p>When reviewing a pull request that includes a Liquibase file, engineers should ask:</p> <ul> <li><p>Does this column need an index?</p> </li> <li><p>Is this a destructive change (like renaming a column) that will break the currently running application?</p> </li> <li><p>Did the author include explicit rollback instructions for custom SQL?</p> </li> </ul> <h3 id="heading-4-integrate-migrations-into-cicd">4. Integrate Migrations into CI/CD</h3> <p>Human hands should never run database migrations against a production server. Your deployment pipeline should handle this automatically.</p> <p>When you merge code into your main branch, your CI/CD pipeline (like GitHub Actions or GitLab CI) should build your Spring Boot application and deploy it. Because we bundled Liquibase into our Spring Boot startup sequence, the application will automatically migrate the production database before it starts accepting web traffic.</p> <p>Here's what a safe, automated deployment pipeline looks like:</p> <p>In a mature deployment pipeline, human hands never touch the production database. When you merge a pull request, the CI/CD pipeline builds the code and runs unit tests. It deploys the Spring Boot application to a staging environment, where Liquibase automatically acquires a lock and runs the migrations during startup. Once validated, that exact same artifact is promoted to production, triggering the identical automated migration process.</p> <h3 id="heading-5-never-fix-forward-by-deleting-history">5. 
Never Fix Forward by Deleting History</h3> <p>If a migration fails in an upper environment (like staging or production), never log into the database to delete the <code>DATABASECHANGELOG</code> row so you can try again.</p> <p>You must respect the immutability of the changelog. If you made a mistake, write a new changeSet that drops the broken table or fixes the data type, and push it through your Git workflow just like you would a Java bug fix.</p> <h2 id="heading-final-thoughts">Final Thoughts</h2> <p>Managing database schema changes doesn't have to be a source of anxiety.</p> <p>By treating your database schema as code, you eliminate the chaos of manual SQL scripts. You prevent the dreaded "schema drift" where every developer's local machine behaves differently. Most importantly, you make your deployments predictable and boring (which is exactly what you want deployments to be).</p> <p>In this tutorial, you built a practical Spring Boot application from scratch. You learned how Liquibase intercepts the application startup, locks the database, calculates cryptographic checksums, and safely applies incremental changes. You evolved a single table into a relational schema, added seed data, and learned how to avoid the most common traps beginners fall into.</p> <p>The next time you start a Spring Boot project, don't reach for a manual SQL client. Add the Liquibase dependency, create your master changelog, and start version controlling your database from day one. Your future self (and your team) will thank you.</p> </article> <article> <h1> How to Optimize Enterprise Knowledge Graphs for Scalable Digital Product Platforms </h1> <p>Kamal Kishore — Mon, 08 Jun 2026 04:18:06 +0000</p> <p>Enterprises are building more and more digital products that depend on real time intelligence. 
This means that being able to connect, contextualize, and reason over data has become a core capability.</p> <p>Recommendation systems, fraud detection engines, personalization platforms, and enterprise search solutions all rely on integrating data from multiple systems while preserving context and relationships.</p> <p>Enterprise Knowledge Graphs (EKGs) have emerged as a foundational architecture for addressing this challenge. By modeling enterprise data as entities and relationships, EKGs enable richer semantics, improved data discoverability, and more intelligent downstream decision-making.</p> <p>While the conceptual benefits of knowledge graphs are well understood, scaling them to production-grade digital platforms remains complex. Graph systems that perform well at small or medium scale often struggle under high ingestion rates, complex traversal queries, and strict latency requirements.</p> <p>This article outlines some practical, field-tested strategies for optimizing enterprise knowledge graphs for real-world scalability. 
Rather than presenting purely theoretical models, we'll focus on architectural patterns, operational lessons, and performance insights from large scale enterprise deployments.</p> <h2 id="heading-what-well-cover">What We'll Cover:</h2> <ul> <li><p><a href="#heading-prerequisites">Prerequisites</a></p> </li> <li><p><a href="#heading-why-scalability-becomes-the-core-challenge">Why Scalability Becomes the Core Challenge</a></p> </li> <li><p><a href="#heading-moving-beyond-a-single-graph-store-hybrid-architectures">Moving Beyond a Single Graph Store: Hybrid Architectures</a></p> </li> <li><p><a href="#heading-partitioning-for-scale-reducing-distributed-traversal-costs">Partitioning for Scale: Reducing Distributed Traversal Costs</a></p> </li> <li><p><a href="#heading-managing-semantic-inference-without-sacrificing-performance">Managing Semantic Inference Without Sacrificing Performance</a></p> </li> <li><p><a href="#heading-improving-query-performance-with-smarter-planning">Improving Query Performance with Smarter Planning</a></p> </li> <li><p><a href="#heading-observability-as-a-first-class-requirement">Observability as a First Class Requirement</a></p> </li> <li><p><a href="#heading-impact-on-digital-product-platforms">Impact on Digital Product Platforms</a></p> </li> <li><p><a href="#heading-conclusion">Conclusion</a></p> </li> </ul> <h2 id="heading-prerequisites"><strong>Prerequisites</strong></h2> <p>This is an architectural guide intended for data engineers, platform architects, and developers managing production-grade graph systems. 
To get the most out of this article, you should have the following:</p> <h3 id="heading-conceptual-knowledge"><strong>Conceptual Knowledge</strong></h3> <ul> <li><p>A solid understanding of Enterprise Knowledge Graphs (EKGs) and the fundamental differences between RDF triple stores and Labeled Property Graphs (LPGs).</p> </li> <li><p>Familiarity with distributed systems concepts, including data partitioning, semantic inference, and event-driven architectures.</p> </li> </ul> <h3 id="heading-technical-background"><strong>Technical Background</strong></h3> <ul> <li><p>Experience working with real-time data integration pipelines (such as CDC, Kafka, or Pulsar).</p> </li> <li><p>Familiarity with database observability, query execution planning, and general performance optimization techniques at scale.</p> </li> </ul> <h2 id="heading-understanding-the-enterprise-knowledge-graph-ekg">Understanding the Enterprise Knowledge Graph (EKG)</h2> <p>Before exploring how to scale these systems, it's helpful to understand exactly what a knowledge graph is and how it organizes information.</p> <p>At its core, a knowledge graph is a data model that represents real-world entities and the complex relationships between them. 
Unlike traditional relational databases that lock data into rigid, disconnected tables, knowledge graphs store data as a flexible, interconnected network.</p> <p>A knowledge graph is built on three fundamental components:</p> <ul> <li><p><strong>Nodes (Entities):</strong> The distinct objects, concepts, or people in your data ecosystem (for example, a Customer, a Product, a Location).</p> </li> <li><p><strong>Edges (Relationships):</strong> The lines connecting the nodes that define how they interact (for example, "PURCHASED," "LOCATED_IN," "MANUFACTURED_BY").</p> </li> <li><p><strong>Properties:</strong> The descriptive metadata attached to nodes or edges (for example, a customer's signup date, or the price of a product).</p> </li> </ul> <h2 id="heading-our-running-example-the-global-electronics-supply-chain-graph">Our Running Example: The Global Electronics Supply Chain Graph</h2> <p>To ground these concepts, we'll use a unified example throughout this article: an enterprise graph for a global electronics manufacturer managing product data, suppliers, and manufacturing compliance.</p> <ul> <li><p>Nodes (Entities): Customer (Alice), Product (NeoPhone 15), Component (MX-200 Chip), Supplier (MaxSemi), and Region (EU).</p> </li> <li><p>Edges (Relationships): PURCHASED, PART_OF, SUPPLIES, and LOCATED_IN.</p> </li> <li><p>Properties: The NeoPhone 15 node has properties like price: 999 and sku: "NP15-01". The PURCHASED edge has a property of timestamp: 2026-06-03.</p> </li> </ul> <p>Imagine you're building the data foundation for a retail recommendation engine. 
To build the graph, you move through a few distinct phases:</p> <ol> <li><p><strong>Establish ontology:</strong> First, you define the blueprint – the rules dictating what kinds of entities exist and how they are allowed to interact.</p> </li> <li><p><strong>Define the nodes:</strong> You integrate data to generate specific entity nodes, such as a Customer node for "Alice," a Product node for "Noise-Canceling Headphones," and a Brand node for "TechAudio."</p> </li> <li><p><strong>Map the edges:</strong> You connect these nodes based on user actions and inventory data. Alice VIEWED the Headphones. The Headphones are MANUFACTURED_BY TechAudio.</p> </li> </ol> <p>Why does this matter? Because the data is natively structured as a relationship network, the system can rapidly execute context-rich queries.</p> <p>If you want to know what else Alice might buy, you don't need to write a heavy, expensive SQL query that joins millions of rows across five different tables. Instead, the graph simply "walks" the pathways you've already built. It traverses from Alice, across the VIEWED edge to the Headphones, across the MANUFACTURED_BY edge to TechAudio, and can instantly return other products connected to that same brand.</p> <p>By prioritizing the <em>relationships</em> between data points as much as the data points themselves, EKGs provide the contextual intelligence required for modern digital products.</p> <h2 id="heading-why-scalability-becomes-the-core-challenge"><strong>Why Scalability Becomes the Core Challenge</strong></h2> <p>Most enterprise knowledge graph initiatives begin with a limited scope, integrating a small number of datasets, enabling semantic search, or improving reporting accuracy. Early-stage deployments often succeed using a single graph database or RDF store.</p> <p>Scalability challenges emerge when EKGs become production critical infrastructure, particularly when supporting customer facing or latency-sensitive applications. 
At this stage, multiple pressures converge:</p> <ol> <li><p>Rapid data growth as more systems and entities are integrated</p> </li> <li><p>Continuous ingestion from streaming pipelines and transactional systems</p> </li> <li><p>Increasing query complexity, including multi-hop traversals</p> </li> <li><p>Strict response-time requirements, often within tens of milliseconds</p> </li> <li><p>Inference overhead introduced by ontologies and reasoning engines</p> </li> </ol> <p>Simply adding hardware or scaling nodes horizontally rarely resolves these issues. Performance degradation often results from architectural mismatches between graph workloads and system design.</p> <h2 id="heading-moving-beyond-a-single-graph-store-hybrid-architectures">Moving Beyond a Single Graph Store: Hybrid Architectures</h2> <h3 id="heading-the-limits-of-monolithic-graph-deployments">The Limits of Monolithic Graph Deployments</h3> <p>RDF triple stores offer strong semantic expressiveness and standards compliance but may struggle with high-volume transactional updates or deep real-time traversals. 
Conversely, labeled property graph (LPG) databases often provide efficient traversal performance but lack native semantic reasoning capabilities.</p> <p>Attempting to consolidate semantic modeling, inference, operational queries, and analytics into a single system frequently results in trade-offs that affect performance, cost, or maintainability.</p> <h3 id="heading-a-pragmatic-hybrid-model">A Pragmatic Hybrid Model</h3> <p>A hybrid or polyglot architecture distributes responsibilities across systems optimized for specific workloads:</p> <ol> <li><p>Semantic layer (RDF/OWL): Ontology management, schema governance, reasoning workflows.</p> </li> <li><p>Operational graph layer (LPG): Real-time traversals, recommendation engines, application queries.</p> </li> <li><p>Analytical stores: Aggregations, reporting, and historical analysis.</p> </li> </ol> <p>To maintain consistency between the semantic layer (RDF/OWL) and the operational graph layer (LPG), many teams implement synchronization strategies like Change Data Capture (CDC) and event-driven pipelines.</p> <p>In this approach, updates in one layer are captured as events and propagated to the other layer in near real time using streaming platforms such as Kafka or Pulsar. For example, updates in the operational graph can trigger semantic updates, ensuring that ontologies and relationships remain aligned.</p> <p>Some systems also use dual-write patterns or scheduled reconciliation jobs to detect and resolve inconsistencies. In practice, event-driven synchronization combined with periodic validation provides a balance between real-time accuracy and system reliability.</p> <p>This separation isolates performance-critical paths while preserving semantic richness where it adds value.</p> <p>In production environments, hybrid architectures consistently demonstrate improved query latency and operational flexibility compared to monolithic graph deployments, particularly for traversal-heavy workloads. 
Some teams have also reported latency reductions of 30–60% after moving traversal-heavy workloads into a dedicated LPG layer.</p> <p>This improvement is primarily due to reduced query complexity and optimized storage for specific access patterns.</p> <h3 id="heading-in-practice-splitting-the-supply-chain-graph">In Practice: Splitting the Supply Chain Graph</h3> <p>In a production-grade digital platform, a single database engine struggles to handle both semantic governance and high-speed operational queries on this data simultaneously.</p> <p>Here is how the hybrid model divides the labor:</p> <ul> <li><p><strong>The Semantic Layer (RDF/OWL):</strong> Manages strict ontological classification and compliance rules. For example, it defines the rule: <em>“If a Component is supplied by an entity in a country under a trade embargo, the final Product inherits a 'High Risk' compliance flag.”</em></p> </li> <li><p><strong>The Operational Layer (LPG):</strong> Optimized for fast, multi-hop traversals required by customer-facing apps. When Alice views the NeoPhone 15 on a mobile app, the system queries a Labeled Property Graph (like Neo4j) using a language like Cypher to instantly traverse from the product to its components for a real-time availability check:</p> </li> </ul> <pre><code class="language-plaintext">MATCH (p:Product {id: 'NeoPhone15'})-[:HAS_COMPONENT]->(c:Component) RETURN c.name, c.stock_level </code></pre> <h2 id="heading-partitioning-for-scale-reducing-distributed-traversal-costs">Partitioning for Scale: Reducing Distributed Traversal Costs</h2> <p>As enterprise knowledge graphs outgrow single-node capacity, distributed execution becomes necessary. Partitioning strategy then becomes a critical performance factor.</p> <h3 id="heading-why-default-partitioning-often-fails">Why Default Partitioning Often Fails</h3> <p>Many graph systems use hash-based or random partitioning to distribute data evenly across nodes. 
While this approach balances storage, it often fragments highly connected subgraphs. Even moderately complex traversals may then require excessive cross-node communication, increasing latency and reducing throughput.</p> <h3 id="heading-topology-aware-partitioning">Topology-Aware Partitioning</h3> <p>Topology-aware partitioning colocates frequently connected entities to minimize network hops during traversal. Common approaches include:</p> <ol> <li><p>Partitioning by business domain (for example, customers, products, organizations).</p> </li> <li><p>Clustering based on community detection.</p> </li> <li><p>Partitioning informed by observed query patterns.</p> </li> </ol> <p>In practice, teams can achieve topology-aware partitioning by first analyzing query patterns and identifying frequently traversed relationships. Based on this analysis, related entities are co-located within the same partition to minimize cross-partition queries.</p> <p>Graph processing frameworks and database tools often provide built-in algorithms for community detection, which help group highly connected nodes. 
Teams can also monitor query performance over time and iteratively refine partitioning strategies to align with evolving workloads.</p> <p>By combining domain-driven design with continuous performance monitoring, teams can incrementally optimize graph layouts without requiring major architectural changes.</p> <p>In production-like environments, topology-aware strategies significantly reduce traversal fan-out and improve both median and tail latency under concurrent load.</p> <p>Though repartitioning introduces operational complexity, the performance gains justify the effort once the knowledge graph becomes central to digital product delivery.</p> <h3 id="heading-in-practice-partitioning-by-product-domain">In Practice: Partitioning by Product Domain</h3> <p>Let’s look at what happens when our supply chain graph scales across multiple database nodes.</p> <p>If we use <strong>Default Hash Partitioning</strong>, the graph is split randomly by node IDs. Alice might end up on Machine 1, the NeoPhone 15 on Machine 2, and the MX-200 Chip on Machine 3. 
A query tracking whether a component shortage affects Alice's order requires slow, expensive network hops across three separate physical servers.</p> <p>Using <strong>Topology-Aware Partitioning</strong>, we can configure the cluster to use the Region or Product_Line as a partitioning key.</p> <ul> <li><strong>Partition A (Europe Hub):</strong> Co-locates Region: EU, Product: NeoPhone 15, its internal MX-200 Chip, and local customer orders.</li> </ul> <p><strong>Result:</strong> A multi-hop traversal checking component supply chains for European customers happens entirely within local memory on a single machine, reducing query latency.</p> <h2 id="heading-managing-semantic-inference-without-sacrificing-performance">Managing Semantic Inference Without Sacrificing Performance</h2> <p>Semantic inference is a defining strength of EKGs but also a frequent source of scalability challenges.</p> <h3 id="heading-the-inference-cost-problem">The Inference Cost Problem</h3> <p>Applying full ontology reasoning at query time can dramatically increase computational overhead. In some systems, inference effectively multiplies graph size, increasing memory and CPU consumption. Not all inferred relationships are equally valuable for every workload.</p> <h3 id="heading-strategies-for-selective-inference-and-materialization">Strategies for Selective Inference and Materialization</h3> <p>Scalable EKG platforms typically adopt a selective strategy:</p> <ol> <li><p>Precompute and materialize frequently accessed inferences</p> </li> <li><p>Offload complex reasoning to batch or asynchronous pipelines</p> </li> <li><p>Disable low-value inference paths in latency-sensitive workloads</p> </li> </ol> <p>Hierarchical classifications and role-based relationships are often materialized ahead of time, while complex rule-based reasoning is reserved for offline processing. 
This approach stabilizes query latency and reduces peak CPU utilization in enterprise deployments.</p> <h3 id="heading-in-practice-materializing-the-compliance-path">In Practice: Materializing the Compliance Path</h3> <p>Recall our semantic rule: <em>If a component has a supply risk, the final product inherits that risk.</em></p> <ul> <li><p><strong>The Scalability Bottleneck (Query-Time Inference):</strong> Every time an enterprise dashboard loads a product catalog of 10,000 items, the engine must recursively calculate: Product -> Has Component -> Supplied By -> Supplier Country -> Embargo List. Under high concurrent load, this calculation severely degrades performance.</p> </li> <li><p><strong>The Optimization (Materialization):</strong> We run an asynchronous batch job or Kafka consumer that listens for supplier updates. When a supplier's status changes, it computes the inference <em>once</em> and writes the property <code>is_high_risk: true</code> directly onto the Product node in the operational LPG.</p> </li> </ul> <p>Now, the customer-facing application reads a simple, static property without running an expensive multi-hop recursive inference query during runtime.</p> <h2 id="heading-improving-query-performance-with-smarter-planning">Improving Query Performance with Smarter Planning</h2> <p>As query complexity increases, query planning becomes a decisive performance lever.</p> <h3 id="heading-limitations-of-static-planning">Limitations of Static Planning</h3> <p>Traditional graph engines often rely on static heuristics or limited statistics for execution planning. In dynamic enterprise environments where data distributions evolve, these heuristics frequently produce suboptimal execution plans, leading to unpredictable performance.</p> <h3 id="heading-ml-assisted-query-optimization">ML-Assisted Query Optimization</h3> <p>Machine learning techniques are increasingly being applied to query optimization, particularly for cardinality estimation. 
By learning from historical query execution data, ML models can predict plan costs more accurately than rule-based systems.</p> <p>In controlled experiments and production pilots, ML-assisted planning has demonstrated substantial reductions in execution time for complex traversals, as well as improved consistency in response times.</p> <p>While implementation requires operational maturity, this represents a promising direction for large-scale graph optimization.</p> <h3 id="heading-in-practice-optimizing-traversal-direction">In Practice: Optimizing Traversal Direction</h3> <p>Consider this query on our data: <em>"Find all customers who purchased a product containing the MX-200 Chip."</em></p> <p>There are two ways the graph execution planner can execute this:</p> <ol> <li><p><strong>Plan A:</strong> Start at Component: MX-200, find the products it belongs to, and then find the customers who bought those products.</p> </li> <li><p><strong>Plan B:</strong> Scan <em>all</em> Customer nodes in the database, look at their purchases, and filter for the ones containing the chip.</p> </li> </ol> <p>If the MX-200 is a rare chip used in only one niche product, <strong>Plan A</strong> is incredibly fast. If it is a generic resistor used in millions of products, <strong>Plan B</strong> or a modified hybrid plan might be more efficient.</p> <p>An ML-assisted query planner analyzes the real-time cardinality (the actual count) of the PART_OF and PURCHASED relationships in your specific database instance. It prevents the graph engine from choosing a disastrously slow traversal path when data distributions shift unexpectedly.</p> <h2 id="heading-observability-as-a-first-class-requirement">Observability as a First-Class Requirement</h2> <p>Scalability can't be managed without deep observability.</p> <h3 id="heading-beyond-infrastructure-metrics">Beyond Infrastructure Metrics</h3> <p>Monitoring CPU and memory alone provides limited insight into graph-specific performance issues. 
Effective EKG observability includes:</p> <ol> <li><p>Query-level latency metrics</p> </li> <li><p>Traversal depth and fan-out tracking</p> </li> <li><p>Inference cost monitoring</p> </li> <li><p>Partition imbalance detection</p> </li> </ol> <h3 id="heading-closing-the-optimization-loop">Closing the Optimization Loop</h3> <p>By continuously analyzing these signals, teams can iteratively refine partitioning strategies, caching policies, and materialization decisions. This feedback loop improves predictability and reduces production incidents.</p> <p>In practice, strong observability often distinguishes proactive optimization from reactive firefighting.</p> <h2 id="heading-impact-on-digital-product-platforms">Impact on Digital Product Platforms</h2> <p>When applied collectively, these optimization strategies materially enhance scalability and reliability. Across enterprise deployments, teams commonly observe:</p> <ol> <li><p>Reduced latency in real-time workloads</p> </li> <li><p>Improved ingestion throughput under sustained load</p> </li> <li><p>Linear or near-linear scaling as datasets grow</p> </li> <li><p>Greater stability during traffic spikes</p> </li> </ol> <p>These technical improvements translate directly into business outcomes: faster recommendations, more relevant search results, and increased confidence in deploying EKGs as mission-critical infrastructure.</p> <h2 id="heading-conclusion">Conclusion</h2> <p>Enterprise knowledge graphs are no longer experimental. They're becoming the backbone of intelligent, data-driven systems. As teams move toward AI-powered decision-making, the role of knowledge graphs is expanding beyond storage into enabling context-aware reasoning and automation.</p> <p>An optimized EKG isn't just a database – it acts as the connective tissue between data, models, and real-world applications. 
It provides the structured context that modern AI systems, including agentic workflows and autonomous decision engines, rely on to operate effectively.</p> <p>By adopting hybrid architectures, topology-aware partitioning, and intelligent query strategies, teams can build scalable and resilient graph systems that support both operational and analytical workloads.</p> <p>Ultimately, organizations that invest in well-designed knowledge graph infrastructure will be better positioned to power the next generation of AI systems where retrieval, reasoning, and action are seamlessly integrated.</p> </article> <article> <h1> How to Build a Browser-Based PDF Metadata Editor Using JavaScript – A Step-by-Step Guide </h1> <p>Bhavin Sheth — Sun, 07 Jun 2026 00:23:27 +0000</p> <p>PDF files contain more information than what appears on the page.</p> <p>Behind every PDF document is metadata that stores information such as the document title, author, subject, keywords, creator application, creation date, and modification date.</p> <p>Metadata helps organize documents, improve searchability, and provide useful information when files are shared between users or systems.</p> <p>In this tutorial, you'll build a browser-based PDF Metadata Editor using JavaScript.</p> <p>Users will be able to upload a PDF, preview the document, view existing metadata, update metadata fields, add custom metadata entries, and download the updated PDF directly from the browser.</p> <p>The entire process runs locally without requiring a backend server.</p> <h2 id="heading-table-of-contents">Table of Contents</h2> <ol> <li><p><a href="#heading-why-pdf-metadata-is-important">Why PDF Metadata Is Important</a></p> </li> <li><p><a href="#heading-how-pdf-metadata-editing-works">How PDF Metadata Editing Works</a></p> </li> <li><p><a href="#heading-project-setup">Project Setup</a></p> </li> <li><p><a href="#heading-what-library-are-we-using">What Library Are We Using?</a></p> </li> <li><p><a 
href="#heading-creating-the-upload-interface">Creating the Upload Interface</a></p> </li> <li><p><a href="#heading-previewing-uploaded-pdf-files">Previewing Uploaded PDF Files</a></p> </li> <li><p><a href="#heading-reading-pdf-metadata">Reading PDF Metadata</a></p> </li> <li><p><a href="#heading-editing-pdf-metadata">Editing PDF Metadata</a></p> </li> <li><p><a href="#heading-updating-and-saving-metadata">Updating and Saving Metadata</a></p> </li> <li><p><a href="#heading-generating-the-updated-pdf">Generating the Updated PDF</a></p> </li> <li><p><a href="#heading-why-pdf-metadata-editing-is-useful">Why PDF Metadata Editing Is Useful</a></p> </li> <li><p><a href="#heading-demo-how-the-pdf-metadata-tool-works">Demo: How the PDF Metadata Tool Works</a></p> </li> <li><p><a href="#heading-important-notes-from-real-world-use">Important Notes from Real-World Use</a></p> </li> <li><p><a href="#heading-common-mistakes-to-avoid">Common Mistakes to Avoid</a></p> </li> <li><p><a href="#heading-conclusion">Conclusion</a></p> </li> </ol> <h2 id="heading-why-pdf-metadata-is-important">Why PDF Metadata Is Important</h2> <p>PDF metadata is commonly used in business documents, contracts, reports, invoices, ebooks, academic papers, legal documents, and archived files.</p> <p>When a PDF contains proper metadata, document management systems can organize files more effectively.</p> <p>Search engines, enterprise search tools, and document indexing systems can also identify documents more accurately.</p> <p>Metadata becomes especially useful when managing large collections of files because users can quickly locate documents based on title, author, subject, keywords, or custom information.</p> <p>Updating metadata also helps keep documents organized after modifications, ownership changes, or publishing updates.</p> <h2 id="heading-how-pdf-metadata-editing-works">How PDF Metadata Editing Works</h2> <p>A PDF metadata editor loads the document inside the browser and reads information stored 
within the PDF file properties.</p> <p>Users can review existing metadata, update values, add custom metadata fields, and save the changes into a new PDF document.</p> <p>Everything happens locally inside the browser.</p> <p>This means uploaded documents never leave the user's device, which improves privacy and security while eliminating the need for server-side processing.</p> <h2 id="heading-project-setup">Project Setup</h2> <p>This project is intentionally simple.</p> <p>You'll only need:</p> <ul> <li><p>An HTML file</p> </li> <li><p>A JavaScript file</p> </li> <li><p>A PDF processing library</p> </li> </ul> <p>No backend server or database is required. Everything runs right inside the browser.</p> <h2 id="heading-what-library-are-we-using">What Library Are We Using?</h2> <p>We'll use PDF-lib to read and update PDF metadata.</p> <p>PDF-lib provides functions for loading PDF documents, accessing metadata properties, modifying document information, and exporting updated files.</p> <p>Add the library using a CDN:</p> <pre><code class="language-html">&lt;script src="https://unpkg.com/pdf-lib/dist/pdf-lib.min.js"&gt;&lt;/script&gt;
</code></pre> <p>Once loaded, JavaScript can access PDF metadata directly from the browser.</p> <h2 id="heading-creating-the-upload-interface">Creating the Upload Interface</h2> <p>Users first need a way to upload PDF files.</p> <p>A simple file input is enough:</p> <pre><code class="language-html">&lt;input type="file" id="pdfInput" accept=".pdf"&gt;
</code></pre> <p>JavaScript can then detect when a PDF file is selected:</p> <pre><code class="language-javascript">const input = document.getElementById("pdfInput");

input.addEventListener("change", (event) => {
  const file = event.target.files[0];
  if (!file) return; // nothing selected (for example, the dialog was cancelled)
  console.log(file.name);
});
</code></pre> <p>Here's what the upload section looks like:</p> <h2 id="heading-previewing-uploaded-pdf-files">Previewing Uploaded PDF Files</h2> <p>After uploading a PDF, users should be able to preview the document before making 
metadata changes.</p> <p>The browser can render PDF pages using PDF.js:</p> <pre><code class="language-javascript">// `url` points at the uploaded file, for example URL.createObjectURL(file)
const loadingTask = pdfjsLib.getDocument(url);

loadingTask.promise.then((pdf) => {
  console.log(pdf.numPages);
});
</code></pre> <p>The preview area also includes page navigation buttons so users can move between pages.</p> <p>This helps verify the correct document was uploaded before editing metadata.</p> <p>Here's what the preview section looks like:</p> <h2 id="heading-reading-pdf-metadata">Reading PDF Metadata</h2> <p>Once the PDF is loaded, metadata can be extracted from the document.</p> <p>For example:</p> <pre><code class="language-javascript">// Run inside an async function; `file` is the File chosen in the upload input
const arrayBuffer = await file.arrayBuffer();
const pdfDoc = await PDFLib.PDFDocument.load(arrayBuffer);

const title = pdfDoc.getTitle();
const author = pdfDoc.getAuthor();

console.log(title);
console.log(author);
</code></pre> <p>This information can then be displayed inside editable form fields.</p> <h2 id="heading-editing-pdf-metadata">Editing PDF Metadata</h2> <p>Users can update common document properties such as title, author, subject, keywords, creator information, and modification dates.</p> <p>Custom metadata fields can also be added when additional document information is required.</p> <p>For example:</p> <pre><code class="language-javascript">pdfDoc.setTitle("Project Report");
pdfDoc.setAuthor("John Doe");
pdfDoc.setSubject("Monthly Review");
</code></pre> <p>Here's what the metadata editor looks like:</p> <h2 id="heading-updating-and-saving-metadata">Updating and Saving Metadata</h2> <p>Once the metadata fields have been updated, JavaScript can apply the changes to the PDF document.</p> <p>For example:</p> <pre><code class="language-javascript">pdfDoc.setTitle("Updated Document");
pdfDoc.setAuthor("John Doe");
pdfDoc.setSubject("PDF Metadata Tutorial");
</code></pre> <p>Custom metadata values can also be inserted before exporting the document.</p> <p>After all changes are complete, users click the Update Metadata button to generate the 
modified PDF.</p> <h2 id="heading-generating-the-updated-pdf">Generating the Updated PDF</h2> <p>After updating metadata, the browser creates a new PDF document containing the revised information.</p> <p>The original document remains unchanged while the updated version is generated locally.</p> <pre><code class="language-javascript">const pdfBytes = await pdfDoc.save(); </code></pre> <p>The updated file can then be prepared for download.</p> <h2 id="heading-why-pdf-metadata-editing-is-useful">Why PDF Metadata Editing Is Useful</h2> <p>Metadata is often overlooked, but it plays an important role in document management.</p> <p>Organizations use metadata to organize thousands of PDF files across internal systems.</p> <p>When documents contain proper titles, keywords, subjects, and author information, they become easier to search, categorize, and manage.</p> <p>For example, legal teams may store contracts with custom metadata fields for clients or case numbers.</p> <p>Businesses often use metadata to organize invoices, reports, proposals, and project documents.</p> <p>Publishers frequently update document properties before distributing ebooks, manuals, and guides.</p> <p>Metadata can also improve indexing in document management systems and make archived files easier to locate months or years later.</p> <p>Updating metadata before sharing documents creates a cleaner and more professional final file while improving long-term document organization.</p> <h2 id="heading-demo-how-the-pdf-metadata-tool-works">Demo: How the PDF Metadata Tool Works</h2> <h3 id="heading-step-1-upload-a-pdf-file">Step 1: Upload a PDF File</h3> <p>Users begin by uploading a PDF document into the browser.</p> <p>The upload area supports drag-and-drop functionality as well as manual file selection.</p> <h3 id="heading-step-2-preview-the-uploaded-document">Step 2: Preview the Uploaded Document</h3> <p>After uploading the PDF, the tool displays a document preview.</p> <p>Users can navigate between 
pages using the left and right navigation buttons.</p> <p>This allows quick verification that the correct document has been loaded.</p> <h3 id="heading-step-3-edit-pdf-metadata">Step 3: Edit PDF Metadata</h3> <p>The metadata editor loads existing document properties automatically.</p> <p>Users can update fields such as title, author, subject, keywords, creator information, dates, and custom metadata values.</p> <p>Custom fields can be added or removed as needed.</p> <h3 id="heading-step-4-update-metadata">Step 4: Update Metadata</h3> <p>After making changes, users click the Update Metadata button.</p> <p>The browser processes the document and applies all metadata updates locally.</p> <h3 id="heading-step-5-download-the-updated-pdf">Step 5: Download the Updated PDF</h3> <p>Once processing is complete, the updated PDF becomes available for download.</p> <p>The output section displays the updated filename, total page count, file size information, and download controls, along with an option to rename the file before downloading.</p> <p>A Start Over button is also available for processing another document.</p> <h2 id="heading-important-notes-from-real-world-use">Important Notes from Real-World Use</h2> <p>When working with PDF metadata, it's important to validate uploaded files before processing them.</p> <p>For example:</p> <pre><code class="language-javascript">// Compare case-insensitively so names like "REPORT.PDF" are accepted
if (!file.name.toLowerCase().endsWith(".pdf")) {
  alert("Please upload a PDF file");
  return;
}
</code></pre> <p>Large PDF files may require additional processing time.</p> <p>Always verify metadata values before generating the updated document.</p> <p>Sensitive information stored inside metadata should be reviewed carefully before sharing documents publicly.</p> <h2 id="heading-common-mistakes-to-avoid">Common Mistakes to Avoid</h2> <p>One common mistake is assuming that all PDFs contain metadata. 
Many documents may have empty metadata fields that need to be populated manually.</p> <p>For example:</p> <pre><code class="language-javascript">const title = pdfDoc.getTitle() || "Untitled Document"; </code></pre> <p>Another mistake is forgetting to update the modification date after changing document properties.</p> <p>Always review metadata values before exporting the final file.</p> <p>Previewing the document and checking file details before download can help prevent mistakes.</p> <h2 id="heading-conclusion">Conclusion</h2> <p>In this tutorial, you built a browser-based PDF Metadata Editor using JavaScript.</p> <p>You learned how to upload PDF files, preview document pages, read existing metadata, update document properties, add custom metadata fields, and generate updated PDF files directly inside the browser.</p> <p>More importantly, you saw how modern browsers can handle PDF property management locally without requiring a backend server.</p> <p>This approach keeps document processing fast, private, and easy to use.</p> <p>If you'd like to see a working example, you can try out this free <a href="https://allinonetools.net/pdf-metadata/">PDF Metadata Tool</a> and explore how metadata can be viewed and updated directly in the browser.</p> <p>Once you understand this workflow, you can extend it further with features like PDF encryption, document signing, watermarking, page organization, annotations, and advanced PDF editing tools.</p> </article> </main></body></html>

freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
