
Java Flight Recorder and Mission Control: Profiling Production JVMs

Diagnosing Memory Leaks and CPU Hotspots Without Losing Sleep

Performance problems are inevitable.
Whether it’s a sudden CPU spike in the middle of the night or a creeping memory leak that bloats your heap over days, diagnosing production JVM issues can feel like defusing a bomb—especially when you lack visibility into what’s going on under the hood.

Enter Java Flight Recorder (JFR) and JDK Mission Control (JMC)—two essential tools for observing, diagnosing, and fixing JVM performance issues in production environments with minimal overhead.

This article explores how to use these tools to identify memory leaks and CPU hotspots in real-world Java applications, with opinions, links, and hands-on examples to guide you through the process.

Why JFR and Mission Control?

In the past, Java developers had to rely on heavyweight profilers like YourKit or VisualVM, which are great for local development but often too intrusive for production use.
Traditional profilers can:

  • Slow down your application
  • Require special JVM arguments
  • Be difficult to use on cloud-deployed systems

Java Flight Recorder (JFR) changes that. It’s a low-overhead event recorder built directly into the JVM, specifically designed to run in production. And when combined with JDK Mission Control (JMC)—an advanced visualization and analysis tool—you can safely analyze your app’s behavior without disrupting its performance.

Think of it as Java’s equivalent of a black box flight recorder in aviation—but for your JVM.

Quick Overview: What Are JFR and JMC?

Tool | Purpose
Java Flight Recorder (JFR) | Captures JVM and application events (memory allocations, CPU sampling, thread states, GC, etc.) with minimal overhead (typically <1-2%).
JDK Mission Control (JMC) | Provides a GUI to analyze JFR recordings. Visualizes thread dumps, flame graphs, memory usage, class loading, and more.


How Does It Work?

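At its core, JFR records events. Some are emitted when something happens (a GC pause, a blocked thread), while others, such as CPU stack samples, are taken at configurable intervals. The recorded data includes: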

  • CPU usage (stack traces, method calls, etc.)
  • Object allocations (heap analysis)
  • Garbage Collection events
  • Thread activity and lock contention
  • File and socket I/O
  • Exceptions and event logs

All of this data is stored in a compact .jfr binary file, which you can later analyze using JDK Mission Control.
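A .jfr file can also be read from plain Java, which helps make the "it is just a stream of events" idea concrete. The following is a minimal sketch, assuming JDK 11 or later with the jdk.jfr module available. The class name JfrFileSummary and the helper recordSample() are made up for this illustration; the sample recording exists only so the reader has a file to inspect.

```java
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class for this article; not part of JFR or JMC.
final class JfrFileSummary {

    /** Counts how many events of each type a .jfr file contains. */
    static Map<String, Long> countEventsByType(Path jfrFile) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        // RecordingFile reads events one at a time, so the whole file is not held in memory.
        try (RecordingFile file = new RecordingFile(jfrFile)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    /** Produces a tiny sample recording so there is something to read (illustrative only). */
    static Path recordSample() throws IOException {
        Path out = Files.createTempDirectory("jfr-sample").resolve("sample.jfr");
        try (Recording recording = new Recording()) {
            recording.enable("jdk.GarbageCollection");
            recording.enable("jdk.CPULoad");
            recording.start();
            System.gc(); // only a hint; a GC event may or may not be recorded
            recording.stop();
            recording.dump(out); // write the recorded data to the .jfr file
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Pass a path to an existing recording, or let the sketch create a small one.
        Path file = args.length > 0 ? Path.of(args[0]) : recordSample();
        countEventsByType(file).forEach((type, n) -> System.out.println(type + " = " + n));
    }
}
```

Running it against a real production recording prints one line per event type, which is a quick way to see what a given template actually captured before opening JMC.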

Diagnosing CPU Hotspots in Production

Let’s say you’ve deployed a Java application, and users are reporting sluggish response times. Your cloud provider dashboard shows sustained high CPU usage, but you don’t know why.

Step 1: Start Java Flight Recorder

If you’re running Java 11+, JFR is already part of your JVM.
You can start a recording dynamically without restarting the app:

jcmd <pid> JFR.start name=cpu_profile duration=5m filename=cpu-profile.jfr

Replace <pid> with your Java process ID.
This will capture 5 minutes of JVM activity.

Alternatively, you can start JFR at JVM startup:

java -XX:StartFlightRecording=duration=5m,filename=cpu-profile.jfr -jar your-app.jar
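If neither jcmd nor startup flags are convenient (for example, when you want a recording triggered by your own admin endpoint), the jdk.jfr API offers a roughly equivalent route. This is a sketch under assumptions: the class ProgrammaticRecording and its start() method are names invented here, and "default" refers to the built-in template that ships with the JDK.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;

// Hypothetical wrapper for this article; only the jdk.jfr calls are real API.
final class ProgrammaticRecording {

    /** Starts a time-limited recording that is written to 'destination' when it stops. */
    static Recording start(String template, Duration duration, Path destination)
            throws IOException, ParseException {
        // Loads a built-in template such as "default" or "profile".
        Configuration config = Configuration.getConfiguration(template);
        Recording recording = new Recording(config);
        recording.setName("cpu_profile");
        recording.setDuration(duration);        // JFR stops the recording after this time
        recording.setDestination(destination);  // data is written here on stop
        recording.start();
        return recording;
    }

    public static void main(String[] args) throws Exception {
        Path out = Path.of("cpu-profile.jfr");
        Recording recording = start("default", Duration.ofMinutes(5), out);
        System.out.println("Started " + recording.getName() + ", writing to " + out);
        // A real application would keep running here and let the duration elapse.
        recording.stop(); // stopped early so this sketch terminates immediately
    }
}
```

The parameters mirror the jcmd example above: a name, a five-minute duration, and a destination file.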

Step 2: Open the Recording in JDK Mission Control

Once the .jfr file is generated, open JMC:

jmc

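Use the “Flame View” to visualize CPU usage. Width is what matters: the wider a frame, the more samples included that method, and so the more CPU time it accounts for. Height only shows how deep the call stack is.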

Example Screenshot (imagine this in JMC):

Application.run()                    ##########
└─ processUserRequest()              #######
   └─ calculateRecommendation()      ####
      └─ BigDecimal.add()            ###

In this case, BigDecimal.add() might indicate hot code in financial calculations that needs optimization (e.g., removing redundant BigDecimal operations, or switching to primitive doubles only where rounding error is acceptable).
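The same ranking can be approximated without the GUI by counting which method sits on top of the stack in each CPU sample. Here is a rough sketch, assuming the recording contains jdk.ExecutionSample events (the event JFR uses for Java method sampling). HotMethods, topFrames() and rank() are illustrative names, not JMC functionality.

```java
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for this article: a text-only "hot methods" view.
final class HotMethods {

    /** Returns the top-of-stack method of every jdk.ExecutionSample event in the file. */
    static List<String> topFrames(Path jfrFile) throws IOException {
        List<String> frames = new ArrayList<>();
        try (RecordingFile file = new RecordingFile(jfrFile)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                if (!"jdk.ExecutionSample".equals(event.getEventType().getName())) {
                    continue;
                }
                RecordedStackTrace stack = event.getStackTrace();
                if (stack == null || stack.getFrames().isEmpty()) {
                    continue;
                }
                RecordedFrame top = stack.getFrames().get(0); // frame 0 is the top of the stack
                frames.add(top.getMethod().getType().getName() + "." + top.getMethod().getName());
            }
        }
        return frames;
    }

    /** Ranks methods by sample count, highest first; ties are broken alphabetically. */
    static List<Map.Entry<String, Long>> rank(List<String> frames, int limit) {
        Map<String, Long> counts = new HashMap<>();
        for (String frame : frames) {
            counts.merge(frame, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HotMethods.java <recording.jfr>");
            return;
        }
        rank(topFrames(Path.of(args[0])), 10)
                .forEach(e -> System.out.println(e.getValue() + " samples  " + e.getKey()));
    }
}
```

Counting only the top frame gives "self time"; the flame view additionally shows which callers led there, so treat this as a quick triage aid rather than a replacement.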

Diagnosing Memory Leaks with JFR

Memory leaks are trickier. Java has garbage collection, but leaks still happen—often due to:

  • Static references
  • Poor cache management
  • Unclosed resources
  • Listener leaks

Step 1: Record Object Allocations

Start a recording that focuses on allocations:

jcmd <pid> JFR.start name=mem_leak_check settings=profile filename=memory-profile.jfr

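Here, settings=profile selects the built-in “profile” template, which captures more detailed allocation data than the default template at a somewhat higher overhead. Because no duration is given, the recording runs until you stop it, for example with jcmd <pid> JFR.stop name=mem_leak_check, at which point the data is written to memory-profile.jfr.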

Step 2: Analyze in JMC

In Mission Control:

  • Go to “Memory” → “Object Allocation”
  • Sort by allocation pressure
  • Identify classes that are allocating unexpectedly large amounts of memory
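The "sort by allocation pressure" step can be sketched in code as well. This example rests on a few assumptions that are worth checking against your JDK version: allocation events have names starting with jdk.ObjectAllocation, they carry an objectClass field, and the size is in a weight field (jdk.ObjectAllocationSample, newer JDKs) or an allocationSize field (the TLAB events). AllocationByClass and its methods are hypothetical names.

```java
import jdk.jfr.consumer.RecordedClass;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for this article: a rough allocation-pressure ranking.
final class AllocationByClass {

    /** Sums the (sampled) allocated bytes per class from allocation events in the file. */
    static Map<String, Long> bytesByClass(Path jfrFile) throws IOException {
        Map<String, Long> totals = new HashMap<>();
        try (RecordingFile file = new RecordingFile(jfrFile)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                // Assumption: allocation events share this name prefix and field layout.
                if (!event.getEventType().getName().startsWith("jdk.ObjectAllocation")
                        || !event.hasField("objectClass")) {
                    continue;
                }
                RecordedClass type = event.getClass("objectClass");
                if (type == null) {
                    continue;
                }
                long bytes = event.hasField("weight") ? event.getLong("weight")
                        : event.hasField("allocationSize") ? event.getLong("allocationSize") : 0L;
                totals.merge(type.getName(), bytes, Long::sum);
            }
        }
        return totals;
    }

    /** Formats the biggest allocators as "className NN%" lines, largest share first. */
    static List<String> rankByShare(Map<String, Long> totals, int limit) {
        long all = totals.values().stream().mapToLong(Long::longValue).sum();
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(totals.entrySet());
        sorted.sort((a, b) -> {
            int bySize = Long.compare(b.getValue(), a.getValue());
            return bySize != 0 ? bySize : a.getKey().compareTo(b.getKey());
        });
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : sorted.subList(0, Math.min(limit, sorted.size()))) {
            long percent = all == 0 ? 0 : Math.round(100.0 * e.getValue() / all);
            lines.add(e.getKey() + " " + percent + "%");
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java AllocationByClass.java <recording.jfr>");
            return;
        }
        rankByShare(bytesByClass(Path.of(args[0])), 10).forEach(System.out::println);
    }
}
```

Because these events are sampled, the percentages are estimates of allocation pressure, not exact heap contents.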

Example:

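Suppose you find that com.example.ImageCache is responsible for 70% of heap allocations, but you intended to cache only 10 images.
This suggests a leaking or unbounded cache. Allocation volume alone does not prove a leak, so also check that heap usage after each GC keeps climbing. The usual fix is to bound the cache size with something like Caffeine, or to hold entries through WeakReference or SoftReference so the GC can reclaim them.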
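To show what "limiting cache size" means in practice, here is a deliberately simple stand-in built on java.util.LinkedHashMap rather than Caffeine (which is an external library with a much richer feature set). BoundedCache is a name invented for this sketch, and the limit of 10 mirrors the ImageCache example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache for illustration; a stand-in for a real caching library.
// Not thread-safe: wrap it or use a concurrent cache in multi-threaded code.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insert; returning true evicts the least recently used entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, byte[]> images = new BoundedCache<>(10);
        for (int i = 0; i < 1_000; i++) {
            images.put("image-" + i, new byte[1024]);
        }
        // The cache never holds more than the configured number of entries.
        System.out.println("entries kept: " + images.size());
    }
}
```

With a hard upper bound, old entries become unreachable and collectable, so the heap stops growing with the number of distinct images requested.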

Real-World Opinions: Why Use JFR in Production?

Pros:

  • Low Overhead: Sampling-based, optimized for prod use.
  • Rich Data: Captures JVM internals and custom app events.
  • No Restart Required: Dynamic recording possible.
  • Integrates with Docker & Kubernetes: Easy to automate in modern deployments.

Cons / Caveats:

  • Learning Curve: JMC is powerful but can feel overwhelming at first.
  • Limited UI customization: You might need additional tools for specific visualizations.
  • Java 8 Licensing: On Oracle JDK 8, JFR is a commercial feature. It is free in OpenJDK 11+ and was also backported to OpenJDK 8 builds from 8u262 onward.

Advanced Tip: Custom Events with JFR

Did you know you can create your own JFR events in code?

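import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

// Custom JFR event: one instance is committed per user login.
@Name("com.example.UserLogin")   // stable event name (illustrative)
@Label("User Login")             // human-readable name shown in JMC
public class UserLoginEvent extends Event {
    @Label("Username")
    String username;             // recorded as a field of the event

    public UserLoginEvent(String username) {
        this.username = username;
    }
}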

Then trigger it:

UserLoginEvent event = new UserLoginEvent("alice");
event.commit();

This allows you to track domain-specific metrics alongside JVM events!
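To convince yourself that a custom event really ends up in the recording, you can record it and read it back in the same process. The round trip below is a sketch, assuming JDK 11+; CustomEventRoundTrip, the nested UserLogin class and the event name demo.UserLogin are all made up for the example. It also uses begin() so the event carries a duration.

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for this article.
final class CustomEventRoundTrip {

    @Name("demo.UserLogin")   // illustrative event name
    @Label("User Login")
    static class UserLogin extends Event {
        @Label("Username")
        String username;
    }

    /** Commits one event per username, then reads the usernames back from the .jfr file. */
    static List<String> recordLogins(String... usernames) throws IOException {
        Path out = Files.createTempDirectory("jfr-demo").resolve("logins.jfr");
        try (Recording recording = new Recording()) {
            recording.enable(UserLogin.class);
            recording.start();
            for (String name : usernames) {
                UserLogin event = new UserLogin();
                event.begin();          // start timing the operation
                event.username = name;  // real code would perform the login here
                event.commit();         // ends timing and writes the event if it is enabled
            }
            recording.stop();
            recording.dump(out);
        }
        List<String> seen = new ArrayList<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(out)) {
            if ("demo.UserLogin".equals(event.getEventType().getName())) {
                seen.add(event.getString("username"));
            }
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(recordLogins("alice", "bob"));
    }
}
```

One design note: event fields end up in recordings that may be shared during incident analysis, so think twice before recording personal data such as real usernames.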

When Should You Use JFR & JMC?

Scenario | Use JFR & JMC?
Local Development Profiling | ✅ Useful, but tools like VisualVM might be easier.
Staging & Load Testing | ✅ Absolutely: find issues before prod.
Production Monitoring | ✅ Yes, especially for incident diagnostics.


Final Thoughts

JFR and Mission Control won’t magically fix your code, but they will tell you exactly where the pain points are.
In an era of microservices and cloud-native deployments, these tools are invaluable for diagnosing tricky production issues without sacrificing uptime.

If you’ve ever found yourself staring at a spiking CPU graph at 2 AM, wondering “What the heck is my JVM doing right now?”, JFR is the answer.

Eleftheria Drosopoulou

Eleftheria is an experienced business analyst with a robust background in the computer software industry. Proficient in computer software training, digital marketing, HTML scripting, and Microsoft Office, she brings a wealth of technical skills to the table. She also loves writing articles on various tech subjects, showcasing a talent for translating complex concepts into accessible content.