Integrating the OpenAI Moderation Model in Spring AI

Omozegie Aziegbe | September 29th, 2025 | Last Updated: September 29th, 2025

Applications that handle user input, such as forums, chatbots, or social platforms, need to protect users from unsafe or harmful content. OpenAI’s Moderation model helps by classifying text against problematic categories, including hate speech, harassment, self-harm, and violence. In this article, we will build a Spring Boot application that integrates OpenAI’s moderation model using Spring AI.

1. Project Setup

First, we need to set up a Spring Boot project that uses Spring AI. We’ll use Maven as the build tool, but you can easily adapt the setup to Gradle if you prefer.

		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-model-openai</artifactId>
		</dependency>

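Adding this dependency to the pom.xml brings in Spring AI's OpenAI integration through the spring-ai-starter-model-openai starter. The snippet declares no version, so it assumes versions are managed elsewhere in the build, for example by importing the Spring AI BOM in the dependencyManagement section.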

2. Application Configuration

We configure the application with an API key and define which moderation model to use.

spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.moderation.options.model=omni-moderation-latest

This configuration sets the OpenAI API key from an environment variable (OPENAI_API_KEY) and specifies omni-moderation-latest as the moderation model.
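Since the key comes from the environment, it can help to check for the variable explicitly and fail with a clear message before anything else runs. The following is a minimal sketch in plain Java, not something Spring AI provides; the class name ApiKeyCheck and its method are hypothetical and exist only for illustration.

```java
import java.util.Map;

public class ApiKeyCheck {

    // Returns the trimmed key, or throws if the variable is missing or blank.
    // The environment is passed in as a map so the check is easy to test;
    // an application would pass System.getenv().
    public static String requireApiKey(Map<String, String> env) {
        String key = env.get("OPENAI_API_KEY");
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("OPENAI_API_KEY is not set");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        try {
            requireApiKey(System.getenv());
            System.out.println("OPENAI_API_KEY is set");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this could run from the application's main method before SpringApplication.run is called, so a misconfigured environment is reported in one readable line.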

3. Service Layer

The service layer encapsulates the moderation logic by handling calls to the moderation model, interpreting results, and providing a clear summary of detected violations.

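import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.moderation.Categories;
import org.springframework.ai.moderation.ModerationPrompt;
import org.springframework.ai.moderation.ModerationResponse;
import org.springframework.ai.moderation.ModerationResult;
import org.springframework.ai.openai.OpenAiModerationModel;
import org.springframework.stereotype.Service;

@Service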
public class ModerationService {

    private final OpenAiModerationModel moderationModel;

    public ModerationService(OpenAiModerationModel moderationModel) {
        this.moderationModel = moderationModel;
    }

    public String analyzeContent(String content) {
        ModerationPrompt prompt = new ModerationPrompt(content);
        ModerationResponse moderationResponse = moderationModel.call(prompt);

        return moderationResponse.getResult().getOutput().getResults().stream()
                .map(this::summarizeViolations)
                .collect(Collectors.joining("\n"));
    }

    private String summarizeViolations(ModerationResult result) {
        Categories categories = result.getCategories();
        List<String> violations = new ArrayList<>();

        if (categories.isLaw()) {
            violations.add("Law");
        }
        if (categories.isFinancial()) {
            violations.add("Financial");
        }
        if (categories.isPii()) {
            violations.add("Personally Identifiable Information/PII");
        }
        if (categories.isSexual()) {
            violations.add("Sexual");
        }
        if (categories.isHate()) {
            violations.add("Hate");
        }
        if (categories.isHarassment()) {
            violations.add("Harassment");
        }
        if (categories.isSelfHarm()) {
            violations.add("Self-Harm");
        }
        if (categories.isSexualMinors()) {
            violations.add("Sexual/Minors");
        }
        if (categories.isHateThreatening()) {
            violations.add("Hate/Threatening");
        }
        if (categories.isViolenceGraphic()) {
            violations.add("Violence/Graphic");
        }
        if (categories.isSelfHarmIntent()) {
            violations.add("Self-Harm/Intent");
        }
        if (categories.isSelfHarmInstructions()) {
            violations.add("Self-Harm/Instructions");
        }
        if (categories.isHarassmentThreatening()) {
            violations.add("Harassment/Threatening");
        }
        if (categories.isViolence()) {
            violations.add("Violence");
        }

        return violations.isEmpty()
                ? "No category violations detected."
                : "Violated categories: " + String.join("; ", violations);
    }
}

The ModerationService class integrates with the OpenAI Moderation Model to analyze user content for safety violations. In the analyzeContent method, the input text is first wrapped in a ModerationPrompt object, which serves as the structured request passed to the moderation API. The model processes this prompt and produces a ModerationResponse, which contains detailed results about the categories that may have been triggered by the content.

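The summarizeViolations method then inspects the Categories object within each ModerationResult. For every flagged category, the method adds a descriptive label to an ArrayList. Finally, it returns either a message confirming no violations or a formatted string listing all the categories that were violated. Note that Categories is a provider-neutral Spring AI type: the law, financial, and PII flags are checked here for completeness, but they are not among the categories OpenAI's moderation models report, so they are not expected to fire with this configuration.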
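The same summarizing logic can also be written in a table-driven form, which keeps each label next to its check. The sketch below is self-contained and uses a hypothetical Flags record as a stand-in for Spring AI's Categories, reduced to three flags; it illustrates the idea and is not a drop-in replacement for the service above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ViolationSummary {

    // Hypothetical stand-in for Spring AI's Categories, reduced to three flags.
    public record Flags(boolean hate, boolean harassment, boolean violence) {}

    // Label -> check, kept in insertion order so the output order is stable.
    private static final Map<String, Predicate<Flags>> CHECKS = new LinkedHashMap<>();
    static {
        CHECKS.put("Hate", Flags::hate);
        CHECKS.put("Harassment", Flags::harassment);
        CHECKS.put("Violence", Flags::violence);
    }

    // Collects the labels whose check passes and formats them
    // the same way summarizeViolations does.
    public static String summarize(Flags flags) {
        List<String> violations = new ArrayList<>();
        CHECKS.forEach((label, check) -> {
            if (check.test(flags)) {
                violations.add(label);
            }
        });
        return violations.isEmpty()
                ? "No category violations detected."
                : "Violated categories: " + String.join("; ", violations);
    }

    public static void main(String[] args) {
        System.out.println(summarize(new Flags(false, true, true)));
        System.out.println(summarize(new Flags(false, false, false)));
    }
}
```

Running main prints "Violated categories: Harassment; Violence" followed by "No category violations detected.". With the real Categories type, each entry would reference the matching getter, such as Categories::isHate.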

REST Controller

We expose a REST endpoint that receives text and responds with moderation results.

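import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController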
@RequestMapping("/api/moderation")
public class ModerationController {

    private final ModerationService moderationService;

    public ModerationController(ModerationService moderationService) {
        this.moderationService = moderationService;
    }

    @PostMapping
    public ResponseEntity<String> moderate(@RequestBody String input) {
        String result = moderationService.analyzeContent(input);
        return ResponseEntity.ok(result);
    }
}

This controller defines a REST API for moderating text input. When a POST request is made to /api/moderation, the provided text is passed to the ModerationService, which analyzes it and returns detected category violations. The result is wrapped in a ResponseEntity and sent back to the client as the response.
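The controller forwards whatever body it receives, so blank or oversized input would still trigger a call to the moderation API. One option is a small guard that runs before the service is invoked. The sketch below is plain Java; the InputGuard class and the 10,000-character limit are hypothetical and chosen only for illustration.

```java
import java.util.Optional;

public class InputGuard {

    // Hypothetical limit, chosen only for illustration.
    static final int MAX_CHARS = 10_000;

    // Returns an error message for input that should be rejected,
    // or an empty Optional when the text can be sent for moderation.
    public static Optional<String> validate(String input) {
        if (input == null || input.isBlank()) {
            return Optional.of("Request body must not be empty.");
        }
        if (input.length() > MAX_CHARS) {
            return Optional.of("Request body exceeds " + MAX_CHARS + " characters.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(validate("   ").orElse("ok"));
        System.out.println(validate("Hello there").orElse("ok"));
    }
}
```

In the controller, a non-empty result could be returned with ResponseEntity.badRequest().body(message) instead of calling analyzeContent.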

4. Running and Testing the Application

After completing the configuration, start the application using mvn spring-boot:run, and proceed to test moderation with various inputs.

Violence

curl -X POST http://localhost:8080/api/moderation \
     -H "Content-Type: text/plain" \
     -d "I want to hurt someone badly."

Harassment and Hate

curl -X POST http://localhost:8080/api/moderation \
     -H "Content-Type: text/plain" \
     -d "You are worthless and I hate your existence."
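The same requests can be sent from Java with the JDK's built-in HTTP client, which is convenient for scripted checks. This is a minimal sketch that assumes the application is running locally on port 8080, as in the curl commands; the ModerationClient class name is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ModerationClient {

    // Builds the same request as the curl commands: a POST with a text/plain body.
    public static HttpRequest buildRequest(String baseUrl, String text) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/moderation"))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(text))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the application is running locally on port 8080.
        HttpRequest request = buildRequest("http://localhost:8080", "I want to hurt someone badly.");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

The response body is the summary string produced by ModerationService; which categories appear depends on what the moderation model returns for the given text.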

5. Conclusion

In this article, we demonstrated how to integrate the OpenAI Moderation Model into a Spring Boot application using Spring AI. We covered setting up dependencies, configuring the application, implementing the service layer for handling moderation logic, and exposing a REST endpoint to analyze user input. With this setup, you can add a content safety layer to your Spring applications that helps detect harmful or unsafe content before it is processed or displayed.