ChatGPT Moderation API: Input/Output Control
Using OpenAI’s Moderation Endpoint for Responsible AI
Large Language Models (LLMs) have undoubtedly transformed the way we interact with technology. ChatGPT, one of the most prominent LLMs, has proven to be an invaluable tool, giving users access to a vast range of information and helpful responses. However, like any technology, ChatGPT has its limitations.
Recent discussions have brought to light an important concern: ChatGPT can generate inappropriate or biased responses. This issue stems from its training data, which comprises the collective writings of individuals across diverse backgrounds and eras. While this diversity enriches the model’s understanding, it also carries the biases and prejudices prevalent in the real world.
As a result, some responses generated by ChatGPT may reflect these biases. To be fair, though, inappropriate responses can also be triggered by inappropriate user queries.
In this article, we will explore why it is important to actively moderate both the model’s inputs and its outputs when building LLM-powered applications. To do so, we will use OpenAI’s Moderation API, which helps identify inappropriate content so that the application can act on it.
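To make the idea concrete, here is a minimal sketch of what moderating both sides of a conversation could look like. It assumes the `openai` Python package (version 1 or later) and an `OPENAI_API_KEY` environment variable. The names `moderated_reply`, `openai_is_flagged`, and `REFUSAL` are hypothetical helpers chosen for illustration, not part of any library, and returning a fixed refusal is only one possible policy.

```python
from typing import Callable

# Hypothetical fallback message returned whenever content is flagged.
REFUSAL = "Sorry, I can't help with that request."


def openai_is_flagged(text: str) -> bool:
    """Return True if OpenAI's moderation endpoint flags `text`.

    Assumption: the `openai` package (v1+) is installed and
    OPENAI_API_KEY is set in the environment.
    """
    from openai import OpenAI  # imported lazily so the rest runs without it

    client = OpenAI()
    response = client.moderations.create(input=text)
    return response.results[0].flagged


def moderated_reply(
    user_input: str,
    generate: Callable[[str], str],
    is_flagged: Callable[[str], bool] = openai_is_flagged,
) -> str:
    """Check the user input, generate an answer, then check the answer."""
    # Input control: never send a flagged query to the model.
    if is_flagged(user_input):
        return REFUSAL

    answer = generate(user_input)

    # Output control: never show a flagged answer to the user.
    if is_flagged(answer):
        return REFUSAL

    return answer
```

Here `generate` stands in for whatever function calls the chat model. Passing the moderation check in as a parameter keeps the control flow easy to try out with a stub before wiring in the real endpoint.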

