Testing OpenAI Codex and Comparing It to Claude Code
OpenAI Codex doesn't have the bells and whistles that a vibe coder might desire. But as a tool for experienced developers, it shows promise.
Jun 28th, 2025 9:00am by
Photo by Ruhan Shete on Unsplash.
Starting With OpenAI Codex CLI
Let’s start up this “experimental” project (still only a GitHub page) with a very straightforward npm package:
npm i -g @openai/codex
In some respects, Codex is a bit more “down home,” as it asks you to tie your API key straight into an environment variable:
export OPENAI_API_KEY="your-api-key-here"
You can find your keys in your OpenAI account. Each key is quite long. You can also use a settings file.
To start an interactive session, just use the command codex:
It actually has a better starting summary than its agentic competition, like Claude Code, because it immediately states that it makes suggestions and seeks approval before doing anything destructive. We can also see the model, and the working directory — it doesn’t appear to want or need a context file. Typing “Exit” will let you leave.
Updating JSON Content
My task is simply to update the contents of one JSON file from another. This is based on a recent issue I had with a colleague: we weren’t sharing a JSON file on git, so we suddenly got a little out of sync.
A JSON file is just a set of key/value pairs. Some people refer to the key as the name field. In the example below, the file just holds city information (some text and an image), and there is an “id” key that allows for direct comparison. Think of the entries as data for a website.
The interesting nuance is that the “image” key is probably pointing to a real resource, so instead of updating it directly with a new image reference (that may not exist yet), I asked my colleague to create an “imageintended” key to hold image updates. Of course, navigating indistinct human descriptions is one of the challenges for LLMs. If the LLM always needs exact input, then the dreams of development democracy will evaporate, as an experienced developer will always need to be on hand.
JSON data isn’t explicitly designed to be compared in this fashion, so a little care must be taken. OK, now to set up the JSON files. I created the first, poorly written original_cities.json file in the working directory:
{
"cities": [
{
"id": "London",
"text": "London is the capital of the UK",
"image": "BigBen"
},
{
"id": "Berlin",
"text": "Great night club scene",
"image": "Brandonburg Gate",
"imageintended": "Reichstag"
},
{
"id": "Paris",
"text": "Held the Olympics of 2024",
"image": "EifelTower",
}
]
}
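The trailing comma after the Paris image value means the file above is not strictly valid JSON. As a minimal sketch of how a strict parser reacts, here is a check using Python's standard json module. The helper name is_valid_json is hypothetical, and the one-line snippet only stands in for the full file.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError as err:
        # The exact message varies between Python versions, so just report it.
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return False


# Stand-in for the Paris entry, with and without the trailing comma.
with_trailing_comma = '{"id": "Paris", "image": "EifelTower",}'
without_trailing_comma = '{"id": "Paris", "image": "EifelTower"}'

print(is_valid_json(with_trailing_comma))
print(is_valid_json(without_trailing_comma))
```

Any tool that merges these files has to either tolerate or repair that comma before it can compare entries.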
The second file holds my colleague's updated version of the same data:
{
"cities": [
{
"id": "London",
"text": "London is the capital and largest city in Great Britain",
"image": "BigBen"
},
{
"id": "Berlin",
"text": "Great night club scene but a small population",
"image": "BrandenburgGate",
"imageintended": "Reichstag"
},
{
"id": "Paris",
"text": "Held the Olympics of 2024",
"image": "NotreDame"
},
{
"id": "Rome",
"text": "The Eternal City",
"image": "TheColleseum"
}
]
}
Codex then proposed a series of sed commands:
Of course, this assumes I know what the sed command will do! Finally, it produced a patch:
And then it finished:
It had indeed modified the original_cities.json file correctly:
{
"cities": [
{
"id": "London",
"text": "London is the capital and largest city in Great Britain",
"image": "BigBen"
},
{
"id": "Berlin",
"text": "Great night club scene but a small population",
"image": "Brandonburg Gate",
"imageintended": "BrandenburgGate"
},
{
"id": "Paris",
"text": "Held the Olympics of 2024",
"image": "EifelTower",
"imageintended": "NotreDame"
},
{
"id": "Rome",
"text": "The Eternal City",
"image": "TheColleseum"
}
]
}
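The rule this result reflects can be sketched in a few lines of Python. This is only a sketch inferred from the output above, not what Codex ran (it worked through sed commands and a patch). The function name merge_cities is hypothetical, and the data is written as Python dictionaries so the trailing comma in the original file is not an issue.

```python
import copy


def merge_cities(original, updated):
    """Merge `updated` into a copy of `original`, matching cities on "id"."""
    merged = copy.deepcopy(original)
    by_id = {city["id"]: city for city in merged["cities"]}

    for new in updated["cities"]:
        old = by_id.get(new["id"])
        if old is None:
            # A city that only exists in the updated file is added unchanged.
            merged["cities"].append(copy.deepcopy(new))
            continue

        # Copy every field except "id" and "image" straight across.
        for key, value in new.items():
            if key not in ("id", "image"):
                old[key] = value

        # Never overwrite a live "image"; park a changed value in "imageintended".
        if "image" in new and new["image"] != old.get("image"):
            old["imageintended"] = new["image"]

    return merged


original = {"cities": [
    {"id": "Paris", "text": "Held the Olympics of 2024", "image": "EifelTower"},
]}
updated = {"cities": [
    {"id": "Paris", "text": "Held the Olympics of 2024", "image": "NotreDame"},
]}
print(merge_cities(original, updated))
```

Under this reading, Berlin's “imageintended” ends up as BrandenburgGate rather than Reichstag because the changed “image” value takes priority over the value copied from the updated file.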
Conclusion
Claude Code actually made a bit of a mess of this, but Codex not only did the merge well, it also provided a very good set of notes about why it did what it did. It fixed up the minor format error, wasn’t confused by the English spelling errors, and understood the need to preserve the image reference. It clearly understood the changes it was making; the notes make subtle reference to their purpose.
During the thinking process, it asked for permission plenty of times, as we knew it would. Unlike Claude Code, it did not exactly create a transparent plan to follow. It just output a heap of sed commands, but these clearly proved good enough when executed. This makes it much less controllable from a vibe coding point of view, because someone inexperienced with code needs to know what will be happening.
I wonder whether this is the prototype for a bigger product, or whether OpenAI will cede ground for the moment to Anthropic, Google and Warp — honing its experiment before coming out with its own view of the perfect agentic experience.