Developer’s Guide to the Built-In Tools of OpenAI Agents SDK

We take a deeper look at the built-in developer tools in the OpenAI SDK, including web search, file search, and computer use.

Mar 26th, 2025 9:12am by David Eastman

In my previous post, we looked at using the new OpenAI Responses API with the Agents SDK. In this post, I’ll look more closely at the built-in tools featured — and show how they can be used. I’ll keep the scripts small, as the point is to let the SDK do the heavy lifting, although (fair warning) that turned out to be only possible for two of the three tools.

Web Search

In the previous post, we set up our Python stack with OpenAI, so I’ll start with the initial example query for the first built-in tool: web search. This uses the new Responses API. After requesting a model, we can add a list of tools that we want the request to use. I add the “tool_choice” line to force the request to use web search — just placing it in the tools list simply makes it available for the request if it wishes to use it. I’ll write the script into a local file web_search.py.

from openai import OpenAI 
client = OpenAI() 

response = client.responses.create( 
    model="gpt-4o", 
    tools=[{"type": "web_search_preview"}], 
    tool_choice={"type": "web_search_preview"}, 
    input="What Kubernetes news story appeared today?" 
) 

print(response.output_text)

The script only returns the output text:

As usual, I’m using the Warp terminal, so you can see the time the request took to complete at the top. I put $10 into my OpenAI account, and I just used 4 cents:

Even by the end of this deep dive, I used no more than about 20 cents. But for more extended use on the more expensive models, you do need to keep an eye on expenditure. Now, let’s look at the structure of the response a bit more closely. I changed the last line in the web_search.py script to focus on the output metadata:

…
print (response.output)

This produces the following response output, with the full response text curtailed for space:

[
    ResponseFunctionWebSearch(
        id='ws_67e15b3302708190a24bd568fc6cd24a0eb6c2c0a0810e13', 
        status='completed', 
        type='web_search_call'
    ), 
    ResponseOutputMessage(
        id='msg_67e15b34af00819081c0bd65e9fb92500eb6c2c0a0810e13', 
        content=[
            ResponseOutputText(
                annotations=[
                    AnnotationURLCitation(
                        end_index=314, 
                        start_index=243, 
                        title='Kubernetes > News > Page #1 - InfoQ', 
                        type='url_citation', 
                        url='https://www.infoq.com/Kubernetes/news/?utm_source=openai'
                    )
                ],
                text='As of March 24, 2025, there are..', 
                type='output_text'
            )
        ], 
        role='assistant', 
        status='completed', 
        type='message'
    )
]
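The annotations in the output above can be collected with a short loop. What follows is a minimal sketch rather than an official helper: collect_url_citations is a hypothetical name, and the SimpleNamespace objects are stand-ins that only mimic the attribute names printed above, so the example runs without an API call. I am also assuming that start_index and end_index are character offsets into the text.

```python
from types import SimpleNamespace


def collect_url_citations(output):
    """Return (title, url, cited_text) for each url_citation annotation."""
    citations = []
    for item in output:
        # Skip non-message items such as the web_search_call entry.
        if getattr(item, "type", None) != "message":
            continue
        for part in item.content:
            if getattr(part, "type", None) != "output_text":
                continue
            for ann in part.annotations:
                if ann.type == "url_citation":
                    # Assumption: the indices are character offsets into part.text.
                    cited = part.text[ann.start_index:ann.end_index]
                    citations.append((ann.title, ann.url, cited))
    return citations


# Stand-in data shaped like the printed response.output (not a real response).
text = "See the roundup (infoq.com) for details."
start = text.index("(infoq.com)")
demo_output = [
    SimpleNamespace(type="web_search_call", status="completed"),
    SimpleNamespace(
        type="message",
        content=[
            SimpleNamespace(
                type="output_text",
                text=text,
                annotations=[
                    SimpleNamespace(
                        type="url_citation",
                        title="Kubernetes > News > Page #1 - InfoQ",
                        url="https://www.infoq.com/Kubernetes/news/?utm_source=openai",
                        start_index=start,
                        end_index=start + len("(infoq.com)"),
                    )
                ],
            )
        ],
    ),
]

for title, url, cited in collect_url_citations(demo_output):
    print(f"{title} -> {url} [{cited}]")
```

With a real response, passing response.output in place of demo_output should behave the same way, provided the returned objects expose these attributes.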

We can see the list of annotations, which in this case is just the one included in the response.output_text. Now, let’s look at another aspect of the response by again changing the last line of web_search.py:

…
print (response.tools)

Note the default values for location within the web_search:

[
    WebSearchTool(
        type='web_search_preview', 
        search_context_size='medium', 
        user_location=UserLocation(
            type='approximate', 
            city=None, 
            country='US', 
            region=None, 
            timezone=None
        )
    )
]

The assumption is that the user is based in the US. Let’s get more local news for me. I know that KubeCon will be in London shortly, so I added the following to the request:

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4o",
    tool_choice={"type": "web_search_preview"},
    tools=[ {
        "type": "web_search_preview",
        "user_location":{
            "type": "approximate",
            "country": "GB",
            "city": "London",
            "region": "London",
        }
    }],
    input="What news about KubeCon appeared this week?"
)

print(response.output_text)

We got a little back that was relevant, with an inbuilt citation: “Looking ahead, KubeCon + CloudNativeCon Europe 2025 is scheduled to take place in London from April 1-4, 2025. This event will bring together adopters and technologists from leading open-source and cloud-native communities for four days of collaboration, learning, and innovation. ([events.linuxfoundation.org](https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/?utm_source=openai))”
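If you switch locations often, the tool entry can be built by a small helper. This is only a sketch: build_web_search_tool is a hypothetical name, the field names are taken from the request and the printed WebSearchTool above, and any search_context_size value other than the 'medium' default shown earlier is an assumption on my part.

```python
def build_web_search_tool(country, city=None, region=None, context_size=None):
    """Build a web_search_preview tool entry with an approximate location."""
    location = {"type": "approximate", "country": country}
    # Only include optional fields that were actually supplied.
    if city:
        location["city"] = city
    if region:
        location["region"] = region
    tool = {"type": "web_search_preview", "user_location": location}
    if context_size:
        # Assumption: other sizes exist besides the 'medium' default shown above.
        tool["search_context_size"] = context_size
    return tool


london_tool = build_web_search_tool("GB", city="London", region="London")
print(london_tool)
```

The returned dictionary can then go straight into the tools list of the request, in place of the hand-written entry.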

File Search

As I mentioned in the previous post, before using file search we need to have set up a knowledge base (i.e. a vector store) and uploaded files to it. The vector store is hosted by OpenAI, so note the limitations. I’ll store their example file, “deep_research_blog.pdf”:

import requests
from io import BytesIO
from openai import OpenAI

client = OpenAI()

def create_file(client, file_path):

    # Download the file content from the URL
    response = requests.get(file_path)
    file_content = BytesIO(response.content)
    file_name = file_path.split("/")[-1]
    file_tuple = (file_name, file_content)
    result = client.files.create(
        file=file_tuple,
        purpose="assistants"
    )
    print(result.id)
    return result.id

file_id = create_file(client, "https://cdn.openai.com/API/docs/deep_research_blog.pdf")

# Create vector store
vector_store = client.vector_stores.create(
    name="knowledge_base"
)
print(vector_store.id)

# Add file to store
client.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id=file_id
)
print(vector_store)

This script took a little longer to write, as it involves file manipulation. We run it to check that we have at least created a valid vector store:

To check if the file is ready, we can just use the following script later on with the given VectorStore ID:

from openai import OpenAI

client = OpenAI()

result = client.vector_stores.files.list(
    vector_store_id='vs_67e1a50090c481918ff6747405d62249'
)
print(result)

For me, this worked fine a few minutes later:
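Rather than re-running the listing by hand, the check can be wrapped in a polling loop. This is a sketch under stated assumptions: wait_until_ready is a hypothetical helper, the status strings 'in_progress' and 'completed' are my assumption about what the listing reports, and fetch_status is a callable you supply yourself (for example, one that reads the status field from the listing above). The demo uses a fake status source so that it runs offline.

```python
import time


def wait_until_ready(fetch_status, timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Poll fetch_status() until it stops returning 'in_progress' or time runs out.

    fetch_status is any zero-argument callable returning a status string.
    Returns the last status seen.
    """
    deadline = time.monotonic() + timeout_s
    status = fetch_status()
    while status == "in_progress" and time.monotonic() < deadline:
        sleep(interval_s)
        status = fetch_status()
    return status


# Demo with a fake status source (assumed status strings, no network access).
fake_statuses = iter(["in_progress", "in_progress", "completed"])
final = wait_until_ready(lambda: next(fake_statuses), sleep=lambda s: None)
print(final)
```

Passing the sleep function as a parameter is a design choice that keeps the loop easy to test; in normal use the default time.sleep applies.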

OK, so we have put a file up into the vector store. Now we can search it using the vector store ID (you can only access one store right now). I’ll use the example query given by OpenAI, as I did put their document in the store. They use a different model from the previous examples:

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input="What is deep research by OpenAI?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_67e1a50090c481918ff6747405d62249"]
    }],
    tool_choice={"type": "file_search"},
)
print(response)

The response (after about 17 seconds) was comprehensive and in a similar form to the web_search. The output message part looks like this:

ResponseOutputMessage(
    id='msg_67e2b7d5cd40819185ba3378c3e550f20d45cba81f4e1d6d', 
    content=[
        ResponseOutputText(
            annotations=[
                AnnotationFileCitation(
                    file_id='file-NM9USSrUDDQaVHD1jkMoWv', 
                    index=1715, type='file_citation', 
                    filename='deep_research_blog.pdf'
                ), 
                AnnotationFileCitation(
                    file_id='file-NM9USSrUDDQaVHD1jkMoWv', 
                    index=1715, type='file_citation', 
                    filename='deep_research_blog.pdf'
                )
            ], 
            text="Deep Research by OpenAI is a new capability..", 
            type='output_text'
        )
    ],
    role='assistant', 
    status='completed', 
    type='message'
)

The actual text is lengthy. Note that there are two identical citations, but I saw no explanation for the exact meaning.
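Whatever the reason for the repetition, duplicates are easy to filter out before display. The sketch below assumes that two citations with the same file_id and index are interchangeable, which the output suggests but which I have not seen documented. unique_file_citations is a hypothetical helper, and the SimpleNamespace objects are stand-ins for the printed annotations.

```python
from types import SimpleNamespace


def unique_file_citations(annotations):
    """Drop repeated file_citation annotations, keeping first-seen order."""
    seen = set()
    unique = []
    for ann in annotations:
        if ann.type != "file_citation":
            continue
        # Assumption: the same file_id and index means the same citation.
        key = (ann.file_id, ann.index)
        if key not in seen:
            seen.add(key)
            unique.append(ann)
    return unique


# Stand-ins copying the two identical citations printed above.
demo_annotations = [
    SimpleNamespace(type="file_citation", file_id="file-NM9USSrUDDQaVHD1jkMoWv",
                    index=1715, filename="deep_research_blog.pdf"),
    SimpleNamespace(type="file_citation", file_id="file-NM9USSrUDDQaVHD1jkMoWv",
                    index=1715, filename="deep_research_blog.pdf"),
]

for ann in unique_file_citations(demo_annotations):
    print(ann.filename, ann.index)
```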

Computer Use

The final built-in tool is computer use, which turns the AI into a kind of “hacker”: it can pretend to be a human user in order to fill in forms and so on. As I’ve mentioned before, we are approaching a world where most queries will be AI talking to APIs directly without human intervention. In a sense, this tool tries to tackle that use case. Having used website journey debugging tools like Selenium, I don’t think this kind of tool is really suitable for casual use, as it is easy for the model to break or for a system to be spoofed. As a web developer, tester and games developer, I know that screen modelling needs a lot of specialised attention. Essentially, the model suggests clicks or keystrokes, your code applies them to the target system, and you send back images of the target browser after these operations. Just imagine you are on the phone, helping a family member who is unfamiliar with the web and is attempting an online tax return. Neither of you quite has the full picture.

Image source: OpenAI

Unlike with the other two tools, I’ll only describe the loop here, because it is too detailed to fully script in this post. At the start of step 1, you need to set a role of “user” to indicate that the AI is to suggest methods to proceed. You define the size of the target screen and give it an image of this screen. The Response output for step 2 describes the suggested action (for example, where to click on the screen). OpenAI then becomes much vaguer on automating these actions, since that depends on what you have available in the target environment. Once you execute these instructions (step 3), you can then take a screenshot of the outcome (step 4), feed it back to the start of the loop (step 1) and repeat.
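The shape of that loop can still be sketched without touching the real API. Everything here is hypothetical: run_computer_loop and its three callables are names I made up, the action dictionaries are invented for illustration, and the demo replaces both the model and the browser with fakes. It shows only the control flow of steps 1 to 4, not a working integration.

```python
def run_computer_loop(request_action, execute_action, take_screenshot, max_steps=10):
    """Drive the suggest/execute/screenshot loop until no action is suggested.

    All three callables are hypothetical hooks supplied by the caller:
      request_action(screenshot) -> action dict, or None when finished
      execute_action(action)     -> applies the action to the target environment
      take_screenshot()          -> returns the current screen image
    Returns the list of actions that were executed.
    """
    history = []
    screenshot = take_screenshot()
    for _ in range(max_steps):
        action = request_action(screenshot)  # steps 1 and 2: send image, get suggestion
        if action is None:
            break
        execute_action(action)               # step 3: carry out the suggestion
        history.append(action)
        screenshot = take_screenshot()       # step 4: capture the outcome
    return history


# Offline demo with fakes standing in for the model and the browser.
scripted = iter([
    {"type": "click", "x": 120, "y": 48},
    {"type": "type", "text": "hello"},
    None,
])
executed = []
history = run_computer_loop(
    request_action=lambda shot: next(scripted),
    execute_action=executed.append,
    take_screenshot=lambda: b"fake-image-bytes",
)
print(len(history), "actions executed")
```

The max_steps cap is there because a model that never stops suggesting actions would otherwise loop, and spend money, indefinitely.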

Conclusion

I said at the top I would only use short scripts, but the Computer Use tool requires a lot of support scripting. In reality, OpenAI probably doesn’t have the expertise to fully complete the Computer Use case, and will either buy up expertise here or hope that a third party takes up the challenge. For now, though, the web search and file search tools will help get LLM responses with timely or user-specific knowledge. Now that we have looked at both OpenAI agents and their built-in tools, I can put these two together to build a more complex agent that uses the tools to give a tighter business case. I’ll tackle this in an upcoming post.

David has been a London-based professional software developer with Oracle Corp. and British Telecom, and a consultant helping teams work in a more agile fashion. He wrote a book on UI design and has been writing technical articles ever since....