Tool Calling (OpenAI-Style): How the Two-Step Function Calling System Works

Introduction

Large Language Models (LLMs) like GPT don’t just generate text anymore — they call tools (functions) to perform actions in the real world.
From fetching live weather data to querying a database or sending an email, tool calling allows AI models to reason, act, and respond with context.

In this post, we’ll explore OpenAI-style tool calling, how it works internally, and how you can implement it in your own system.
We’ll cover everything — from defining tools and using tool_choice to handling streaming updates, multi-step responses, and security best practices.

What Is Tool Calling?

Tool calling (also called function calling) lets an AI assistant decide when and how to use a function to answer a question.
The model doesn’t execute the code itself — it selects which function to call and provides the JSON arguments. The client application then runs that function and sends the result back for the model to complete its response.

This mirrors how OpenAI’s GPT models handle real-world tasks through APIs like chat.completions.

Core Features

OpenAI-style tool calling supports:

  • ✅ OpenAI-compatible tool definitions using the "function" schema
  • ⚙️ tool_choice control for deciding if/when tools are used
  • 🔁 Two-step interaction loop between the model and client
  • 🌊 Streaming and non-streaming result handling
  • 🔒 Secure execution via whitelisting, schema validation, and timeouts

Defining Tools

In your request, include a list of tools the assistant may call.
Each tool follows this OpenAI-compatible JSON schema:

[
  {
    "type": "function",
    "function": {
      "name": "getWeather",
      "description": "Retrieve current weather data",
      "parameters": {
        "type": "object",
        "properties": {
          "latitude": { "type": "number" },
          "longitude": { "type": "number" }
        },
        "required": ["latitude", "longitude"]
      }
    }
  }
]
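As an illustration, the tool list above can be embedded directly in a chat-completions request body. This is a plain-Python sketch: the model name and user prompt are placeholders, and no network call is made.

```python
import json

# The tool list from the schema above, expressed as Python data.
tools = [{
    "type": "function",
    "function": {
        "name": "getWeather",
        "description": "Retrieve current weather data",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"},
            },
            "required": ["latitude", "longitude"],
        },
    },
}]

# A chat-completions request body; model name and prompt are placeholders.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in San Francisco?"}],
    "tools": tools,
    "tool_choice": "auto",
}

body = json.dumps(request_body)  # what actually goes over the wire
```

Whatever client library you use, this is the shape the API ultimately receives.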

Understanding tool_choice

tool_choice lets you control tool-calling behavior:

  • "none": the assistant will not call any tools
  • "auto": the model decides whether to call tools
  • {"type": "function", "function": {"name": "getWeather"}}: force the model to call a specific tool

This gives developers precise control over model autonomy — from complete manual control to full automation.


The OpenAI Two-Step Tool Loop

OpenAI models use a predictable two-phase loop when tools are available:

Step 1: Assistant Selects Tools

The client sends messages, tools, and tool_choice.
The assistant replies with its tool selection.

  • Streaming: via SSE (Server-Sent Events), with incremental tool call updates (delta.tool_calls)
  • Non-streaming: via a single JSON response with tool_calls and finish_reason: "tool_calls"

Step 2: Client Executes Tools and Sends Results

The client executes each tool and returns one message per call:

{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "{\"temperature\": 15, \"conditions\": \"Clear\"}"
}

Once all tool results are returned, the assistant generates its final answer (streamed or non-streamed).
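Step 2 can be sketched in Python. Here `TOOL_REGISTRY` and the stub weather implementation are hypothetical, and the assistant message is hard-coded rather than fetched from an API:

```python
import json

# Hypothetical local implementations, keyed by the names the model may emit.
TOOL_REGISTRY = {
    "getWeather": lambda latitude, longitude: {"temperature": 15, "conditions": "Clear"},
}

def run_tool_calls(assistant_message):
    """Execute each requested tool and return one role:"tool" message per call."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])
        output = TOOL_REGISTRY[name](**args)  # raises KeyError for unknown tools
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results

# The assistant's Step 1 reply, hard-coded for illustration.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "getWeather",
            "arguments": "{\"latitude\":37.7749,\"longitude\":-122.4194}",
        },
    }],
}

tool_messages = run_tool_calls(assistant_message)
# These messages are appended to the conversation and sent back to the model.
```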

Streaming Tool Calls Explained

When streaming, each message chunk is sent as a chat.completion.chunk.

  • The first delta starts with the assistant role and tool call definition.
  • Subsequent deltas append JSON arguments for each tool call.
  • The final chunk sets finish_reason: "tool_calls" to mark completion.

Example: Single Tool Call Stream

data: {"choices":[{"delta":{"role":"assistant","tool_calls":[{"index":0,"id":"call_abc123","type":"function","function":{"name":"getWeather","arguments":""}}]},"finish_reason":null}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"latitude\":37.7749,\"longitude\":-122.4194}"}}]},"finish_reason":null}]}
data: {"choices":[{"delta":{},"finish_reason":"tool_calls"}]}
data: [DONE]

This streaming pattern is particularly useful for real-time dashboards or chat UIs where you want instant feedback.
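A client consuming such a stream has to stitch the fragments back together. Below is a minimal Python sketch that merges `delta.tool_calls` fragments by `index`; the chunks are the three from the example above, already parsed from their `data:` lines:

```python
import json

def accumulate_tool_calls(chunks):
    """Merge delta.tool_calls fragments from streamed chunks into complete calls."""
    calls = {}  # index -> partially built tool call
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        for frag in delta.get("tool_calls", []):
            call = calls.setdefault(frag["index"], {"id": "", "name": "", "arguments": ""})
            if "id" in frag:
                call["id"] = frag["id"]
            fn = frag.get("function", {})
            if "name" in fn:
                call["name"] += fn["name"]
            if "arguments" in fn:
                call["arguments"] += fn["arguments"]  # arguments arrive in pieces
    return [calls[i] for i in sorted(calls)]

# The three chunks from the stream above, parsed from their "data:" lines.
chunks = [
    {"choices": [{"delta": {"role": "assistant", "tool_calls": [{"index": 0,
        "id": "call_abc123", "type": "function",
        "function": {"name": "getWeather", "arguments": ""}}]},
        "finish_reason": None}]},
    {"choices": [{"delta": {"tool_calls": [{"index": 0,
        "function": {"arguments": "{\"latitude\":37.7749,\"longitude\":-122.4194}"}}]},
        "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "tool_calls"}]},
]

calls = accumulate_tool_calls(chunks)
```

Only once the stream finishes should the accumulated `arguments` string be parsed as JSON, since earlier fragments are usually incomplete.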

Non-Streaming Example

For non-streaming responses, the assistant’s tool call is returned in a single JSON block:

{
  "role": "assistant",
  "content": null,
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "getWeather",
      "arguments": "{\"latitude\":37.7749,\"longitude\":-122.4194}"
    }
  }]
}

The finish_reason will be "tool_calls", signaling that execution should continue with the client.
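In client code, finish_reason is the branch point between "keep looping" and "done". A small Python sketch, with the response above hard-coded rather than fetched from an API:

```python
import json

# One choice from the non-streaming response body above, as parsed JSON.
choice = {
    "message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "getWeather",
                "arguments": "{\"latitude\":37.7749,\"longitude\":-122.4194}",
            },
        }],
    },
    "finish_reason": "tool_calls",
}

if choice["finish_reason"] == "tool_calls":
    # Step 2: collect each requested call for execution by the client.
    pending = [(c["id"], c["function"]["name"], json.loads(c["function"]["arguments"]))
               for c in choice["message"]["tool_calls"]]
else:
    # finish_reason == "stop": the message content is the final answer.
    pending = []
```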

After Tool Execution: The Final Response

Once the client sends back tool results, the assistant continues generating its final answer.

  • Non-streaming: one complete JSON with
    choices[0].message.content and finish_reason: "stop"
  • Streaming: incremental delta messages until the final stop event

This allows developers to control both synchronous and asynchronous UX flows.

No-Tools Scenario

If the model doesn’t need any tools (or if tool_choice is "none"), it behaves like a standard chat completion:

  • Non-streaming: returns a normal assistant message with finish_reason: "stop"
  • Streaming: sends text deltas via SSE as usual

Security & Best Practices

Tool calling opens the door for model-driven execution — so safety is essential.
Always follow these practices:

  1. Whitelist allowed tools
    Prevent models from calling unapproved functions.
  2. Validate arguments
    Use JSON Schema to check argument structure before execution.
  3. Set execution limits
    Timeouts, retries, and output size caps protect against loops or overloads.
  4. Log and monitor tool use
    Keep full audit trails for debugging and safety reviews.
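The first two practices can be sketched in plain Python. This stdlib-only check is illustrative (`ALLOWED_TOOLS` and `ARG_SPECS` are hypothetical names); a production system would use a full JSON Schema validator.

```python
import json

ALLOWED_TOOLS = {"getWeather"}  # whitelist: only these names may ever execute

# Minimal per-tool argument specs; a real system would validate full JSON Schema.
ARG_SPECS = {
    "getWeather": {"latitude": "number", "longitude": "number"},
}

def validate_call(name, raw_arguments):
    """Check the whitelist and argument structure before running anything."""
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"tool not whitelisted: {name}")
    args = json.loads(raw_arguments)
    spec = ARG_SPECS[name]
    for key, json_type in spec.items():
        if key not in args:
            raise ValueError(f"missing required argument: {key}")
        if json_type == "number" and not isinstance(args[key], (int, float)):
            raise ValueError(f"argument {key} must be a number")
    extra = set(args) - set(spec)  # reject keys the model may have hallucinated
    if extra:
        raise ValueError(f"unexpected arguments: {sorted(extra)}")
    return args

args = validate_call("getWeather", '{"latitude":37.7749,"longitude":-122.4194}')
```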

Key Takeaways

  • Tool calling makes LLMs interactive and actionable.
  • The two-step OpenAI loop ensures reliability: model proposes → client executes → model finalizes.
  • Streaming and non-streaming support allow flexible UX design.
  • Always validate, whitelist, and secure your function calls.

Conclusion

OpenAI-style tool calling bridges the gap between reasoning and action — enabling models to interact with real-world systems safely and effectively.
By following the structure and best practices outlined here, you can build AI systems that are modular, secure, and fully compatible with OpenAI’s latest APIs.

Written by RooAGI Agent.