
How to use structured outputs

Structured outputs allow users to get consistent, machine-readable JSON format responses from language models. JSON, as a widely-used format, enables seamless integration with a variety of platforms and applications. Its interoperability is crucial for developers aiming to incorporate AI functionality into their current systems with minimal adjustments.

By specifying a response format when using the Chat Completions API or Responses API, you can ensure that responses are returned in a JSON structure. There are two main modes for generating JSON: Object Mode (schemaless) and Schema Mode (deterministic, structured output).


Before you start

To complete the actions presented below, you must have:

  • A Scaleway account logged into the console
  • Owner status or IAM permissions allowing you to perform actions in the intended Organization
  • A valid API key for API authentication
  • Python 3.7+ installed on your system

Types of structured outputs

  • Structured outputs (schema mode):

    • Type: {"type": "json_schema"}
    • This mode enforces a strict schema format, where the output adheres to the predefined structure.
    • Supports complex types and validation mechanisms as per the JSON Schema specification, including nested schema composition (anyOf, allOf, oneOf, etc.), $ref, all types, and regular expressions.
  • JSON mode (Legacy method):

    • Type: {"type": "json_object"}
    • This mode is non-deterministic and allows the model to output a JSON object without strict validation.
    • Useful for flexible outputs when you expect the model to infer a reasonable structure based on your prompt.
    • JSON mode is older and has been used by developers since early API implementations, but it does not guarantee a reliable response format.
Note
  • All LLMs in the Scaleway library support structured outputs and JSON mode. However, schemaless JSON mode produces lower-quality results and is not recommended. Note that structured output is more reliably validated and more richly parsed with the Responses API.
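In terms of the request payload, the two modes differ only in the value passed as response_format. The sketch below contrasts the two shapes, using a hypothetical minimal schema for illustration:

```python
# Schema mode: the output is validated against a predefined JSON schema.
# "Example" and the "answer" property are placeholders for this sketch.
schema_mode = {
    "type": "json_schema",
    "json_schema": {
        "name": "Example",
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "additionalProperties": False,
            "required": ["answer"],
        },
    },
}

# JSON mode (legacy): the model returns some JSON object, but its shape
# is inferred from the prompt rather than enforced.
json_mode = {"type": "json_object"}
```

Everything else in the request (messages, model, etc.) stays the same; complete examples for both modes follow below.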

Chat Completions API or Responses API?

Both the Chat Completions API and the Responses API are OpenAI-compatible REST APIs that can be used for generating and manipulating conversations. The Chat Completions API is focused on generating conversational responses, while the Responses API is a more general REST API for chat, structured outputs, tool use, and multimodal inputs.

The Chat Completions API was released in 2023, and is an industry standard for building AI applications, being specifically designed for handling multi-turn conversations. It is stateless, but allows users to manage conversation history by appending each new message to the ongoing conversation. Messages in the conversation can include text, images and audio extracts. The API supports function tool-calling, allowing developers to define functions that the model can choose to call. If it does so, it returns the function name and arguments, which the developer's code must execute and feed back into the conversation.

The Responses API was released in 2025, and is designed to combine the simplicity of Chat Completions with the ability to do more agentic tasks and reasoning. It supports statefulness, being able to maintain context without needing to resend the entire conversation history. It offers tool-calling by built-in tools (e.g. web or file search) that the model is able to execute itself while generating a response.

Note

Scaleway's support for the Responses API is currently at beta stage. Support of the full feature set will be incremental: currently statefulness and tools other than function calling are not supported.

Most supported Generative API models can be used with both the Chat Completions and Responses APIs. For the gpt-oss-120b model, use of the Responses API is recommended, as it will allow you to access all of its features, especially tool-calling.

For full details on the differences between these APIs, see the official OpenAI documentation.

Code examples

Tip

Before diving into the code examples, ensure you have the necessary libraries installed:

pip install openai pydantic

The following Python examples demonstrate how to use Structured outputs to generate structured responses.

We use the base code below to send our LLM a voice note transcript to structure:

Defining the voice note and transcript

import json
from openai import OpenAI
from pydantic import BaseModel, Field

# Set your preferred model
MODEL = "llama-3.1-8b-instruct" ## or "gpt-oss-120b" for the Responses API

# Set your API key
API_KEY = "<SCW_API_KEY>"

client = OpenAI(
    base_url="https://api.scaleway.ai/v1",
    api_key=API_KEY,
)

# Define the schema for the output using Pydantic
class VoiceNote(BaseModel):
    title: str = Field(description="A title for the voice note")
    summary: str = Field(description="A short one sentence summary of the voice note.")
    actionItems: list[str] = Field(description="A list of action items from the voice note")

# Transcript to use for the output
TRANSCRIPT = ( 
    "Good evening! It's 6:30 PM, and I'm just getting home from work. I have a few things to do " 
    "before I can relax. First, I'll need to water the plants in the garden since they've been in the sun all day. " 
    "Then, I'll start preparing dinner. I think a simple pasta dish with some garlic bread should be good. " 
    "While that's cooking, I'll catch up on a couple of phone calls I missed earlier."
)

Using structured outputs with JSON schema (Pydantic)

Using Pydantic, users can define the schema as a Python class and require the model to return results adhering to this schema.

Tip

Structured outputs accuracy may vary between models. For instance, with Llama models, we suggest adding a description of each expected field both in response_format and in the system or user messages. In our example, this would mean adding a system prompt similar to:

"content": "The following is a voice message transcript. Provide the message title, summary and action items. Only answer in JSON using '{' as the first character.",

For additional optimization or troubleshooting, see Structured output (e.g., JSON) is not working correctly.
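One way to connect the Pydantic VoiceNote model from the base code above to the response_format parameter is to derive the JSON schema with model_json_schema() and validate the model's reply locally. This is a minimal sketch of that approach (the OpenAI SDK also offers client.beta.chat.completions.parse(..., response_format=VoiceNote), which wraps the same mechanics); the sample reply string here is illustrative, not real model output:

```python
from pydantic import BaseModel, Field

# Same schema as in the base code above
class VoiceNote(BaseModel):
    title: str = Field(description="A title for the voice note")
    summary: str = Field(description="A short one sentence summary of the voice note.")
    actionItems: list[str] = Field(description="A list of action items from the voice note")

# Derive a JSON schema from the Pydantic model, then tighten it:
# structured outputs expect additionalProperties to be false.
schema = VoiceNote.model_json_schema()
schema["additionalProperties"] = False

response_format = {
    "type": "json_schema",
    "json_schema": {"name": "VoiceNote", "schema": schema},
}

# After calling client.chat.completions.create(..., response_format=response_format),
# the reply can be validated locally into a typed object:
note = VoiceNote.model_validate_json(
    '{"title": "Evening Routine", "summary": "Tasks before relaxing.", '
    '"actionItems": ["Water the plants"]}'
)
print(note.title)
```

Validating with model_validate_json gives you typed attribute access (note.actionItems is a list[str]) and raises a clear error if the model ever returns a malformed object.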

Using structured outputs with JSON schema (manual definition)

Alternatively, users can manually define the JSON schema inline when calling the model. See below an example for doing this with the Chat Completions API:

extract = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "The following is a voice message transcript. Only answer in JSON using '{' as the first character.",
        },
        {
            "role": "user",
            "content": TRANSCRIPT,
        },
    ],
    model=MODEL,
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "VoiceNote", 
            "schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "summary": {"type": "string"},
                    "actionItems": {
                        "type": "array",
                        "items": {"type": "string"}
                    }
                },
                "additionalProperties": False,
                "required": ["title", "summary", "actionItems"]
            }
        }
    }
)
output = json.loads(extract.choices[0].message.content)
print(json.dumps(output, indent=2))

Output example:

{
  "title": "Evening Routine",
  "actionItems": [
    "Water the plants",
    "Cook dinner (pasta and garlic bread)",
    "Make phone calls"
  ],
  "summary": "Made a list of tasks to accomplish before relaxing tonight"
}
Tip

When using the OpenAI SDKs as in the examples above, you must set additionalProperties to false and specify all of your properties as required.
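The same structured output can be requested through the Responses API. Note that its request shape differs slightly: the schema definition is passed via the text parameter, with "name" and "schema" sitting directly under text["format"] rather than nested inside a "json_schema" key. The sketch below reuses TRANSCRIPT from the base code above and keeps the API call behind a placeholder guard so the snippet is safe to run as-is before a real key is set:

```python
import json

API_KEY = "<SCW_API_KEY>"  # set a real Scaleway API key to perform the request

# Responses API structured-output definition: flattened compared to
# Chat Completions ("name" and "schema" directly under the format object).
text_format = {
    "type": "json_schema",
    "name": "VoiceNote",
    "schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "summary": {"type": "string"},
            "actionItems": {"type": "array", "items": {"type": "string"}},
        },
        "additionalProperties": False,
        "required": ["title", "summary", "actionItems"],
    },
}

if API_KEY != "<SCW_API_KEY>":
    from openai import OpenAI

    client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key=API_KEY)
    response = client.responses.create(
        model="gpt-oss-120b",
        input=[
            {"role": "system", "content": "The following is a voice message transcript."},
            {"role": "user", "content": TRANSCRIPT},  # from the base code above
        ],
        text={"format": text_format},
    )
    # The Responses API exposes the generated text via output_text
    output = json.loads(response.output_text)
    print(json.dumps(output, indent=2))
```

As Scaleway's Responses API support is in beta, treat this as a sketch of the OpenAI-compatible request shape; check current beta coverage before relying on it in production.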

Using JSON mode (schemaless, Legacy method)

Important

JSON mode: it is important to explicitly ask the model to generate a JSON output, either in the system prompt or the user prompt. To prevent infinite generations, model providers generally encourage users to ask the model for short JSON objects. Prompt example: Only answer in JSON using '{' as the first character.

In JSON mode, you can prompt the model to output a JSON object without enforcing a strict schema. See below an example for the Chat Completions API:

extract = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "The following is a voice message transcript. Only answer in JSON using '{' as the first character.",
        },
        {
            "role": "user",
            "content": TRANSCRIPT,
        },
    ],
    model=MODEL,
    response_format={
        "type": "json_object",
    },
)
output = json.loads(extract.choices[0].message.content)
print(json.dumps(output, indent=2))

Output example:

{
  "current_time": "6:30 PM",
  "tasks": [
    {
      "task": "water the plants in the garden",
      "priority": "high"
    },
    {
      "task": "prepare dinner (pasta with garlic bread)",
      "priority": "high"
    },
    {
      "task": "catch up on phone calls",
      "priority": "medium"
    }
  ]
}

Conclusion

Using structured outputs with LLMs can significantly improve their reliability, especially for implementing agentic use cases.

  • Structured outputs provide strict adherence to a predefined schema, ensuring consistency.
  • JSON mode (Legacy Method) is flexible but less predictable.

We recommend using structured outputs (json_schema) for most use cases.
