
Understanding the Sentence-t5-xxl embedding model

Reviewed on 03 December 2024 • Published on 22 May 2024

Model overview

Attribute	Details
Provider	sentence-transformers
Compatible Instances	L4 (FP32)
Context size	512 tokens

Model name

sentence-transformers/sentence-t5-xxl:fp32

Compatible Instances

Instance type	Max context length
L4	512 (FP32)

Model introduction

The Sentence-T5-XXL model is a sentence embedding model built on the Text-To-Text Transfer Transformer (T5) architecture. It uses only the encoder of T5 to map sentences and paragraphs to dense vectors that capture their semantic content. The model is tuned for tasks such as text classification, semantic similarity, and clustering, which makes it a useful component in a Retrieval-Augmented Generation (RAG) pipeline. It performs well on sentence similarity tasks, but less well on semantic search tasks.

Why is it useful?

The Sentence-T5-XXL model ranks highly on the MTEB leaderboard among open models released under the Apache 2.0 license. Its main characteristics are:

Sentence-T5-XXL encodes text into 768-dimensional vectors that represent sentence semantics.
The model was trained on a diverse dataset of 2 billion question-answer pairs from various online communities, which supports its use across a broad range of topics.
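
Since the model is aimed at sentence similarity, a common way to compare two embedding vectors is cosine similarity. The following is a minimal sketch, not part of the Scaleway documentation: the function name `cosine_similarity` is hypothetical, and the toy 3-dimensional vectors only stand in for the 768-dimensional vectors the model returns.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only; real embeddings have 768 dimensions.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]
orthogonal = [0.0, 1.0, 0.0]

print(cosine_similarity(query, same_direction))  # close to 1: same direction
print(cosine_similarity(query, orthogonal))      # 0: no shared direction
```

Scores close to 1 indicate sentences the model considers semantically similar, while scores near 0 indicate unrelated sentences.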

How to use it

Sending Managed Inference requests

To perform inference tasks with your embedding model deployed at Scaleway, use the following command:

curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Embeddings can represent text in a numerical format.",
    "model": "sentence-transformers/sentence-t5-xxl:fp32"
  }'

Make sure to replace <IAM API key> and <Deployment UUID> with your actual IAM API key and the Deployment UUID you are targeting.
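
The same request can be sent from application code. The sketch below mirrors the curl command using only the Python standard library. The helper names `build_embedding_request` and `send_embedding_request` and the 30-second timeout are assumptions made for illustration; the URL pattern, headers, and payload are taken from the command above.

```python
import json
import urllib.request

MODEL_NAME = "sentence-transformers/sentence-t5-xxl:fp32"


def build_embedding_request(deployment_uuid, api_key, text):
    """Build the POST request shown in the curl command (hypothetical helper)."""
    url = f"https://{deployment_uuid}.ifr.fr-par.scaleway.com/v1/embeddings"
    payload = json.dumps({"input": text, "model": MODEL_NAME}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_embedding_request(request, timeout=30):
    """Send the request and return the decoded JSON body (timeout is an assumption)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with your own values in place of the placeholders:
# request = build_embedding_request("<Deployment UUID>", "<IAM API key>",
#                                   "Embeddings can represent text in a numerical format.")
# body = send_embedding_request(request)
```

Keeping request construction separate from the network call makes it possible to inspect or log the request before it is sent.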

Receiving Inference responses

After you send the HTTP request to the public or private endpoint exposed by the server, you receive the inference response from the Managed Inference server. The response contains the output generated by the embedding model for the input provided in the request. Process this output according to your application’s needs.
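
As one way to process the output, the sketch below extracts the embedding vectors from a decoded response body. This page does not document the response schema, so the layout used here is an assumption: a `data` list whose items carry an `embedding` vector and an `index`, as is common for `/v1/embeddings` endpoints. The function name `extract_embeddings` and the sample body are hypothetical; check an actual response from your deployment before relying on this shape.

```python
def extract_embeddings(response_body):
    """Return embedding vectors in input order.

    Assumes a body shaped like {"data": [{"index": 0, "embedding": [...]}, ...]},
    which is NOT confirmed by this page.
    """
    items = response_body.get("data")
    if not items:
        raise ValueError("response contains no 'data' entries")
    # Sort by index so the vectors line up with the order of the inputs.
    ordered = sorted(items, key=lambda item: item.get("index", 0))
    return [item["embedding"] for item in ordered]


# Hypothetical sample body with shortened vectors, for illustration only.
sample_body = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ]
}

vectors = extract_embeddings(sample_body)
print(len(vectors), len(vectors[0]))
```

With a real response, each vector is expected to contain 768 values, matching the embedding size of the model.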
