Embeddings
Text embeddings are numerical representations of text that enable measuring semantic similarity. This guide introduces embeddings, their applications, and how to use embedding models for tasks like search, recommendations, and anomaly detection.
Before implementing embeddings
When selecting an embeddings provider, there are several factors you can consider depending on your needs and preferences:
- Dataset size & domain specificity: size of the model training dataset and its relevance to the domain you want to embed. Larger or more domain-specific data generally produces better in-domain embeddings
- Inference performance: embedding lookup speed and end-to-end latency. This is a particularly important consideration for large scale production deployments
- Customization: options for continued training on private data, or specialization of models for very specific domains. This can improve performance on unique vocabularies
How to get embeddings with Anthropic
Anthropic does not offer its own embedding model. One embeddings provider that has a wide variety of options and capabilities encompassing all of the above considerations is Voyage AI.
Voyage AI makes state-of-the-art embedding models and offers customized models for specific industry domains such as finance and healthcare, or bespoke fine-tuned models for individual customers.
The rest of this guide is for Voyage AI, but we encourage you to assess a variety of embeddings vendors to find the best fit for your specific use case.
Getting started with Voyage AI
To access Voyage embeddings:
- Sign up on Voyage AI’s website
- Obtain an API key
- Set the API key as an environment variable for convenience:
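As a rough sketch of how a script might pick up that key, the helper below reads it from the environment and fails early with a clear message if it is missing. VOYAGE_API_KEY is the variable name the voyageai package reads by default; the helper name get_voyage_api_key is made up for this illustration and is not part of any package.

```python
import os


def get_voyage_api_key(env_var: str = "VOYAGE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper for this guide, not part of the voyageai package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it in your shell before running this script."
        )
    return key
```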
You can generate embeddings using either the official voyageai Python package or HTTP requests, as described below.
Voyage Python package
The voyageai package can be installed with pip, for example by running pip install -U voyageai.
Then, you can create a client object and start using it to embed your texts:
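A minimal sketch of that step follows. It assumes the voyageai package is installed and VOYAGE_API_KEY is set, and it uses two placeholder strings as input. The embed_texts wrapper and main function are hypothetical helpers added here so that a client can be passed in; only voyageai.Client and its embed() method come from the package.

```python
from typing import Any, List, Optional


def embed_texts(
    texts: List[str],
    client: Optional[Any] = None,
    model: str = "voyage-2",
    input_type: Optional[str] = "document",
) -> List[List[float]]:
    """Embed texts and return one vector per input text.

    Hypothetical wrapper: if no client is passed, a voyageai.Client is
    created, which reads the VOYAGE_API_KEY environment variable.
    """
    if client is None:
        import voyageai  # imported lazily so this module loads without the package

        client = voyageai.Client()
    result = client.embed(texts, model=model, input_type=input_type)
    return result.embeddings


def main() -> None:
    # Placeholder inputs; replace them with your own texts.
    embeddings = embed_texts(["Sample text 1", "Sample text 2"])
    print(embeddings[0])
    print(embeddings[1])
```

Calling main() sends one request to Voyage and prints both vectors.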
result.embeddings will be a list of two embedding vectors, each containing 1024 floating-point numbers.
After running the above code, the two embeddings will be printed on the screen.
When creating the embeddings, you may specify a few other arguments to the embed() function. Here is the specification:
voyageai.Client.embed(texts : List[str], model : str, input_type : Optional[str] = None, truncation : Optional[bool] = None)
- texts (List[str]) - A list of texts as a list of strings, such as ["I like cats", "I also like dogs"]. Currently, the maximum length of the list is 128, and the total number of tokens in the list is at most 320K for voyage-2 and 120K for voyage-large-2/voyage-code-2.
- model (str) - Name of the model. Recommended options: voyage-2, voyage-large-2, voyage-code-2.
- input_type (str, optional, defaults to None) - Type of the input text. Options: None, query, document.
  - When input_type is set to None, the input text is encoded directly by Voyage’s embedding model. Alternatively, when the inputs are queries or documents, you can set input_type to query or document, respectively. In such cases, Voyage prepends a special prompt to the input text and sends the extended input to the embedding model.
  - For retrieval/search use cases, we recommend specifying this argument when encoding queries or documents to enhance retrieval quality. Embeddings generated with and without the input_type argument are compatible.
- truncation (bool, optional, defaults to None) - Whether to truncate the input texts to fit within the context length.
  - If True, over-length input texts will be truncated to fit within the context length before being vectorized by the embedding model.
  - If False, an error will be raised if any given text exceeds the context length.
  - If not specified (defaults to None), Voyage will truncate the input text before sending it to the embedding model if it slightly exceeds the context window length. If it significantly exceeds the context window length, an error will be raised.
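Because a single call accepts at most 128 texts, larger corpora have to be split across several calls. The following is one possible sketch of that batching. chunked and embed_in_batches are hypothetical helpers, client is assumed to be a voyageai.Client (or any object with a compatible embed() method), and the per-call token limits are not checked here.

```python
from typing import Any, Iterator, List, Optional

MAX_TEXTS_PER_CALL = 128  # list-length limit stated in the specification above


def chunked(items: List[str], size: int = MAX_TEXTS_PER_CALL) -> Iterator[List[str]]:
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(
    client: Any,
    texts: List[str],
    model: str = "voyage-2",
    input_type: Optional[str] = "document",
    truncation: Optional[bool] = None,
) -> List[List[float]]:
    """Embed texts batch by batch and return the vectors in input order."""
    vectors: List[List[float]] = []
    for batch in chunked(texts):
        result = client.embed(
            batch, model=model, input_type=input_type, truncation=truncation
        )
        vectors.extend(result.embeddings)
    return vectors
```

Keeping the batching separate from the API call makes it easy to add a token-count check later if your texts are long.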
Voyage HTTP API
You can also get embeddings by calling the Voyage HTTP API. For example, you can send an HTTP request through the curl command in a terminal:
The response you would get is a JSON object containing the embeddings and the token usage:
Voyage AI’s embedding endpoint is https://api.voyageai.com/v1/embeddings (POST). The request header must contain the API key. The request body is a JSON object containing the following arguments:
- input (str, List[str]) - A single text string, or a list of texts as a list of strings. Currently, the maximum length of the list is 128, and the total number of tokens in the list is at most 320K for voyage-2 and 120K for voyage-large-2/voyage-code-2.
- model (str) - Name of the model. Recommended options: voyage-2, voyage-large-2, voyage-code-2.
- input_type (str, optional, defaults to None) - Type of the input text. Options: None, query, document.
- truncation (bool, optional, defaults to None) - Whether to truncate the input texts to fit within the context length.
  - If True, over-length input texts will be truncated to fit within the context length before being vectorized by the embedding model.
  - If False, an error will be raised if any given text exceeds the context length.
  - If not specified (defaults to None), Voyage will truncate the input text before sending it to the embedding model if it slightly exceeds the context window length. If it significantly exceeds the context window length, an error will be raised.
- encoding_format (str, optional, defaults to None) - Format in which the embeddings are encoded. Voyage currently supports two options:
  - If not specified (defaults to None): the embeddings are represented as lists of floating-point numbers.
  - "base64": the embeddings are compressed to Base64 encodings.
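The request described above can also be assembled with Python's standard library. This is a sketch rather than an official client: build_embeddings_request and fetch_embeddings are hypothetical helpers, the Bearer scheme in the Authorization header is an assumption about how the API key is passed, and the response is assumed to hold each vector under data[i].embedding.

```python
import json
import urllib.request
from typing import List, Optional

VOYAGE_EMBEDDINGS_URL = "https://api.voyageai.com/v1/embeddings"


def build_embeddings_request(
    texts: List[str],
    api_key: str,
    model: str = "voyage-2",
    input_type: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the embeddings endpoint."""
    body = {"input": texts, "model": model}
    if input_type is not None:
        body["input_type"] = input_type  # only sent when explicitly chosen
    return urllib.request.Request(
        VOYAGE_EMBEDDINGS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def fetch_embeddings(request: urllib.request.Request) -> List[List[float]]:
    """Send the request and return the vectors from the JSON response."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumed response shape: {"data": [{"embedding": [...], "index": 0}, ...]}
    return [item["embedding"] for item in payload["data"]]
```

Building the request separately from sending it lets you inspect the headers and body before any network call is made.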
Voyage embedding example
Now that we know how to get embeddings with Voyage, let’s see it in action with a brief example.
Suppose we have a small corpus of six documents to retrieve from.
We will first use Voyage to convert each of them into an embedding vector.
The embeddings will allow us to do semantic search / retrieval in the vector space. We can then convert an example query into an embedding and conduct a nearest-neighbor search to find the most relevant document based on the distance in the embedding space.
Note that we use input_type="document" and input_type="query" for embedding the documents and the query, respectively. See the input_type parameter description above for details.
The output would be the 5th document, which is indeed the most relevant to the query.
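The nearest-neighbor step itself can be sketched without any API call. The example below ranks document vectors by cosine similarity to a query vector. The vectors are small made-up placeholders, not real Voyage output, and cosine_similarity and nearest_index are hypothetical helper names.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_index(query: Sequence[float], docs: List[Sequence[float]]) -> int:
    """Index of the document vector most similar to the query vector."""
    scores = [cosine_similarity(query, d) for d in docs]
    return max(range(len(docs)), key=scores.__getitem__)


# Made-up 3-dimensional stand-ins for six document embeddings and one query
# embedding; real Voyage vectors have 1024 or 1536 dimensions.
toy_doc_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.7, 0.7],
    [0.1, 0.1, 0.9],
    [0.5, 0.5, 0.5],
]
toy_query_embedding = [0.0, 0.2, 1.0]

best = nearest_index(toy_query_embedding, toy_doc_embeddings)
print(best)  # 4, i.e. the 5th vector in this toy data
```

With real embeddings, toy_doc_embeddings and toy_query_embedding would be replaced by the vectors returned for the documents and the query.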
Available Voyage models
Voyage recommends using the following embedding models:
Model | Context Length | Embedding Dimension | Description |
---|---|---|---|
voyage-large-2 | 16000 | 1536 | Voyage AI’s most powerful generalist embedding model. |
voyage-code-2 | 16000 | 1536 | Optimized for code retrieval (17% better than alternatives), and also SoTA on general-purpose corpora. See this Voyage blog post for details. |
voyage-2 | 4000 | 1024 | Base generalist embedding model optimized for both latency and quality. |
voyage-lite-02-instruct | 4000 | 1024 | Instruction-tuned for classification, clustering, and sentence textual similarity tasks, which are the only recommended use cases for this model. |
voyage-2 and voyage-large-2 are generalist embedding models, which achieve state-of-the-art performance across domains and retain high efficiency. voyage-code-2 is optimized for code, offering 4x the context length of voyage-2 for more flexible usage, albeit at a relatively higher latency.
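For illustration, the table above can be mirrored in a small lookup structure, for example to shortlist models by context length or to sanity-check the size of a returned vector. This is only a sketch: ModelSpec, VOYAGE_MODELS, models_fitting, and has_expected_dimension are hypothetical names, and the numbers are copied from the table.

```python
from typing import Dict, List, NamedTuple


class ModelSpec(NamedTuple):
    context_length: int
    embedding_dimension: int


# Values copied from the table above.
VOYAGE_MODELS: Dict[str, ModelSpec] = {
    "voyage-large-2": ModelSpec(16000, 1536),
    "voyage-code-2": ModelSpec(16000, 1536),
    "voyage-2": ModelSpec(4000, 1024),
    "voyage-lite-02-instruct": ModelSpec(4000, 1024),
}


def models_fitting(token_count: int) -> List[str]:
    """Names of models whose context length can hold token_count tokens."""
    return [
        name
        for name, spec in VOYAGE_MODELS.items()
        if spec.context_length >= token_count
    ]


def has_expected_dimension(model: str, vector: List[float]) -> bool:
    """True if vector has the embedding dimension listed for model."""
    return len(vector) == VOYAGE_MODELS[model].embedding_dimension
```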
Voyage is actively developing more advanced and specialized models, and also offers fine-tuning services to customize bespoke models for individual customers. Email your Anthropic account manager or reach out to Anthropic support for further information on bespoke models.
- voyage-finance-2: coming soon
- voyage-law-2: coming soon
- voyage-multilingual-2: coming soon
- voyage-healthcare-2: coming soon
Voyage on the AWS Marketplace
Voyage embeddings are also available on AWS Marketplace. Here are the instructions for accessing Voyage on AWS:
- Subscribe to the model package
  - Navigate to the model package listing page and select the model to deploy
  - Click on the Continue to subscribe button
  - Carefully review the details on the Subscribe to this software page. If you agree with the standard End-User License Agreement (EULA), pricing, and support terms, click on “Accept Offer”
  - After selecting Continue to configuration and choosing a region, you will be presented with a Product Arn. This is the model package ARN required for creating a deployable model using Boto3
    - Copy the ARN that corresponds to your selected region and use it in the subsequent cell
- Deploy the model package
From here, create a JupyterLab space in SageMaker Studio, upload Voyage’s notebook, and follow the instructions within.
FAQ
Pricing
Visit Voyage’s pricing page for the most up to date pricing details.