Token counting enables you to determine the number of tokens in a message before sending it to Claude, helping you make informed decisions about your prompts and usage. With token counting, you can

  • Proactively manage rate limits and costs
  • Make smart model routing decisions
  • Optimize prompts to be a specific length

How to count message tokens

The token counting endpoint accepts the same structured list of inputs for creating a message, including support for system prompts, tools, images, and PDFs. The response contains the total number of input tokens.

The token count should be considered an estimate. In some cases, the actual number of input tokens used when creating a message may differ by a small amount.

Supported models

The token counting endpoint supports the following models:

  • Claude 3.5 Sonnet
  • Claude 3.5 Haiku
  • Claude 3 Haiku
  • Claude 3 Opus

Count tokens in basic messages

import anthropic

client = anthropic.Anthropic()

response = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    system="You are a scientist",
    messages=[{
        "role": "user",
        "content": "Hello, Claude"
    }],
)

print(response.json())
JSON
{ "input_tokens": 14 }

Count tokens in messages with tools

import anthropic

client = anthropic.Anthropic()

response = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"],
            },
        }
    ],
    messages=[{"role": "user", "content": "What's the weather like in San Francisco?"}]
)

print(response.json())
JSON
{ "input_tokens": 403 }

Count tokens in messages with images

import anthropic
import base64
import httpx

image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image_media_type = "image/jpeg"
image_data = base64.standard_b64encode(httpx.get(image_url).content).decode("utf-8")

client = anthropic.Anthropic()

response = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image_media_type,
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image"
                }
            ],
        }
    ],
)
print(response.json())
JSON
{ "input_tokens": 1551 }

Count tokens in messages with PDFs

import base64
import anthropic

client = anthropic.Anthropic()

with open("document.pdf", "rb") as pdf_file:
    pdf_base64 = base64.standard_b64encode(pdf_file.read()).decode("utf-8")

response = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_base64
                }
            },
            {
                "type": "text",
                "text": "Please summarize this document."
            }
        ]
    }]
)

print(response.json())
JSON
{ "input_tokens": 2188 }

The Token Count API supports PDFs with the same limitations as the Messages API.


Pricing and rate limits

Token counting is free to use but subject to requests per minute rate limits based on your usage tier. If you need higher limits, contact sales through the Anthropic Console.

Usage tierRequests per minute (RPM)
1100
22,000
34,000
48,000

Token counting and message creation have separate and independent rate limits — usage of one does not count against the limits of the other.

Was this page helpful?