Handling stop reasons
Learn how to handle the stop_reason field in Claude API responses, including what the different stop reasons mean and best practices for working with them.
When you make a request to the Messages API, Claude's response includes a stop_reason field that indicates why the model stopped generating. Understanding these values is essential for building robust applications that handle each type of response appropriately.
For details about stop_reason in the API response, see the Messages API reference.
What is stop_reason?
The stop_reason field is part of every successful Messages API response. Unlike errors, which indicate a failure to process your request, stop_reason tells you why Claude successfully finished generating its response.
Example response
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
Stop reason values
end_turn
The most common stop reason. Indicates that Claude finished its response naturally.
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
max_tokens
Claude stopped because it reached the max_tokens limit specified in your request.
# Request with a deliberately small token limit
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=10,
    messages=[{"role": "user", "content": "Explain quantum physics"}]
)

if response.stop_reason == "max_tokens":
    # The response was truncated
    print("Response was cut off at token limit")
    # Consider making a follow-up request to continue
stop_sequence
Claude encountered one of your custom stop sequences.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["END", "STOP"],
    messages=[{"role": "user", "content": "Generate text until you say END"}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
tool_use
Claude is calling a tool and expects you to execute it.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[weather_tool],
    messages=[{"role": "user", "content": "What's the weather?"}]
)

if response.stop_reason == "tool_use":
    # Extract and execute the tool
    for content in response.content:
        if content.type == "tool_use":
            result = execute_tool(content.name, content.input)
            # Return the result to Claude to get the final response
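To finish the exchange, send the tool output back to Claude as a tool_result content block in a follow-up user message. Below is a minimal sketch of that round trip, reusing the weather_tool and execute_tool placeholders from the snippet above:

tool_use = next(block for block in response.content if block.type == "tool_use")
result = execute_tool(tool_use.name, tool_use.input)

follow_up = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[weather_tool],
    messages=[
        {"role": "user", "content": "What's the weather?"},
        # Echo Claude's tool call back exactly as received
        {"role": "assistant", "content": response.content},
        # Provide the tool output, linked by tool_use_id
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": tool_use.id, "content": str(result)}
        ]}
    ]
)
print(follow_up.content[0].text)  # this response typically ends with stop_reason "end_turn"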
pause_turn
Used with server tools, such as web search, when Claude needs to pause a long-running operation.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[{"type": "web_search_20250305", "name": "web_search"}],
    messages=[{"role": "user", "content": "Search for latest AI news"}]
)

if response.stop_reason == "pause_turn":
    # Continue the conversation
    messages = [
        {"role": "user", "content": original_query},
        {"role": "assistant", "content": response.content}
    ]
    continuation = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        tools=[{"type": "web_search_20250305", "name": "web_search"}]
    )
refusal
Claude refused to generate a response due to safety concerns.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "[Unsafe request]"}]
)

if response.stop_reason == "refusal":
    # Claude declined to respond
    print("Claude was unable to process this request")
    # Consider rephrasing or modifying the request
Best practices for handling stop reasons
1. Always check stop_reason
Make it a habit to check stop_reason in your response handling logic:
def handle_response(response):
    if response.stop_reason == "tool_use":
        return handle_tool_use(response)
    elif response.stop_reason == "max_tokens":
        return handle_truncation(response)
    elif response.stop_reason == "pause_turn":
        return handle_pause(response)
    elif response.stop_reason == "refusal":
        return handle_refusal(response)
    else:
        # Handle end_turn and other cases
        return response.content[0].text
2. Handle max_tokens gracefully
When a response is truncated because of the token limit:
def handle_truncated_response(response):
    if response.stop_reason == "max_tokens":
        # Option 1: Warn the user
        return f"{response.content[0].text}\n\n[Response truncated due to length]"

        # Option 2: Continue the generation
        messages = [
            {"role": "user", "content": original_prompt},
            {"role": "assistant", "content": response.content[0].text}
        ]
        continuation = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages + [{"role": "user", "content": "Please continue"}]
        )
        return response.content[0].text + continuation.content[0].text
3. Implement retry logic for pause_turn
For server tools that may pause:
def handle_paused_conversation(initial_response, max_retries=3):
    response = initial_response
    messages = [{"role": "user", "content": original_query}]

    for attempt in range(max_retries):
        if response.stop_reason != "pause_turn":
            break

        messages.append({"role": "assistant", "content": response.content})
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages,
            tools=original_tools
        )

    return response
Stop reasons vs. errors
It's important to distinguish between stop_reason values and actual errors:
Stop reasons (successful responses)
- Part of the response body
- Indicate why generation stopped normally
- The response contains valid content
Errors (failed requests)
- HTTP status codes 4xx or 5xx
- Indicate that the request failed to process
- The response contains error details
try:
    response = client.messages.create(...)

    # Handle the successful response via stop_reason
    if response.stop_reason == "max_tokens":
        print("Response was truncated")

except anthropic.APIError as e:
    # Handle actual errors
    if e.status_code == 429:
        print("Rate limit exceeded")
    elif e.status_code == 500:
        print("Server error")
Streaming considerations
When using streaming, stop_reason is:
- null in the initial message_start event
- Provided in the message_delta event
- Not provided in any other events
with client.messages.stream(...) as stream:
    for event in stream:
        if event.type == "message_delta":
            stop_reason = event.delta.stop_reason
            if stop_reason:
                print(f"Stream ended with: {stop_reason}")
Common patterns
Handling tool use workflows
def complete_tool_workflow(client, user_query, tools):
    messages = [{"role": "user", "content": user_query}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages,
            tools=tools
        )

        if response.stop_reason == "tool_use":
            # Execute the tools and continue the loop
            tool_results = execute_tools(response.content)
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            # Final response
            return response
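The execute_tools helper above is left undefined; one plausible shape is a function that turns each tool_use block into a tool_result block, which is the content the follow-up user message expects. A hedged sketch, where run_tool stands in for your own tool dispatcher:

def execute_tools(content_blocks):
    # Build one tool_result block per tool_use block in Claude's response
    results = []
    for block in content_blocks:
        if block.type == "tool_use":
            output = run_tool(block.name, block.input)  # hypothetical dispatcher
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(output)
            })
    return results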
Ensuring complete responses
def get_complete_response(client, prompt, max_attempts=3):
    messages = [{"role": "user", "content": prompt}]
    full_response = ""

    for _ in range(max_attempts):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=messages
        )

        full_response += response.content[0].text

        if response.stop_reason != "max_tokens":
            break

        # Continue from where the response left off
        messages = [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": full_response},
            {"role": "user", "content": "Please continue from where you left off."}
        ]

    return full_response
By handling stop_reason values properly, you can build more robust applications that gracefully handle different response scenarios and provide a better user experience.