Anthropic's official guide: how to design tools for an Agent

2026-04-11 12:19:38　Source: 賽博禪心

This article is translated from the official Anthropic blog post "Seeing like an agent: how we design tools in Claude Code" by Thariq Shihipar, an engineer on the Claude Code team, published today

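What follows is a paragraph-by-paragraph translation: each English paragraph is followed by its Chinese rendering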

One of the hardest parts of building an Agent: designing tools

One of the hardest parts about building an agent harness is constructing its tools.

構建 Agent harness 最困難的部分之一，是設計它的工具集

Claude acts completely through tool calling, but there are a number of ways tools can be constructed in the Claude API with primitives like bash, skills and code execution.

Claude 完全通過工具調用來行動。在 Claude API 中，工具可以用 bash、skills、代碼執行等基礎原語來構建

So how do you design your agents' tools? Do you give it one general-purpose tool like bash or code execution? Or fifty specialized tools, one for each use case?

那你該怎么給 Agent 設計工具？給它一個通用工具（比如 bash 或代碼執行）就夠了？還是做五十個專用工具，每個場景一個？

To put yourself in the mind of the model, imagine being given a difficult math problem. What tools would you want in order to solve it? It would depend on your own skill set!

要站在模型的角度想這個問題，可以想象你面前有一道很難的數學題。你想要什么工具來解決它？答案取決于你自己的能力

Paper would be the minimum, but you'd be limited by manual calculations. A calculator would be better, but you would need to know how to operate the more advanced options. The fastest and most powerful option would be a computer, but you would have to know how to use it to write and execute code.

一張紙是最低配，但你只能手算。計算器好一些，但你得知道怎么用高級功能。最快最強的選擇是電腦，但你得會用它來寫和執行代碼

This is a useful framework for designing your agent. You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.

這是一個很有用的設計框架。你要給 Agent 的工具，應該貼合它自身的能力形狀。但你怎么知道它的能力是什么？你觀察它，讀它的輸出，反復實驗。你學會「像 Agent 一樣看」

If you're building an agent, you'll face the same questions we did: when to add a tool, when to remove one, and how to tell the difference. Here's how we've answered them while building Claude Code, including where we got it wrong first.

如果你在做 Agent，你會面對和我們一樣的問題：什么時候加工具，什么時候刪工具，怎么區分這兩種情況。下面是我們在 Claude Code 的實際經驗，包括一開始做錯的地方

Improving elicitation with the AskUserQuestion tool

[Figure: the spectrum of three approaches, from unstructured to overly rigid, with the AskUserQuestion tool in the middle]

When building the AskUserQuestion tool, our goal was to improve Claude's ability to ask questions (often called elicitation).

設計 AskUserQuestion 工具時，我們的目標是提升 Claude 向用戶提問的能力（通常稱為 elicitation）

While Claude could just ask questions in plain text, we found that answering those questions took an unnecessary amount of time. How could we lower this friction and increase the bandwidth of communication between the user and Claude?

雖然 Claude 可以用純文本提問，但我們發現回答這些問題的體驗很差，耗時太多。怎么降低這個摩擦，提升用戶和 Claude 之間的溝通帶寬？

First attempt: modifying ExitPlanTool

The first approach we tried was adding a parameter to the ExitPlanTool to have an array of questions alongside the plan. This was the easiest fix to implement, but it confused Claude because we were simultaneously asking for a plan and a set of questions about the plan. What if the user's answers conflicted with what the plan said? Would Claude need to call the ExitPlanTool twice? We knew this tactic wouldn't work, so we went back to the drawing board.

我們第一個方案是給 ExitPlanTool 加一個參數，讓它在輸出計劃的同時輸出一組問題。這是最省事的改法，但它讓 Claude 很困惑：我們同時要求它做計劃和對計劃提問。如果用戶的回答和計劃矛盾怎么辦？Claude 是不是得調兩次這個工具？我們知道這個方案行不通，于是回到原點

Second attempt: changing the output format

Next, we tried updating Claude's output instructions to serve a slightly modified markdown format that it could use to ask questions. For example, we could ask it to output a list of bullet point questions with alternatives in brackets. We could then parse and format that question as UI for the user.

接下來，我們嘗試修改 Claude 的輸出指令，讓它用一種特殊的 Markdown 格式來提問。比如用 bullet point 列出問題，每個問題后面用方括號給出選項。然后前端解析這個格式，渲染成 UI

Claude could usually produce this format, but not reliably. It would append extra sentences, drop options, or abandon the structure altogether. Onto the next approach.

Claude 大部分時候能生成這個格式，但不穩定。它會在末尾多加一句話，漏掉選項，或者干脆不用這個格式。下一個方案
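
To make the fragility concrete, here is a minimal sketch of the kind of parser this second attempt implies. The exact line format (a bullet, the question, then options in square brackets separated by `|`) and the names `QUESTION_LINE` and `parse_questions` are assumptions for illustration; the post does not show the real format or parser.

```python
import re

# Hypothetical format: "- Question text? [option A | option B]"
QUESTION_LINE = re.compile(
    r"^\s*[-*]\s+(?P<question>.+?)\s*\[(?P<options>[^\]]+)\]\s*$"
)


def parse_questions(markdown: str) -> list[dict]:
    """Return one dict per line that matches the expected format exactly."""
    questions = []
    for line in markdown.splitlines():
        match = QUESTION_LINE.match(line)
        if match is None:
            continue  # extra sentences and malformed bullets are silently lost
        options = [o.strip() for o in match.group("options").split("|") if o.strip()]
        questions.append({"question": match.group("question"), "options": options})
    return questions


well_formed = "- Which database? [Postgres | SQLite]\n- Add tests? [yes | no]"
# The model drifts: parentheses instead of brackets, plus a trailing sentence.
drifted = "- Which database? (Postgres or SQLite)\nLet me know what you prefer!"
```

With `well_formed` the parser recovers both questions, while `drifted` yields nothing to render, which is the failure mode the paragraph describes: free-form output can leave the agreed structure at any time, and the parser has no way to pull it back.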

Third attempt: the AskUserQuestion tool

[Figure: the AskUserQuestion tool's actual interface]

Finally, we landed on creating a tool that Claude could call at any point, but it was particularly prompted to do so during plan mode. When the tool triggered we would show a modal to display the questions and block the agent's loop until the user answered.

最終方案是做一個獨立的工具，Claude 可以在任何時候調用，但在規劃模式中會被特別引導去使用。工具觸發后彈出一個模態框顯示問題，阻塞 Agent 循環直到用戶回答

This tool allowed us to prompt Claude for a structured output and it helped us ensure that Claude gave the user multiple options. It also gave users ways to compose this functionality, for example by calling it in the Agent SDK or referring to it in skills.

這個工具讓我們能引導 Claude 輸出結構化內容，確保給用戶多個選項。它也給了用戶組合使用的空間，比如在 Agent SDK 或 Skills 中引用它
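
As a rough illustration of why a dedicated tool gives more reliable structure, the sketch below defines a tool with a JSON Schema input and checks a call before showing it to the user. The outer `name` / `description` / `input_schema` layout follows the general shape of tool definitions in the Claude API; everything inside it (the `questions` and `options` fields, the two-option minimum, and the `validate_call` helper) is a hypothetical reconstruction, not the real AskUserQuestion schema.

```python
# Hypothetical schema: the real AskUserQuestion definition is not given in the post.
ASK_USER_QUESTION_TOOL = {
    "name": "AskUserQuestion",
    "description": "Ask the user multiple-choice questions and wait for the answers.",
    "input_schema": {
        "type": "object",
        "properties": {
            "questions": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "question": {"type": "string"},
                        "options": {
                            "type": "array",
                            "items": {"type": "string"},
                            "minItems": 2,
                        },
                    },
                    "required": ["question", "options"],
                },
            }
        },
        "required": ["questions"],
    },
}


def validate_call(tool_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the modal can be shown."""
    questions = tool_input.get("questions")
    if not isinstance(questions, list) or not questions:
        return ["'questions' must be a non-empty list"]
    problems = []
    for index, item in enumerate(questions):
        if not isinstance(item, dict) or not isinstance(item.get("question"), str):
            problems.append(f"question {index}: missing question text")
            continue
        options = item.get("options")
        if not isinstance(options, list) or len(options) < 2:
            problems.append(f"question {index}: needs at least two options")
    return problems
```

Compared with parsing free text, a malformed call here can be rejected with a specific message and retried, and the harness decides when to block the loop and show the modal.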

Most importantly, Claude seemed to like calling this tool and we found its outputs worked well. After all, even the best designed tool doesn't work if Claude doesn't understand how to call it.

最關鍵的一點：Claude 喜歡調用這個工具，輸出質量也好。畢竟，再好的工具設計，如果模型不理解怎么調用，也是白搭

Is this the final form of elicitation in Claude Code? We doubt it. As Claude gets more capable, the tools that serve it have to evolve too. The next section shows a case where a tool that once helped started getting in the way.

這是 Claude Code 中 elicitation 的最終形態嗎？大概不是。隨著 Claude 能力提升，服務它的工具也必須跟著演進。下一節會展示一個曾經有用的工具后來開始礙事的案例

Keeping pace with capabilities: from Todos to Tasks

[Figure: from Todos to Tasks, a linear checklist for a single Agent → a collaborative task graph for multiple Agents]

When we first launched Claude Code, we realized that the model needed a todo list to keep it on track. Todos could be written at the start and checked off as the model did work. To do this we gave Claude the TodoWrite tool, which would write or update Todos and display them to the user.

Claude Code 剛上線時，我們發現模型需要一個待辦清單來保持專注。開工前列好待辦，做完一項勾一項。我們做了 TodoWrite 工具來實現這個功能

But even then, we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal.

即便如此，Claude 還是經常忘記該干什么。我們于是每隔 5 輪對話就插一條系統提醒
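
The reminder mechanism can be pictured with a few lines of code. This is only a sketch under assumptions: the transcript is modelled as a list of strings, and the `<system-reminder>` wrapper and the `with_reminders` function are made-up names; only the interval of 5 turns comes from the text.

```python
REMINDER_INTERVAL = 5  # from the text: a reminder every 5 turns


def with_reminders(turns: list[str], goal: str, interval: int = REMINDER_INTERVAL) -> list[str]:
    """Return the transcript with a goal reminder inserted after every `interval` turns."""
    transcript = []
    for number, turn in enumerate(turns, start=1):
        transcript.append(turn)
        if number % interval == 0:
            # Hypothetical wrapper; the real reminder format is not shown in the post.
            transcript.append(f"<system-reminder>Goal: {goal}</system-reminder>")
    return transcript
```

Because the reminder repeats the same goal unconditionally, it is easy to see how a more capable model could read it as an instruction to stick to the list, which is the problem described next.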

As models improved, they found to-do lists limiting. Being sent reminders of the todo list made Claude think that it had to stick to the list instead of modifying it when it realized it needed to change course. We also saw Opus 4.5 get much better at using subagents, but how could subagents coordinate on a shared todo list?

隨著模型迭代，Todo 列表開始礙事。系統提醒讓 Claude 覺得必須嚴格按清單執行，不敢中途調整方向。Opus 4.5 用子 Agent 的能力大幅提升，但多個子 Agent 怎么共享一個 Todo 列表？

Seeing this, we replaced the TodoWrite feature with the Task tool. Whereas todos are focused on keeping the model on track, tasks help agents communicate with each other. Tasks could include dependencies, share updates across subagents and the model could alter and delete them.

看到這些問題，我們把 TodoWrite 替換成了 Task 工具。Todo 的重點是讓模型保持方向，Task 的重點是讓 Agent 之間互相溝通。Task 支持依賴關系，可以跨子 Agent 共享狀態更新，模型可以隨時修改和刪除
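
The difference between a flat todo list and tasks with dependencies can be sketched as a small data structure. The `Task` and `TaskBoard` classes below, including every field and method name, are illustrative assumptions and not the real Task tool: they only show dependencies, shared updates, and deletion working together.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    description: str
    depends_on: set[str] = field(default_factory=set)
    status: str = "pending"  # "pending" or "done"
    updates: list[str] = field(default_factory=list)  # notes visible to every agent


class TaskBoard:
    """A shared store that several (sub)agents could read and write."""

    def __init__(self) -> None:
        self.tasks: dict[str, Task] = {}

    def add(self, task: Task) -> None:
        self.tasks[task.task_id] = task

    def delete(self, task_id: str) -> None:
        self.tasks.pop(task_id, None)
        for task in self.tasks.values():
            task.depends_on.discard(task_id)  # deleted work no longer blocks anything

    def complete(self, task_id: str, note: str) -> None:
        task = self.tasks[task_id]
        task.status = "done"
        task.updates.append(note)

    def ready(self) -> list[str]:
        """IDs of pending tasks whose dependencies all exist and are done."""
        return sorted(
            task.task_id
            for task in self.tasks.values()
            if task.status == "pending"
            and all(
                dep in self.tasks and self.tasks[dep].status == "done"
                for dep in task.depends_on
            )
        )


board = TaskBoard()
board.add(Task("schema", "Design the schema"))
board.add(Task("api", "Implement the API", depends_on={"schema"}))
board.add(Task("docs", "Write the docs", depends_on={"api"}))
```

At first only `schema` is ready. Once one agent completes it, `api` becomes ready for another agent, and deleting `api` unblocks `docs`, which is the kind of mid-course change a fixed checklist discourages.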

As model capabilities improve, tools that were once needed can end up constraining the model

As model capabilities increase, the tools that your models once needed might now be constraining them. It's important to constantly revisit previous assumptions on what tools are needed. This is also why it's useful to support only a small set of models with fairly similar capability profiles.

隨著模型能力提升，你的模型曾經需要的工具現在可能反過來在限制它。定期回頭審視「這些工具是否還有必要」很重要。這也是為什么建議只支持少量能力相近的模型，這樣工具設計可以聚焦

Designing the search interface

The most consequential tools we've built are the ones that let Claude find its own context.

我們做過的最有影響力的工具，是那些讓 Claude 自己尋找上下文的工具

When Claude Code was first released internally, we used RAG: a vector database would pre-index the codebase, and the harness would retrieve relevant snippets and hand them to Claude before each response. While RAG was powerful and fast, it required indexing and setup and could be fragile across a host of different environments. Most importantly, Claude was given this context instead of finding the context itself.

Claude Code 內部版本最早用的是 RAG：向量數據庫預先索引代碼庫，每次回復前自動檢索相關片段塞給 Claude。RAG 速度快、效果好，但需要預處理，環境兼容性脆弱。最根本的問題是：上下文是被塞給 Claude 的，不是 Claude 自己找的

But if Claude could search on the web, why couldn't it also search your codebase? By giving Claude a Grep tool, we could let it search for files and build context itself.

如果 Claude 能搜網頁，為什么不能搜代碼庫？給 Claude 一個 Grep 工具，就能讓它自己搜文件、自己構建上下文
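
A search tool of this kind needs no index or setup, which a short sketch makes visible. The `grep` function below, its parameters, and its `path:line:text` output format are assumptions for illustration; the post does not describe how the real Grep tool is implemented.

```python
import re
from pathlib import Path


def grep(pattern: str, root: str, glob: str = "**/*", max_matches: int = 50) -> list[str]:
    """Return 'path:line_number:line' strings for lines matching `pattern` under `root`."""
    regex = re.compile(pattern)
    matches: list[str] = []
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                matches.append(f"{path.relative_to(root)}:{number}:{line.strip()}")
                if len(matches) >= max_matches:
                    return matches  # cap the output so one search cannot flood the context
    return matches
```

The model picks the pattern and decides what to read next from the results, so the context is assembled by the agent's own choices rather than handed over before each response.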

As Claude gets smarter, it becomes increasingly good at building its context when given the right tools.

Claude 越聰明，給它合適的工具后它就越擅長自己構建上下文

When we introduced Agent Skills, we formalized the idea of progressive disclosure, which allows agents to incrementally discover relevant context through exploration.

Agent Skills 上線后，我們把這個思路正式化為漸進式披露（progressive disclosure）：讓 Agent 通過探索逐步發現相關上下文

Claude could now read skill files and those files could then reference other files that the model could read recursively. In fact, a common use of skills is to add more search capabilities to Claude like giving it instructions on how to use an API or query a database.

Claude 現在可以讀 Skill 文件，Skill 文件可以引用其他文件，模型可以遞歸地發現和加載上下文。一個常見的 Skill 用法就是給 Claude 增加搜索能力：告訴它怎么調 API、怎么查數據庫
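
Progressive disclosure through files that reference other files can be sketched as follows. The file names, their contents, and the `read_file` and `explore` functions are all hypothetical, and a simple depth-first walk stands in for the model's own judgment about which reference to follow.

```python
import re

# Hypothetical in-memory "skill" files; real skills are files on disk.
FILES = {
    "SKILL.md": "Query the sales database. See reference/schema.md and reference/api.md.",
    "reference/schema.md": "Table orders(id, total). Details in reference/indexes.md.",
    "reference/api.md": "POST /query with a SQL string.",
    "reference/indexes.md": "orders has an index on id.",
}
REFERENCE = re.compile(r"[\w/]+\.md")


def read_file(name: str) -> tuple[str, list[str]]:
    """Return a file's text plus the other known files it mentions."""
    text = FILES[name]
    return text, [ref for ref in REFERENCE.findall(text) if ref in FILES and ref != name]


def explore(start: str, wanted: str) -> list[str]:
    """Read files depth-first until one contains `wanted`; return the files read, in order."""
    loaded: list[str] = []
    stack = [start]
    while stack:
        name = stack.pop()
        if name in loaded:
            continue
        text, references = read_file(name)
        loaded.append(name)
        if wanted in text:
            break  # stop as soon as the needed context is found
        stack.extend(reversed(references))
    return loaded
```

Looking for the index details reads `SKILL.md`, then `reference/schema.md`, then `reference/indexes.md`, and never loads `reference/api.md`. Only the files on the path to the answer enter the context.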

Over the course of a year, Claude went from not really being able to build its own context to being able to do nested search across several layers of files to find the exact context it needed.

一年時間，Claude 從幾乎不會自己構建上下文，到能在多層文件中嵌套搜索，精確找到需要的信息

Progressive disclosure is now a common technique we use to add new functionality without adding a tool. In the next section, we explain why.

漸進式披露現在是我們常用的一種技術：不加工具就能加功能。下一節解釋原因

Progressive disclosure: the Claude Code Guide subagent

Claude Code currently has ~20 tools, and our team frequently revisits whether we need all of them for Claude to be most effective. The bar to add a new tool is high, because this gives the model one more option to think about.

Claude Code 目前有大約 20 個工具，團隊經常審視是否每個都有必要。加新工具的門檻很高，因為每多一個工具，模型就多一個需要思考的選項

For example, we noticed that Claude did not know enough about how to use Claude Code. If you asked it how to add an MCP or what a slash command did, it would not be able to reply.

比如，我們發現 Claude 不夠了解 Claude Code 自身的功能。你問它怎么加 MCP、某個斜杠命令是什么意思，它答不上來

We could have put all of this information in the system prompt, but given that users rarely asked about this, it would have added context rot and interfered with Claude Code's main job: writing code.

可以把這些信息全塞進 system prompt，但用戶很少問這類問題，塞進去會造成上下文腐蝕，干擾 Claude 的主要工作（寫代碼）

Instead, we tried progressive disclosure: we gave Claude a link to its docs that it could load and search when needed. This worked, but Claude would pull large chunks of documentation into context to find an answer the user could have gotten in one sentence.

我們嘗試漸進式披露：給 Claude 一個指向文檔的鏈接，需要時自己去查。能用，但 Claude 會把大段文檔拉進上下文，只為回答一個一句話就能搞定的問題

So we built the Claude Code Guide — a subagent Claude calls whenever a user asks about Claude Code itself. The subagent does the doc-searching in its own context, follows detailed instructions on how to search and what to extract, and hands back only the answer. The main agent's context stays clean.

最終我們做了一個 Claude Code Guide 子 Agent。當用戶問 Claude Code 自身的問題時，主 Agent 把請求轉給這個子 Agent。子 Agent 在自己的上下文里搜索文檔、提取答案，只把答案傳回來。主 Agent 的上下文保持干凈
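
The context-isolation idea can be shown with a toy example. Everything here is assumed for illustration: the `DOCS` pages, the keyword scoring, and the `guide_subagent` function are stand-ins, and a real subagent would be a separate model call with its own instructions rather than a string search.

```python
import re

# Hypothetical documentation pages, padded to simulate long documents.
DOCS = {
    "mcp.md": "To add an MCP server, follow the steps in the setup guide. "
    + "Background detail. " * 200,
    "slash-commands.md": "Slash commands are shortcuts typed into the prompt. "
    + "Background detail. " * 200,
}


def guide_subagent(question: str) -> str:
    """Search the docs in a private context and return only a one-sentence answer."""
    private_context: list[str] = []  # discarded when the function returns
    keywords = set(re.findall(r"[a-z]{4,}", question.lower()))
    best_page, best_score = "", 0
    for page in DOCS.values():
        private_context.append(page)  # whole pages are loaded here, not in the main agent
        score = sum(keyword in page.lower() for keyword in keywords)
        if score > best_score:
            best_page, best_score = page, score
    if not best_page:
        return "No answer found in the docs."
    return best_page.split(". ")[0] + "."


main_context = ["user: How do I add an MCP server?"]
main_context.append("guide: " + guide_subagent("How do I add an MCP server?"))
```

The main context grows by a single short line even though the subagent read several thousand characters of documentation to produce it.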

While this isn't a perfect solution (Claude can still get confused when you ask it about how to set itself up), we were able to add things to Claude's action space without adding a new tool.

這個方案不完美（Claude 有時候還是會在自身配置問題上犯糊涂），但關鍵是：不用加新工具，就能擴展 Agent 的能力范圍

Seeing like an Agent is a craft

Designing the tools for your models is as much an art as it is a science. It depends heavily on the model you're using, the goal of the agent and the environment it's operating in.

給模型設計工具，與其說是科學，更接近手藝。它取決于你用的模型、Agent 的目標、運行的環境

Our best advice? Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.

我們最好的建議？多實驗，讀你的輸出，試新東西。最重要的是，學會像 Agent 一樣看

https://claude.com/blog/seeing-like-an-agent

Author: Thariq Shihipar, engineer at Anthropic, Claude Code team
