
Secure Tool Calling for LLMs: Function Calling Risks and Runtime Controls

SCRs Team
May 7, 2026
13 min read

The Security Problem Starts the Moment the Model Can Do Something

There is a clean line in AI security between systems that generate text and systems that can act. Tool calling crosses that line.

Once a model can:

  • send email
  • modify a ticket
  • search a private knowledge base
  • create a refund
  • rotate infrastructure state

the conversation layer is no longer the only security boundary that matters. The runtime around the tool invocation becomes the real control plane.


Why Tool Calling Fails in Practice

Most implementations start from the happy path. A tool is defined, the schema is valid, the model calls it correctly, and the app executes the request.

What gets missed are the harder questions:

  • should the model be allowed to call this tool at all?
  • should it be allowed to call it without confirmation?
  • does the tool enforce its own authorization?
  • are arguments validated independently from the model output?
  • is there a dry-run mode for risky actions?

If those questions are still undecided, the tool is not production ready.


A Tool Definition Is Not a Security Policy

Anthropic's agent guidance makes an important point: tool definitions and documentation need careful engineering because the model depends on them to act correctly. That is true, but it is only half the job.

Good tool docs reduce mistakes. They do not replace runtime enforcement.


Risk Tiers Help More Than Long Discussions

One practical way to design safe tool use is to classify tools by impact:

Risk Tier | Example Tools                                      | Default Control
Low       | search docs, summarize ticket, read feature flags  | allow with logging
Medium    | draft email, create issue, update internal notes   | allow with output review
High      | transfer money, delete data, modify permissions    | require confirmation or human approval

If a team cannot agree on a tool's tier, that is usually a sign the tool should not be delegated yet.
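The tiers above can be encoded directly, so the default control is data rather than tribal knowledge. A minimal sketch, with illustrative tool names and tier assignments:

```typescript
type RiskTier = "low" | "medium" | "high";
type Control = "allow_with_logging" | "allow_with_output_review" | "require_approval";

// Illustrative registry: every tool is assigned a tier before rollout.
const TOOL_TIERS = new Map<string, RiskTier>([
  ["searchDocs", "low"],
  ["draftEmail", "medium"],
  ["refundPayment", "high"],
]);

const DEFAULT_CONTROLS: Record<RiskTier, Control> = {
  low: "allow_with_logging",
  medium: "allow_with_output_review",
  high: "require_approval",
};

// A tool with no agreed tier is not ready to be delegated.
function controlFor(toolName: string): Control {
  const tier = TOOL_TIERS.get(toolName);
  if (tier === undefined) {
    throw new Error(`tool "${toolName}" has no risk tier; refusing to run it`);
  }
  return DEFAULT_CONTROLS[tier];
}
```

Failing closed on unregistered tools turns "we never agreed on a tier" from a meeting problem into a deploy-time error.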


Unsafe Pattern

// Executes immediately: the model's choice of tool and arguments is the only gate.
if (toolCall.name === "deleteCustomer") {
  await deleteCustomer(toolCall.arguments.customerId);
}

This trusts:

  • the model's choice of tool
  • the model's selected arguments
  • the app's assumption that a valid schema equals safe intent

None of those is enough on its own.


Safer Pattern

// High-impact tools are listed once, not rebuilt on every call.
const APPROVAL_REQUIRED = new Set(["deleteCustomer", "refundPayment", "changeRole"]);

function requiresApproval(toolName: string) {
  return APPROVAL_REQUIRED.has(toolName);
}

async function executeTool(toolName: string, args: Record<string, unknown>, actorId: string) {
  validateToolArguments(toolName, args);           // schema and value checks, outside the model
  await authorizeToolUse(toolName, actorId, args); // per-tool authorization

  if (requiresApproval(toolName)) {
    return { status: "pending_approval" };
  }

  return runTool(toolName, args);
}

This is still simple, but it treats schema validation, authorization, and approval as separate decisions.


Runtime Controls That Matter

1. Tool-Specific Authorization

Do not assume the agent's top-level identity is enough. Each tool should check whether the caller may perform that action.
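One way to sketch per-tool authorization, assuming a simple permission store keyed by actor and tool and ignoring argument-level checks for brevity (the store shape and actor names are illustrative):

```typescript
// Illustrative permission store: actor id -> tools that actor may invoke.
const PERMISSIONS = new Map<string, Set<string>>([
  ["support-agent", new Set(["searchDocs", "updateTicket"])],
  ["billing-agent", new Set(["refundPayment"])],
]);

// Each call is checked against the actor that triggered it,
// not against a blanket "the agent is trusted" assumption.
function authorizeToolUse(toolName: string, actorId: string): void {
  const allowed = PERMISSIONS.get(actorId);
  if (!allowed || !allowed.has(toolName)) {
    throw new Error(`actor "${actorId}" is not authorized for tool "${toolName}"`);
  }
}
```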

2. Confirmation for Destructive or External Actions

Sending data outside the organization or deleting internal state should not happen silently because the model felt confident.
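A confirmation gate can be modeled as an explicit approval record: the action is parked, and a separate human decision, not the model, releases it. A minimal sketch with illustrative names:

```typescript
interface PendingAction {
  id: string;
  toolName: string;
  args: Record<string, unknown>;
  status: "pending" | "approved" | "rejected";
}

const pendingActions = new Map<string, PendingAction>();
let nextId = 0;

// Park a high-impact call instead of executing it.
function requestApproval(toolName: string, args: Record<string, unknown>): string {
  const id = `appr_${++nextId}`;
  pendingActions.set(id, { id, toolName, args, status: "pending" });
  return id;
}

// Only an explicit human decision moves the action out of "pending",
// and it can be decided exactly once.
function resolveApproval(id: string, approved: boolean): PendingAction {
  const action = pendingActions.get(id);
  if (!action || action.status !== "pending") {
    throw new Error("no pending action with that id");
  }
  action.status = approved ? "approved" : "rejected";
  return action;
}
```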

3. Argument Validation Outside the Model

The model can propose an argument. The application must validate it.
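One way to validate model-proposed arguments independently, using plain TypeScript checks; the refund tool, its id format, and its limits are hypothetical:

```typescript
interface RefundArgs {
  paymentId: string;
  amountCents: number;
}

// The model proposes raw JSON; the application decides what is acceptable.
function validateRefundArgs(raw: Record<string, unknown>): RefundArgs {
  const { paymentId, amountCents } = raw;
  if (typeof paymentId !== "string" || !/^pay_[a-zA-Z0-9]+$/.test(paymentId)) {
    throw new Error("paymentId must match pay_<id>");
  }
  if (typeof amountCents !== "number" || !Number.isInteger(amountCents)) {
    throw new Error("amountCents must be an integer");
  }
  // Business limits are enforced here, not in the prompt.
  if (amountCents <= 0 || amountCents > 50_000) {
    throw new Error("amountCents out of allowed range");
  }
  return { paymentId, amountCents };
}
```

Note that "valid JSON against the schema" and "acceptable values for this business" are two different checks; this function does the second.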

4. Dry Run Paths

The ability to preview "what would happen" is one of the easiest ways to reduce tool misuse.
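A dry-run path can be as small as a flag that routes the call to a preview instead of the side effect. A sketch with illustrative names and a stubbed side effect:

```typescript
interface ToolResult {
  executed: boolean;
  preview: string;
}

// Real side effect, stubbed for the sketch.
async function deleteCustomer(customerId: string): Promise<void> {
  /* would call the customer service here */
}

// dryRun=true answers "what would happen" without touching state.
async function deleteCustomerTool(customerId: string, dryRun: boolean): Promise<ToolResult> {
  const preview = `Would permanently delete customer ${customerId} and all associated records`;
  if (dryRun) {
    return { executed: false, preview };
  }
  await deleteCustomer(customerId);
  return { executed: true, preview };
}
```

The preview string is also exactly what a human approver needs to see before the real run.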

5. Full Audit Trails

You want to know:

  • what the user asked
  • why the model selected the tool
  • what arguments were proposed
  • what was executed
  • whether approval was required or bypassed
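Those five questions map directly onto an audit record written at each stage; the field names here are illustrative:

```typescript
interface ToolAuditRecord {
  userRequest: string;                          // what the user asked
  modelRationale: string;                       // why the model selected the tool
  toolName: string;
  proposedArgs: Record<string, unknown>;        // what arguments were proposed
  executedArgs: Record<string, unknown> | null; // what was actually executed, if anything
  approval: "not_required" | "granted" | "pending" | "bypassed";
  timestamp: string;
}

const auditLog: ToolAuditRecord[] = [];

function recordToolCall(record: Omit<ToolAuditRecord, "timestamp">): void {
  auditLog.push({ ...record, timestamp: new Date().toISOString() });
}
```

Keeping proposed and executed arguments as separate fields is what lets you later prove whether validation actually changed anything.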

The Hidden Problem: Over-Broad Tools

Sometimes the issue is not the runtime. It is the tool itself.

A tool like adminAction(command: string) is almost impossible to secure because it collapses too many decisions into one primitive. The best AI tool interfaces are usually:

  • narrow
  • well documented
  • specific to one action
  • hard to use incorrectly

That is not just good developer experience. It is security design.
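The contrast shows up directly in type signatures. Instead of one generic primitive, each action gets its own narrow, validated interface; the tools shown here are hypothetical:

```typescript
// Over-broad: one string carries unbounded intent. Nearly impossible to
// authorize, validate, or audit meaningfully.
type AdminAction = (command: string) => Promise<unknown>;

// Narrow: one action, typed arguments, one clear authorization question.
interface DisableUserArgs {
  userId: string;
  reason: "offboarding" | "security_incident" | "user_request";
}

async function disableUser(args: DisableUserArgs): Promise<{ userId: string; disabled: boolean }> {
  // A narrow tool can validate everything it accepts.
  if (!/^usr_[a-z0-9]+$/.test(args.userId)) {
    throw new Error("invalid userId");
  }
  return { userId: args.userId, disabled: true };
}
```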


What to Red-Team

Try these cases:

  • prompt injection causes the model to choose a higher-risk tool
  • model submits valid JSON with unsafe values
  • tool description is ambiguous and the model selects the wrong action
  • user asks for a harmless task and the model overreaches into state-changing behavior
  • approval flows are skipped because the app treats retries or streaming events incorrectly
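The last case, approvals skipped on retries, can be caught with a small guard: execution is claimed once per tool-call id, so a replayed or duplicated event cannot run twice. Keying on a provider-assigned tool-call id follows common function-calling APIs but is an assumption here:

```typescript
const executedCallIds = new Set<string>();

// Returns true only the first time a given tool-call id is seen.
// Retries and duplicated streaming events become no-ops instead of
// second executions that might slip past an already-granted approval.
function claimToolCall(toolCallId: string): boolean {
  if (executedCallIds.has(toolCallId)) return false;
  executedCallIds.add(toolCallId);
  return true;
}
```

In a real system the set would live in shared storage with an expiry, but the invariant is the same: one id, one execution.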

Tool Calling Checklist

  • define tool risk tiers before rollout
  • authorize every tool independently
  • validate arguments outside the model output
  • require approval for destructive or external actions
  • prefer narrow tools over generic admin tools
  • log tool selection, arguments, and execution outcomes
  • include dry-run or preview paths for risky actions


Final Takeaway

Tool calling should be treated the way mature teams treat cloud automation: every action needs a permission model, an audit trail, and a clear escalation path. The model can suggest a tool. It should never be the only thing standing between a user prompt and a high-impact action.
