Spacetime Agents

Why We Build AI Agent Armies, Not AI Tools

Haven Vu, Founder & CEO of Spacetime · 2 min read
[Illustration: robotic arms assembling digital structures]

TL;DR

AI rollouts stall when you ship chatbots instead of systems. Design AI like a team: narrow roles, orchestration with logs, and locked-down tool access.

Most companies buy one AI tool and wonder why nothing changes. If you want AI agents to ship outcomes, treat them like a team from day one: roles, orchestration, guardrails.

The tool trap

A single chatbot is trapped.

It can draft a response, but it cannot reliably do the end-to-end job: pull the right context, take the right action, and prove it followed the rules.

  • The support bot confidently cites the wrong refund policy.
  • The “CRM assistant” updates the wrong account because two companies share a name.

This is not a model issue. It is a system design issue.

A practical blueprint for AI agents in production

When I say “AI armies,” I mean a small set of specialized agents that coordinate.

1) Roles: build an org chart, not a prompt

Start with 3 roles. Add more only when coordination is the bottleneck.

  • Intake agent: turns messy requests into a structured task
  • Research agent: retrieves evidence from approved sources and cites it
  • Executor agent: performs a narrow action through allowlisted tools

For each role, write down three constraints:

  • Inputs it is allowed to read
  • Outputs it must produce
  • Actions it is never allowed to take

This alone eliminates the worst pattern I see: one “general agent” trying to do everything.
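One way to make those three constraints concrete is a small spec object that orchestration code can check mechanically. This is a hypothetical sketch: the role names, field names, and `check_output` helper are illustrative, not a library API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RoleSpec:
    """Constraints for one agent role: what it may read, must emit, may never do."""
    name: str
    allowed_inputs: frozenset[str]      # inputs the role is allowed to read
    required_outputs: frozenset[str]    # outputs the role must produce
    forbidden_actions: frozenset[str]   # actions the role is never allowed to take

# Example spec for the intake role (field names are made up for illustration).
INTAKE = RoleSpec(
    name="intake",
    allowed_inputs=frozenset({"raw_request", "requester_id"}),
    required_outputs=frozenset({"task_type", "structured_fields"}),
    forbidden_actions=frozenset({"write_crm", "issue_refund"}),
)

def check_output(role: RoleSpec, output: dict) -> list[str]:
    """Return a list of violations; an empty list means the output is acceptable."""
    missing = role.required_outputs - output.keys()
    return [f"missing required output: {m}" for m in sorted(missing)]
```

The point is not the dataclass; it is that a role's contract lives in data the orchestrator can enforce, not in a prompt the model may ignore.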

2) Orchestration: make the work visible and replayable

An army needs a command system. You need state, retries, and logs.

A minimal flow you can copy:

  1. Trigger: ticket created, form submitted, payment failed
  2. Intake produces a typed task object
  3. Research attaches evidence with citations
  4. Executor proposes an action plan
  5. Validation checks rules and required fields
  6. Commit the change, or route to a human if confidence is low

You can build orchestration with an agent graph framework (e.g., LangGraph) or a workflow engine (e.g., Temporal).
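Before reaching for a framework, the six-step flow can be sketched as a plain function: shared state, a run ID on every log line, and a confidence gate that routes to a human. The callables and the `confidence_floor` threshold here are hypothetical placeholders, not any framework's API.

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orchestrator")

def run_workflow(trigger: dict, intake, research, executor, validate,
                 commit, escalate, confidence_floor: float = 0.8) -> dict:
    """Run the six-step flow; every log line carries the run ID for replay."""
    run_id = str(uuid.uuid4())
    state = {"run_id": run_id, "trigger": trigger}   # 1. trigger event

    state["task"] = intake(trigger)                  # 2. typed task object
    log.info("%s intake: %s", run_id, state["task"])

    state["evidence"] = research(state["task"])      # 3. evidence with citations
    state["plan"] = executor(state["task"], state["evidence"])  # 4. proposed plan

    errors = validate(state["plan"])                 # 5. rules and required fields
    if errors or state["plan"].get("confidence", 0.0) < confidence_floor:
        log.info("%s escalating: %s", run_id, errors)
        return escalate(state)                       # 6a. route to a human
    log.info("%s committing", run_id)
    return commit(state)                             # 6b. commit the change
```

A framework adds durable state and retries on top; the shape of the flow stays the same.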

3) Guardrails and evaluation: boring is good

If an agent can touch production, you need controls that look like software controls.

  • Allowlist tools. Five functions, not your whole cloud.
  • Least privilege. Read-only and write access are not the same.
  • Audit logs. Every tool call has inputs, outputs, and a run ID.
  • Golden test cases. Real examples that represent your business.
  • Pre-prod evals. Agents must pass before they execute high-impact actions.
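The first three controls fit in a few lines. This is a minimal sketch assuming a hypothetical `call_tool` wrapper and an in-memory audit log; a real system would persist the log and load the allowlist from config.

```python
import time

ALLOWED_TOOLS = {"lookup_order", "draft_reply"}  # five functions, not your whole cloud
AUDIT_LOG: list[dict] = []

def call_tool(run_id: str, name: str, tools: dict, **kwargs):
    """Refuse anything off the allowlist and record every call with its run ID."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not allowlisted")
    result = tools[name](**kwargs)
    AUDIT_LOG.append({
        "run_id": run_id,
        "tool": name,
        "inputs": kwargs,
        "output": result,
        "ts": time.time(),
    })
    return result
```

Least privilege then becomes a property of the `tools` dict you hand each agent: the research agent gets read-only functions, the executor gets its narrow write.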

What to do next

If you are starting from zero, do this in order:

  1. Pick one workflow with clear inputs and a measurable output.
  2. Define the 3 roles.
  3. Add orchestration with state and logs.
  4. Give the executor the smallest possible tool surface.
  5. Add evaluation before you scale usage.
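Step 5 can start smaller than people expect. A sketch of a golden-case harness, where the cases, `run_evals`, and the `gate` threshold are illustrative, not a specific eval framework:

```python
def run_evals(agent, golden_cases: list[dict]) -> float:
    """Score an agent against golden cases; return the pass rate."""
    passed = sum(1 for case in golden_cases if agent(case["input"]) == case["expected"])
    return passed / len(golden_cases)

# Golden cases are real examples that represent your business (these are made up).
GOLDEN = [
    {"input": "refund for order 1234", "expected": "refund_request"},
    {"input": "where is my package", "expected": "shipping_status"},
]

def gate(pass_rate: float, threshold: float = 0.95) -> bool:
    """Block promotion to high-impact actions unless the agent clears the bar."""
    return pass_rate >= threshold
```

Run this in CI on every prompt or model change, the same way you would run a test suite.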

Spacetime Studios ships these end-to-end for teams that want outcomes, not demos. Fixed price after discovery.

Sources

  1. Anthropic — Building agents with the Claude Agent SDK https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
  2. OWASP — OWASP Top 10 for LLM Applications 2025 https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/
  3. NIST — AI Risk Management Framework (AI RMF 1.0) https://www.nist.gov/itl/ai-risk-management-framework
  4. LangGraph documentation https://langchain-ai.github.io/langgraph/
  5. Temporal documentation https://docs.temporal.io/

I reply to all emails if you want to chat.
