TechTrends Now - Tech News for Builders and Operators

ADK gives you four ways to orchestrate multi-agent systems - hierarchical delegation, sequential pipelines, parallel fan-out, and iterative loops. Here's how to build each pattern with Terraform provisioning the infrastructure.

In the previous posts, we deployed a single Vertex AI agent with tools. That handles focused tasks well. But complex workflows need multiple agents: one to research, one to write, one to review. Or one to handle orders while another handles payments.

ADK provides four orchestration primitives for building multi-agent systems. Unlike managed supervisor patterns, ADK gives you code-level control over how agents interact - sequential pipelines, parallel fan-out, iterative refinement loops, and LLM-driven delegation. Terraform provisions the infrastructure; Python defines the agent team. 🎯

🏗️ Four Multi-Agent Patterns

Pattern	Agent Type	How It Works	Best For
Hierarchy	`LlmAgent` with `sub_agents`	Parent delegates to children based on LLM reasoning	Dynamic routing, customer support
Pipeline	`SequentialAgent`	Agents run one after another, passing state	fetch → clean → analyze → summarize
Fan-out	`ParallelAgent`	Agents run concurrently, results gathered after	Independent research, multi-API calls
Refinement	`LoopAgent`	Agents repeat until quality threshold met	Draft → critique → revise cycles

Workflow agents (SequentialAgent, ParallelAgent, LoopAgent) are deterministic - they don't use an LLM for flow control. The execution pattern is fixed in code. LlmAgent with sub_agents uses the model to decide which child to invoke, making it dynamic but less predictable.

🔧 Pattern 1: Hierarchical Delegation (LLM-Driven)

A parent agent decides which specialist to delegate to based on the user's request:

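from google.adk.agents import Agent  # Agent is an alias for LlmAgent

# `model` and the tool functions below are assumed to be defined elsewhere,
# for example a model ID string loaded from config.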

order_agent = Agent(
    name="order_agent",
    model=model,
    instruction="You handle order lookups, cancellations, and shipping status.",
    tools=[get_order_status, cancel_order],
)

payments_agent = Agent(
    name="payments_agent",
    model=model,
    instruction="You handle refunds, billing inquiries, and payment methods.",
    tools=[process_refund, get_billing_info],
)

supervisor = Agent(
    name="supervisor",
    model=model,
    instruction="""You are a customer support supervisor.
    Route order questions to order_agent.
    Route payment questions to payments_agent.
    For requests spanning both domains, coordinate between them.""",
    sub_agents=[order_agent, payments_agent],
)

The supervisor uses the model to analyze intent and delegate. sub_agents makes children permanent team members. The supervisor can transfer control to a child, and the child can escalate back.
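
The listing above references tools such as get_order_status and cancel_order without defining them. ADK accepts plain Python functions as tools and uses the signature and docstring to describe them to the model. A minimal sketch of what two of these tools might look like, assuming a hypothetical in-memory order store (`_ORDERS` and its fields are illustrative, not from the original post):

```python
# Hypothetical in-memory order store; a real tool would call an order service.
_ORDERS = {
    "A100": {"status": "shipped", "cancellable": False},
    "A101": {"status": "processing", "cancellable": True},
}


def get_order_status(order_id: str) -> dict:
    """Look up the current status of an order.

    Args:
        order_id: The order identifier supplied by the customer.

    Returns:
        A dict with the order status, or an "error" field if the order is unknown.
    """
    order = _ORDERS.get(order_id)
    if order is None:
        return {"error": f"Order {order_id} not found"}
    return {"order_id": order_id, "status": order["status"]}


def cancel_order(order_id: str) -> dict:
    """Cancel an order if it has not shipped yet.

    Args:
        order_id: The order identifier supplied by the customer.

    Returns:
        A dict confirming the cancellation, or an "error" field explaining why not.
    """
    order = _ORDERS.get(order_id)
    if order is None:
        return {"error": f"Order {order_id} not found"}
    if not order["cancellable"]:
        return {"error": f"Order {order_id} is {order['status']} and cannot be cancelled"}
    order["status"] = "cancelled"
    order["cancellable"] = False
    return {"order_id": order_id, "status": "cancelled"}
```

Returning a dict with an explicit error field, rather than raising, gives the model something it can read and relay to the user.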

Alternative: AgentTool (On-Demand Calling)

For agents you want to call explicitly like a function rather than delegating to:

from google.adk.agents import Agent
from google.adk.tools import google_search
from google.adk.tools.agent_tool import AgentTool

research_assistant = Agent(
    name="researcher",
    model=model,
    instruction="Research the given topic and return findings.",
    tools=[google_search],
)

main_agent = Agent(
    name="analyst",
    model=model,
    instruction="Use the research assistant to gather data, then analyze it.",
    tools=[AgentTool(research_assistant)],
)

Sub-agent vs AgentTool: A sub-agent is a permanent team member - the parent transfers control to it. An AgentTool is an on-demand consultant - the parent calls it like a function and gets results back without giving up control.

🔧 Pattern 2: Sequential Pipeline

Agents run in a fixed order, passing state between steps:

from google.adk.agents import LlmAgent
from google.adk.agents.sequential_agent import SequentialAgent

researcher = LlmAgent(
    name="researcher",
    model="gemini-2.5-flash",
    instruction="Research the given topic. Write findings clearly.",
    output_key="research_findings",
)

writer = LlmAgent(
    name="writer",
    model="gemini-2.5-flash",
    instruction="Based on {research_findings}, write a blog post draft.",
    output_key="draft",
)

editor = LlmAgent(
    name="editor",
    model="gemini-2.5-flash",
    instruction="Review {draft} for clarity and grammar. Output the final version.",
    output_key="final_post",
)

pipeline = SequentialAgent(
    name="content_pipeline",
    sub_agents=[researcher, writer, editor],
)

output_key is the communication mechanism. Each agent's final response is saved to shared session state under its output_key, and the next agent reads it through a {key_name} placeholder in its instruction. No explicit data-passing code is needed.
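
To make the data flow concrete, here is a toy model of the mechanism in plain Python. It is not ADK's implementation: `render`, `run_pipeline`, and the lambda stand-ins for model calls are all hypothetical names chosen for illustration.

```python
import re


def render(instruction: str, state: dict) -> str:
    """Replace each {key} placeholder with the matching value from state."""
    return re.sub(r"\{(\w+)\}", lambda m: str(state[m.group(1)]), instruction)


def run_pipeline(steps, state: dict) -> dict:
    """Run steps in order; each step is (instruction, output_key, fn)."""
    for instruction, output_key, fn in steps:
        prompt = render(instruction, state)
        state[output_key] = fn(prompt)  # fn stands in for the model call
    return state


# Stand-in "models" that just tag their input, so the hand-off is visible.
steps = [
    ("Research the given topic.", "research_findings", lambda p: "FINDINGS"),
    ("Based on {research_findings}, write a draft.", "draft", lambda p: f"DRAFT<{p}>"),
    ("Review {draft} for clarity.", "final_post", lambda p: f"FINAL<{p}>"),
]

state = run_pipeline(steps, {})
print(state["final_post"])
```

In this sketch a placeholder that names a key no earlier step has written raises a KeyError, which is why the order of the steps matters.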

🔧 Pattern 3: Parallel Fan-Out

Independent tasks run concurrently for speed:

from google.adk.agents import LlmAgent
from google.adk.agents.parallel_agent import ParallelAgent
from google.adk.agents.sequential_agent import SequentialAgent

energy_researcher = LlmAgent(
    name="energy_researcher",
    model="gemini-2.5-flash",
    instruction="Research recent developments in renewable energy.",
    output_key="energy_findings",
)

ev_researcher = LlmAgent(
    name="ev_researcher",
    model="gemini-2.5-flash",
    instruction="Research recent developments in electric vehicles.",
    output_key="ev_findings",
)

parallel_research = ParallelAgent(
    name="parallel_research",
    sub_agents=[energy_researcher, ev_researcher],
)

synthesizer = LlmAgent(
    name="synthesizer",
    model="gemini-2.5-flash",
    instruction="Combine {energy_findings} and {ev_findings} into a report.",
    output_key="final_report",
)

# Fan-out then gather
full_pipeline = SequentialAgent(
    name="research_pipeline",
    sub_agents=[parallel_research, synthesizer],
)

Each parallel agent must write to a unique output_key. The agents run concurrently against the same session state, so two agents writing to the same key overwrite each other, and the surviving value depends on which one finishes last.
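
The key-collision problem can be reproduced with a small asyncio sketch. This is a plain-Python model of concurrent writers sharing one dict, not a description of ADK internals; `fake_agent` and `fan_out` are hypothetical helpers, and the sleep delays stand in for model latency.

```python
import asyncio


async def fake_agent(name: str, output_key: str, delay: float, state: dict) -> None:
    """Stand-in for an LLM agent: wait, then write a result to shared state."""
    await asyncio.sleep(delay)
    state[output_key] = f"{name} result"


async def fan_out(agents, state: dict) -> dict:
    """Run all agents concurrently against one shared state dict."""
    await asyncio.gather(*(fake_agent(n, k, d, state) for n, k, d in agents))
    return state


# Unique keys: both results survive.
unique = asyncio.run(fan_out(
    [("energy", "energy_findings", 0.02), ("ev", "ev_findings", 0.01)], {}
))

# Shared key: whichever agent finishes last overwrites the other.
collided = asyncio.run(fan_out(
    [("energy", "findings", 0.02), ("ev", "findings", 0.01)], {}
))

print(sorted(unique))
print(collided)
```

With unique keys the synthesizer sees both findings; with a shared key one result is silently lost.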

🔧 Pattern 4: Iterative Refinement Loop

A writer and critic repeat until quality is met:

from google.adk.agents import LlmAgent
from google.adk.agents.loop_agent import LoopAgent
from google.adk.tools import exit_loop

writer = LlmAgent(
    name="writer",
    model="gemini-2.5-flash",
    # The trailing ? marks the key as optional: it is not set on the first pass.
    instruction="Write or revise content based on {critic_feedback?}. Output the draft.",
    output_key="current_draft",
)

critic = LlmAgent(
    name="critic",
    model="gemini-2.5-flash",
    instruction="""Review {current_draft} against these criteria:
    - Factual accuracy
    - Clear structure
    - Professional tone
    If all criteria are met, call the exit_loop tool.
    Otherwise, provide specific feedback for revision.""",
    output_key="critic_feedback",
    tools=[exit_loop],  # Built-in tool that ends the loop; pass the function, not a string
)

refinement_loop = LoopAgent(
    name="refinement_loop",
    sub_agents=[writer, critic],
    max_iterations=3,
)

exit_loop is ADK's built-in escape hatch. The critic calls it when quality is acceptable. Without it, the loop runs until max_iterations. Always set max_iterations as a safety limit.
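
The control flow is easier to reason about when reduced to plain Python. The sketch below is a toy model, not ADK code: `refine`, `write`, and `critique` are hypothetical stand-ins, and the early return plays the role of the critic calling exit_loop.

```python
def refine(write, critique, max_iterations: int = 3):
    """Alternate write and critique until the critic approves or the cap is hit.

    write(feedback) -> draft; critique(draft) -> (approved, feedback).
    Returns (draft, iterations_used, approved).
    """
    feedback = ""
    draft = ""
    for iteration in range(1, max_iterations + 1):
        draft = write(feedback)
        approved, feedback = critique(draft)
        if approved:  # plays the role of the critic calling exit_loop
            return draft, iteration, True
    return draft, max_iterations, False


# Hypothetical stand-ins: the writer appends the feedback marks to the draft,
# and the critic approves once the draft carries two revision marks.
def write(feedback: str) -> str:
    return "draft" + feedback


def critique(draft: str):
    revisions = draft.count("+")
    return revisions >= 2, "+" * (revisions + 1)


print(refine(write, critique))                    # critic approves on the third pass
print(refine(write, critique, max_iterations=2))  # cap reached before approval
```

The second call shows why the cap matters: the loop still terminates and returns the best draft so far, flagged as unapproved.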

🔧 Terraform: Infrastructure for Multi-Agent

The infrastructure is the same as Day 9 - Terraform provisions APIs, service accounts, and config. The multi-agent logic lives entirely in Python:

# agents/config.tf

resource "local_file" "agent_config" {
  filename = "${path.module}/agent_source/config.json"
  content = jsonencode({
    project_id = var.project_id
    location   = var.region
    models = {
      supervisor = var.supervisor_model.id
      specialist = var.specialist_model.id
    }
    agent_name = "${var.environment}-multi-agent"
  })
}

Use different models for different roles:

# environments/prod.tfvars
supervisor_model = {
  id      = "gemini-2.5-pro"
  display = "Gemini 2.5 Pro"
}
specialist_model = {
  id      = "gemini-2.5-flash"
  display = "Gemini 2.5 Flash"
}

Pro goes to the supervisor, which needs complex reasoning to route requests. Flash goes to the specialists, which execute focused tasks. This keeps cost down while reserving the stronger model for the decisions that need it.
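
On the Python side, the agent code can read the generated config.json at startup and pick a model per role. A minimal sketch, assuming the JSON layout produced by the local_file resource above; `load_models` is a hypothetical helper, not part of ADK:

```python
import json
from pathlib import Path


def load_models(config_path: str) -> dict:
    """Read the Terraform-generated config and return the per-role model IDs."""
    config = json.loads(Path(config_path).read_text())
    models = config["models"]
    # Fail early if a role the agent code relies on is absent.
    missing = {"supervisor", "specialist"} - models.keys()
    if missing:
        raise KeyError(f"config is missing model roles: {sorted(missing)}")
    return models
```

The supervisor could then be built with model=load_models("agent_source/config.json")["supervisor"] and each specialist with the "specialist" entry, so switching models per environment only requires a different tfvars file.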

📐 Combining Patterns

Real systems combine multiple patterns. A customer support pipeline might use all four:

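from google.adk.agents import Agent, LoopAgent, ParallelAgent, SequentialAgent

# order_lookup, payment_lookup, response_drafter, quality_reviewer, and
# escalation_agent are assumed to be LlmAgent instances defined elsewhere.

# Parallel: gather order + payment data simultaneously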
data_gathering = ParallelAgent(
    name="data_gathering",
    sub_agents=[order_lookup, payment_lookup],
)

# Sequential: gather data, then draft response
response_pipeline = SequentialAgent(
    name="response_pipeline",
    sub_agents=[data_gathering, response_drafter],
)

# Loop: draft and review until quality passes
quality_loop = LoopAgent(
    name="quality_check",
    sub_agents=[response_pipeline, quality_reviewer],
    max_iterations=2,
)

# Root: LLM-driven routing to the right workflow
root_agent = Agent(
    name="support_root",
    model=model,
    instruction="Route customer requests to the appropriate workflow.",
    sub_agents=[quality_loop, escalation_agent],
)

⚠️ Gotchas and Tips

State key collisions. Parallel agents sharing the same output_key create race conditions. Always use unique keys per agent.

Loop safety. Always set max_iterations on LoopAgent. Without it, a critic that never approves runs forever (or until your token budget runs out).

Model cost in loops. Each iteration of a LoopAgent consumes tokens for every sub-agent. Three iterations with two agents means at least six model calls, and more if the agents call tools. Keep max_iterations low and use cheaper models for critics.
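
The same arithmetic gives a quick baseline for any loop before it is deployed. The helper below is a hypothetical back-of-the-envelope estimate: it counts one invocation per LLM agent per iteration and ignores the extra model round trips that tool calls add.

```python
def baseline_agent_invocations(llm_agents_per_iteration: int, max_iterations: int) -> int:
    """Worst-case LLM agent invocations for one loop run, excluding tool round trips."""
    if llm_agents_per_iteration < 0 or max_iterations < 0:
        raise ValueError("counts must be non-negative")
    return llm_agents_per_iteration * max_iterations


# Writer + critic, three iterations.
print(baseline_agent_invocations(2, 3))

# The combined example's quality loop, assuming its four leaf agents
# (two lookups, the drafter, the reviewer) are all LLM agents, two iterations.
print(baseline_agent_invocations(4, 2))
```

Nesting a whole pipeline inside a loop, as the combined example does, multiplies every agent in that pipeline by the iteration cap.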

Workflow agents are deterministic. SequentialAgent, ParallelAgent, and LoopAgent don't use an LLM for flow control. The execution order is fixed in code. Only LlmAgent with sub_agents uses the model to decide routing.

Test locally first. ADK's CLI (adk run) and dev UI let you visualize multi-agent execution flows, inspect state at each step, and debug routing decisions before deploying to Agent Engine.

⏭️ What's Next

This is Post 3 of the GCP AI Agents with Terraform series.

Post 1: Deploy First Vertex AI Agent 🤖
Post 2: Agent Tools - Connect to APIs 🔌
Post 3: Multi-Agent Systems (you are here) 🧠
Post 4: Agent + Google Search Grounding

Your single agent is now a team. Sequential pipelines, parallel fan-out, iterative refinement, and LLM-driven delegation - ADK gives you the building blocks. Terraform provisions the infrastructure. Python defines the workflow. 🧠

Found this helpful? Follow for the full AI Agents with Terraform series! 💬

Multi-Agent Systems on GCP: Workflow Patterns with ADK and Terraform 🧠
