Orchestrator Architecture

Last updated: 2026-02-27

This chapter describes a minimal, production-grade agent orchestrator for the Scintilla Locate Protocol Factory.

Design doctrine:

  1. Specs are authority. Orchestrator proposes; locate-sdd-core + humans decide.
  2. Deterministic transforms are tool-driven.
  3. Model-agnostic routing is policy-based.
  4. Every run is auditable (immutable run ledger).
  5. Principle of least privilege (secrets never go to LLMs).

Reference architecture (source draft)

Scintilla Locate — AWS Agent Orchestrator Architecture (Minimal, Production-Grade)

Version: Draft v1
Date: 2026-02-14
Scope: AI-agent orchestration layer that produces proposals and artifacts under SDD gates.
Non-goals: replacing CI, replacing locate-sdd-core, or creating a monolithic platform.


0. Design Doctrine

  1. Specs are authority. Orchestrator proposes; locate-sdd-core + humans decide.
  2. Deterministic transforms are tool-driven. Canonicalization, hashing, Merkle, IR build are deterministic tools.
  3. Model-agnostic by default. Routing is policy-based; models are replaceable executors.
  4. Every run is auditable. Immutable run log: prompts, tool inputs/outputs, artifacts, hashes, approvals.
  5. Principle of least privilege. Repo write access is tightly scoped; secrets never go to LLMs.

1. High-Level Architecture

Figure: AI Software Factory Orchestrator Architecture (last updated: 2026-03-01)

The orchestrator coordinates Planner → Builder → Verifier work under policy control:

  • Planner produces a checkpointed plan and required artifacts.
  • Builder implements the plan on a feature branch and generates evidence artifacts.
  • Verifier runs deterministic checks (spec lint, vectors, tests) and produces a pass/fail outcome.
  • The orchestrator writes an append-only run ledger, uploads an evidence bundle, and opens a PR/MR with a human approval checklist (Spec / Plan / Risk / Release).

Agents do not merge. Humans decide finalization.


2. Core Components

2.1 Event Ingestion

  • Jira webhooks: issue created/updated, status transition, comment added, label changes.
  • GitLab/GitHub webhooks: PR/MR opened, check failures, pipeline completion, review approvals.

Implementation:

  • Amazon EventBridge bus as the canonical event backbone.
  • Lightweight Lambda normalizers that convert each vendor payload to a common event schema.
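
The normalizer step can be sketched roughly as follows. The NormalizedEvent schema and its field names are hypothetical, since the draft does not define the common event schema. The vendor fields read here (webhookEvent and issue.key for Jira, object_kind and object_attributes.url for GitLab, action and pull_request.html_url for GitHub) are the ones commonly present in those webhook payloads, but should be checked against the payloads actually received.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class NormalizedEvent:
    """Hypothetical common event schema; field names are illustrative."""
    source: str   # "jira", "gitlab", or "github"
    kind: str     # vendor event type, e.g. "jira:issue_updated"
    subject: str  # issue key or PR/MR URL
    raw: dict     # original payload, kept so the run ledger can reference it


def normalize(source: str, payload: dict[str, Any]) -> NormalizedEvent:
    """Map one vendor webhook payload to the common schema."""
    if source == "jira":
        return NormalizedEvent(
            source, payload["webhookEvent"], payload["issue"]["key"], payload)
    if source == "gitlab":
        attrs = payload["object_attributes"]
        return NormalizedEvent(source, payload["object_kind"], attrs["url"], payload)
    if source == "github":
        pr = payload["pull_request"]
        kind = f"pull_request.{payload['action']}"
        return NormalizedEvent(source, kind, pr["html_url"], payload)
    raise ValueError(f"unsupported source: {source}")
```

In a Lambda deployment, each vendor would likely get its own handler that calls a function like this and puts the result on the EventBridge bus; that wiring is omitted here.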

2.2 Orchestration API

Responsibilities:

  • Start runs (POST /runs)
  • Query run status (GET /runs/{id})
  • Retrieve artifacts (GET /runs/{id}/artifacts)
  • Apply human approvals (signed decisions)
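
The draft does not say how decisions are signed. As one possible illustration, the sketch below signs an approval with HMAC-SHA256 over a canonical JSON encoding. The scheme, the function names, and the decision fields are all assumptions; a production system might instead use asymmetric signatures tied to the approver's identity.

```python
import hashlib
import hmac
import json


def sign_decision(decision: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the decision."""
    # sort_keys plus fixed separators make the encoding independent of dict order
    canonical = json.dumps(decision, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_decision(decision: dict, signature: str, key: bytes) -> bool:
    """Constant-time check that the signature matches the decision."""
    return hmac.compare_digest(sign_decision(decision, key), signature)
```

Any change to the decision after signing (for example, flipping "reject" to "approve") makes verification fail, which is the property the ledger needs from a signed approval.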

Security:

  • Cognito for humans; IAM roles for services.
  • Fine-grained permissions by run type: spec, code, ops, docs, economics.

2.3 Workflow Engine

Minimal viable choice: AWS Step Functions, which provides:

  • Durable orchestration
  • Retries/backoff
  • Clear run state machine
  • Integrates well with ECS tasks and Lambda tool calls

Workflows (examples):

  • SpecChangeRun
  • ImplementationRun
  • ConformanceUpdateRun
  • ReleasePrepRun
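
A workflow such as SpecChangeRun could be expressed in Amazon States Language along these lines. This is a sketch under several assumptions: the state names, the Lambda ARNs, the account ID, and the $.verdict field are placeholders, and the human gate is reduced to a terminal state, whereas a real workflow would more likely pause on a task-token callback. Only the ASL field names (Type, Resource, Retry, Choices, and so on) are standard.

```python
import json

# Retry policy applied to every task: exponential backoff on task failure.
RETRY = [{
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 5,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}]


def task(resource: str, next_state: str) -> dict:
    """Build one Task state that retries and then moves to next_state."""
    return {"Type": "Task", "Resource": resource, "Retry": RETRY, "Next": next_state}


def spec_change_run(account: str = "123456789012", region: str = "us-east-1") -> dict:
    """Return a hypothetical SpecChangeRun definition as a Python dict."""
    arn = f"arn:aws:lambda:{region}:{account}:function:"
    return {
        "Comment": "SpecChangeRun sketch: Planner -> Builder -> Verifier",
        "StartAt": "Planner",
        "States": {
            "Planner": task(arn + "planner", "Builder"),
            "Builder": task(arn + "builder", "Verifier"),
            "Verifier": task(arn + "verifier", "GateResult"),
            "GateResult": {
                "Type": "Choice",
                "Choices": [{"Variable": "$.verdict",
                             "StringEquals": "pass",
                             "Next": "AwaitingHuman"}],
                "Default": "Failed",
            },
            # Placeholder for the human approval step
            "AwaitingHuman": {"Type": "Succeed"},
            "Failed": {"Type": "Fail", "Error": "GateFailed",
                       "Cause": "Deterministic checks did not pass"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(spec_change_run(), indent=2))
```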

2.4 Agent Runtime

Prefer ECS Fargate:

  • Per-run isolation
  • No cluster ops overhead
  • Easy IAM task roles
  • Controlled egress

Each agent task:

  • Runs from a pinned container image
  • Mounts workspace (ephemeral EFS or S3 checkout)
  • Has a strict “tool contract”
  • Logs to CloudWatch and writes structured log entries to S3

2.5 Model Router

A small policy engine:

  • Chooses provider/model based on role + task + repo classification
  • Enforces:
    • Max tokens
    • Max cost
    • Disallowed data classes (secrets, private keys)
    • Approved providers per repo sensitivity

Recommended pattern:

  • A simple service (router) with rule file stored in Git:
    • routing-policy.yaml
  • Deploy as ECS service behind internal ALB.
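
To make the policy engine concrete, here is a minimal sketch of first-match rule evaluation. The policy layout, the provider and model names, and the limits are invented for illustration; the draft names routing-policy.yaml but does not show its contents. The policy is written as a Python dict to keep the example dependency-free, and the task dimension is left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical policy; in the pattern above it would be loaded from
# routing-policy.yaml. Provider and model names are placeholders.
POLICY = {
    "version": "v1",
    "blocked_data_classes": {"secrets", "private_keys"},
    "rules": [
        {"role": "planner", "repo": "proprietary", "provider": "provider-a",
         "model": "model-large", "max_tokens": 8000, "max_cost_usd": 2.0},
        {"role": "*", "repo": "public", "provider": "provider-b",
         "model": "model-small", "max_tokens": 4000, "max_cost_usd": 0.5},
    ],
}


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    max_tokens: int
    max_cost_usd: float


def route(role: str, repo: str, data_classes: set[str], policy: dict = POLICY) -> Route:
    """Return the first matching rule, or raise if the request violates policy."""
    blocked = data_classes & policy["blocked_data_classes"]
    if blocked:
        raise PermissionError(f"disallowed data classes: {sorted(blocked)}")
    for rule in policy["rules"]:
        if rule["role"] in ("*", role) and rule["repo"] == repo:
            return Route(rule["provider"], rule["model"],
                         rule["max_tokens"], rule["max_cost_usd"])
    # No rule means no approved provider for this combination: fail closed.
    raise LookupError(f"no approved provider for role={role!r}, repo={repo!r}")
```

Failing closed when no rule matches is a deliberate choice in this sketch: an unlisted role and repo combination is treated as not approved rather than routed to a default model.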

2.6 Tool Services (Deterministic + Repo Operations)

Key rule: LLMs do not canonicalize or hash. Tools do.

Tools exposed to agent containers via local binaries or service calls:

  • locate-sdd (spec lint, build-ir, verify, watcher-check, simulate-envelope)
  • git operations wrapper (read-only vs write-scoped)
  • diff summarizer and structured change classifier
  • optional: solc/forge/hardhat, cargo test, go test

2.7 Artifact Store + Run Ledger

  • S3: artifacts (patches, docs, test vectors, IR outputs, conformance reports)
  • DynamoDB: run metadata, state, pointers, approvals
  • KMS: encryption keys; per-run data keys
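  • Optional: a tamper-evident ledger for governance-critical runs. The draft names QLDB, but AWS has announced end of support for QLDB, so confirm availability or plan an alternative (for example, S3 Object Lock).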
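
Tamper-evidence can also be approximated without a dedicated ledger database. The sketch below is not QLDB; it is a plain hash chain, offered as an illustration of the idea, in which each record commits to the hash of the one before it. The record layout and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(prev: str, entry: dict) -> str:
    """Hash the previous record's hash together with a canonical entry encoding."""
    body = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev + body).encode()).hexdigest()


def append(ledger: list[dict], entry: dict) -> None:
    """Append an entry that commits to the hash of the previous record."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "entry": entry, "hash": _digest(prev, entry)})


def verify(ledger: list[dict]) -> bool:
    """Recompute the chain; an edited or reordered record breaks it.

    Truncation at the tail is not detected without an external anchor,
    such as the latest hash stored somewhere the writer cannot modify.
    """
    prev = GENESIS
    for record in ledger:
        if record["prev"] != prev or record["hash"] != _digest(prev, record["entry"]):
            return False
        prev = record["hash"]
    return True
```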

Artifacts are content-addressed when possible:

  • store sha256 for each artifact
  • store spec commit hash + IR hash
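
Content addressing can be as simple as deriving the object key from the artifact's SHA-256 digest. In the sketch below the key layout (prefix, type, digest) is an assumption; the draft only requires that a sha256 be stored for each artifact.

```python
import hashlib


def artifact_key(data: bytes, artifact_type: str, prefix: str = "artifacts") -> tuple[str, str]:
    """Return (sha256 hex digest, object key) for an artifact's bytes."""
    digest = hashlib.sha256(data).hexdigest()
    # Identical bytes always map to the same key, so re-uploads are idempotent
    return digest, f"{prefix}/{artifact_type}/sha256/{digest}"
```

Because the key is a function of the content, a verifier can re-hash a downloaded artifact and compare it with both the key and the sha256 recorded in the run ledger.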

3. Trust & Safety Boundaries

3.1 Secrets Handling

  • Secrets stay in AWS Secrets Manager.
  • Agents receive short-lived credentials via IAM task role.
  • LLM prompts must never contain secrets.
  • Tool output that might contain secrets is redacted before model calls.
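
A redaction pass over tool output might look like the following. The patterns are illustrative assumptions, not a complete or reviewed list: they cover PEM private key blocks, strings shaped like AWS access key IDs, and simple name=value assignments for a few common secret names.

```python
import re

# Illustrative patterns only; a real deployment would maintain a reviewed list.
_PEM = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)
_AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
_ASSIGNMENT = re.compile(
    r"\b(password|secret|token|api[_-]?key)\b(\s*[=:]\s*)\S+",
    re.IGNORECASE,
)

MASK = "[REDACTED]"


def redact(text: str) -> str:
    """Mask likely secrets in tool output before it is placed in a prompt."""
    text = _PEM.sub(MASK, text)
    text = _AWS_KEY_ID.sub(MASK, text)
    # Keep the variable name and separator so the output stays readable
    return _ASSIGNMENT.sub(lambda m: f"{m.group(1)}{m.group(2)}{MASK}", text)
```

Pattern matching of this kind is a backstop, not a guarantee; the primary control remains keeping secrets out of the agent's workspace in the first place.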

3.2 Repo Write Permissions

  • Use GitHub App / GitLab bot accounts with repo-scoped access.
  • Split permissions:
    • read-bot for analysis
    • write-bot for branch + PR/MR only (no merge)
  • Enforce: orchestrator cannot push to main.
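
The no-push-to-main rule is best enforced server-side through branch protection and the bot's scoped permissions. A client-side guard in the git wrapper can add a second layer, sketched here. The protected branch set and the agent/ branch prefix are hypothetical conventions, not taken from the draft.

```python
PROTECTED_BRANCHES = {"main", "master"}  # assumed protected set
AGENT_PREFIX = "agent/"                  # hypothetical namespace for write-bot branches


def check_push(ref: str) -> str:
    """Return the branch name if write-bot may push to it, else raise."""
    branch = ref.removeprefix("refs/heads/")
    if branch in PROTECTED_BRANCHES:
        raise PermissionError(f"push to protected branch {branch!r} is not allowed")
    if not branch.startswith(AGENT_PREFIX):
        raise PermissionError(f"write-bot may only push under {AGENT_PREFIX!r}")
    return branch
```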

3.3 Deterministic Gates

Before opening a PR/MR, the orchestrator must always run the following gates, and attach a failure report if any of them fails:

  • locate-sdd spec lint
  • locate-sdd build-ir
  • locate-sdd verify
  • locate-sdd watcher-check (when relevant)
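
A gate runner for these commands could be sketched as below. The command strings are copied as written in this section; any additional flags or arguments that locate-sdd requires are unknown here, and treating exit code 0 as a pass is an assumption. The runner is injectable so the logic can be exercised without the binary installed.

```python
import shlex
import subprocess
from typing import Callable, Sequence

REQUIRED_GATES = ["locate-sdd spec lint", "locate-sdd build-ir", "locate-sdd verify"]
WATCHER_GATE = "locate-sdd watcher-check"

Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def run_gates(watcher_relevant: bool, runner: Runner = run_command) -> dict[str, bool]:
    """Run every gate (no early exit) so a failure report covers all of them."""
    commands = REQUIRED_GATES + ([WATCHER_GATE] if watcher_relevant else [])
    return {cmd: runner(shlex.split(cmd)) == 0 for cmd in commands}


def all_gates_passed(results: dict[str, bool]) -> bool:
    """True only if every gate that ran reported success."""
    return all(results.values())
```

Running all gates even after one fails is a choice made for this sketch, so that the attached failure report lists every failing check rather than only the first.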

4. Minimal Data Model (Run Ledger)

Run record fields:

  • run_id
  • run_type (spec_change, impl_change, conformance_update, release_prep, watcher_investigation)
  • source (jira_issue_key, pr_url, mr_url)
  • repo_scope (public, proprietary)
  • inputs (links + hashes)
  • policy_version
  • model_calls[] (provider, model, purpose, pointers)
  • artifacts[] (s3_uri, sha256, type)
  • approvals[] (who/when/decision)
  • status (queued/running/awaiting_human/failed/succeeded)
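
The run record maps naturally onto a small typed structure. In the sketch below the field names follow the list above, while the field types and the allowed status transitions are assumptions: the draft lists the states but not the edges between them.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    AWAITING_HUMAN = "awaiting_human"
    FAILED = "failed"
    SUCCEEDED = "succeeded"


# Assumed transition graph; failed and succeeded are terminal.
TRANSITIONS = {
    Status.QUEUED: {Status.RUNNING},
    Status.RUNNING: {Status.AWAITING_HUMAN, Status.FAILED, Status.SUCCEEDED},
    Status.AWAITING_HUMAN: {Status.RUNNING, Status.FAILED, Status.SUCCEEDED},
    Status.FAILED: set(),
    Status.SUCCEEDED: set(),
}


@dataclass
class RunRecord:
    run_id: str
    run_type: str        # e.g. "spec_change"
    source: dict         # jira_issue_key, pr_url, or mr_url
    repo_scope: str      # "public" or "proprietary"
    policy_version: str
    inputs: list = field(default_factory=list)       # links + hashes
    model_calls: list = field(default_factory=list)  # provider, model, purpose, pointers
    artifacts: list = field(default_factory=list)    # s3_uri, sha256, type
    approvals: list = field(default_factory=list)    # who, when, decision
    status: Status = Status.QUEUED

    def transition(self, new: Status) -> None:
        """Move to a new status, rejecting edges not in TRANSITIONS."""
        if new not in TRANSITIONS[self.status]:
            raise ValueError(f"illegal transition {self.status.value} -> {new.value}")
        self.status = new
```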

5. Deployment Topology (Minimal)

  • VPC with private subnets for ECS tasks
  • NAT Gateway (or egress proxy) with allowlisted destinations
  • API Gateway (public or private endpoint)
  • EventBridge + Step Functions
  • S3 + DynamoDB + KMS

Observability:

  • CloudWatch Logs + Metrics
  • X-Ray tracing for API/workflows
  • Alarms: run failure rate, provider errors, CI mismatch rate

6. Build-vs-Buy Guidance

Build (small):

  • Event normalization
  • Run ledger
  • Routing policy engine
  • Step Functions workflows
  • Repo bot integration
  • Deterministic tool integration (locate-sdd)

Use managed:

  • Bedrock models where appropriate
  • Managed Jira/Git providers
  • CloudWatch

Avoid monolithic “agent platforms” that cannot guarantee:

  • dual GitHub+GitLab
  • tool-first determinism
  • immutable run logging
  • strict policy routing

7. Minimal MVP Plan (4–6 weeks)

Weeks 1–2:

  • EventBridge + webhook ingestion (Jira + GitLab)
  • Run ledger (DynamoDB) + artifacts (S3)
  • Step Functions skeleton for SpecChangeRun

Weeks 3–4:

  • ECS agent runner with locate-sdd
  • Repo bot integration (branch + MR creation)
  • Model router v1 (static policy file)

Weeks 5–6:

  • Jira comment/status updates
  • Approval gates (“awaiting human”)
  • Extend to GitHub PRs and ImplementationRun

8. Interface to locate-sdd-core

The orchestrator treats locate-sdd-core as a black box:

  • Inputs: repo checkout + conformance corpus version
  • Outputs: pass/fail + artifacts + hashes

This separation keeps pass/fail authority in locate-sdd-core rather than in the orchestrator, and avoids coupling the gates to any agent or model vendor.