Orchestrator Architecture
last updated: 2026-02-27
This chapter describes a minimal, production-grade agent orchestrator for the Scintilla Locate Protocol Factory.
Design doctrine:
- Specs are authority. Orchestrator proposes; locate-sdd-core + humans decide.
- Deterministic transforms are tool-driven.
- Model-agnostic routing is policy-based.
- Every run is auditable (immutable run ledger).
- Principle of least privilege (secrets never go to LLMs).
Reference architecture (source draft)
Scintilla Locate — AWS Agent Orchestrator Architecture (Minimal, Production-Grade)
Version: Draft v1
Date: 2026-02-14
Scope: AI-agent orchestration layer that produces proposals and artifacts under SDD gates.
Non-goal: replace CI, replace locate-sdd-core, or create a monolithic platform.
0. Design Doctrine
- Specs are authority. Orchestrator proposes; locate-sdd-core + humans decide.
- Deterministic transforms are tool-driven. Canonicalization, hashing, Merkle, IR build are deterministic tools.
- Model-agnostic by default. Routing is policy-based; models are replaceable executors.
- Every run is auditable. Immutable run log: prompts, tool inputs/outputs, artifacts, hashes, approvals.
- Principle of least privilege. Repo write access is tightly scoped; secrets never go to LLMs.
1. High-Level Architecture
Last updated: 2026-03-01
The orchestrator coordinates Planner → Builder → Verifier work under policy control:
- Planner produces a checkpointed plan and required artifacts.
- Builder implements the plan on a feature branch and generates evidence artifacts.
- Verifier runs deterministic checks (spec lint, vectors, tests) and produces a pass/fail outcome.
- The orchestrator writes an append-only run ledger, uploads an evidence bundle, and opens a PR/MR with a human approval checklist (Spec / Plan / Risk / Release).
Agents do not merge. Humans decide finalization.
2. Core Components
2.1 Event Ingestion
- Jira webhooks: issue created/updated, status transition, comment added, label changes.
- GitLab/GitHub webhooks: PR/MR opened, check failures, pipeline completion, review approvals.
Implementation:
- Amazon EventBridge bus as the canonical event backbone.
- Lightweight Lambda normalizers that convert each vendor payload to a common event schema.
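A normalizer of this kind might look roughly like the sketch below. The common schema (NormalizedEvent and its four fields) and the function names are hypothetical, since the draft does not define the schema; the payload keys used (webhookEvent and issue.key for Jira, object_kind and object_attributes.url for GitLab) should be checked against the vendors' current webhook documentation.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedEvent:
    # Hypothetical common event schema; the draft does not specify one.
    source: str       # "jira" or "gitlab"
    event_type: str   # vendor event name, passed through unchanged
    subject: str      # issue key or MR URL
    raw_sha256: str   # hash of the payload, kept for the audit trail


def _digest(payload: dict) -> str:
    # Sorted keys and fixed separators make the hash independent of key order.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def normalize_jira(payload: dict) -> NormalizedEvent:
    return NormalizedEvent(
        "jira", payload["webhookEvent"], payload["issue"]["key"], _digest(payload)
    )


def normalize_gitlab(payload: dict) -> NormalizedEvent:
    return NormalizedEvent(
        "gitlab",
        payload["object_kind"],
        payload["object_attributes"]["url"],
        _digest(payload),
    )
```

Hashing the raw payload at ingestion gives later ledger entries a stable pointer back to exactly what the vendor sent.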
2.2 Orchestration API
Responsibilities:
- Start runs (POST /runs)
- Query run status (GET /runs/{id})
- Retrieve artifacts (GET /runs/{id}/artifacts)
- Apply human approvals (signed decisions)
Security:
- Cognito for humans; IAM roles for services.
- Fine-grained permissions by run type: spec, code, ops, docs, economics.
2.3 Workflow Engine
Minimal viable choice: AWS Step Functions.
- Durable orchestration
- Retries/backoff
- Clear run state machine
- Integrates well with ECS tasks and Lambda tool calls
Workflows (examples):
- SpecChangeRun
- ImplementationRun
- ConformanceUpdateRun
- ReleasePrepRun
2.4 Agent Runtime
Prefer ECS Fargate:
- Per-run isolation
- No cluster ops overhead
- Easy IAM task roles
- Controlled egress
Each agent task:
- Runs from a pinned container image
- Mounts workspace (ephemeral EFS or S3 checkout)
- Has a strict “tool contract”
- Logs to CloudWatch + writes structured log entries to S3
2.5 Model Router
A small policy engine:
- Chooses provider/model based on role + task + repo classification
- Enforces:
  - Max tokens
  - Max cost
  - Disallowed data classes (secrets, private keys)
  - Approved providers per repo sensitivity
Recommended pattern:
- A simple service (router) with rule file stored in Git: routing-policy.yaml
- Deploy as ECS service behind internal ALB.
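The routing decision described above can be sketched as a first-match rule lookup. Everything concrete here is an assumption for illustration: the policy is shown as an in-memory dict rather than a parsed routing-policy.yaml (to stay within the standard library), the key names are hypothetical, and the provider and model names are placeholders.

```python
from dataclasses import dataclass

# Hypothetical shape of routing-policy.yaml after parsing; first matching rule wins.
POLICY = {
    "version": "example-1",
    "disallowed_data_classes": ["secrets", "private_keys"],
    "rules": [
        {"role": "planner", "repo_scope": "proprietary",
         "provider": "provider-a", "model": "model-large",
         "max_tokens": 8000, "max_cost_usd": 2.0},
        {"role": "*", "repo_scope": "public",
         "provider": "provider-b", "model": "model-small",
         "max_tokens": 4000, "max_cost_usd": 0.5},
    ],
}


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    max_tokens: int
    max_cost_usd: float
    policy_version: str  # recorded so the run ledger can cite the policy used


def route(role: str, repo_scope: str, data_classes: list, policy: dict = POLICY) -> Route:
    # Refuse outright if the request carries a disallowed data class.
    blocked = set(data_classes) & set(policy["disallowed_data_classes"])
    if blocked:
        raise PermissionError(f"disallowed data classes: {sorted(blocked)}")
    for rule in policy["rules"]:
        if rule["role"] in (role, "*") and rule["repo_scope"] == repo_scope:
            return Route(rule["provider"], rule["model"], rule["max_tokens"],
                         rule["max_cost_usd"], policy["version"])
    # No default route: an unmatched request fails closed.
    raise LookupError(f"no routing rule for role={role!r}, repo_scope={repo_scope!r}")
```

Failing closed on an unmatched request is a design choice in this sketch, not something the draft mandates; it keeps a missing rule from silently sending proprietary content to an unapproved provider.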
2.6 Tool Services (Deterministic + Repo Operations)
Key rule: LLMs do not canonicalize or hash. Tools do.
Tools exposed to agent containers via local binaries or service calls:
- locate-sdd (spec lint, build-ir, verify, watcher-check, simulate-envelope)
- git operations wrapper (read-only vs write-scoped)
- diff summarizer and structured change classifier
- optional: solc/forge/hardhat, cargo test, go test
2.7 Artifact Store + Run Ledger
- S3: artifacts (patches, docs, test vectors, IR outputs, conformance reports)
- DynamoDB: run metadata, state, pointers, approvals
- KMS: encryption keys; per-run data keys
- Optional: QLDB for tamper-evidence for governance-critical runs
Artifacts are content-addressed when possible:
- store sha256 for each artifact
- store spec commit hash + IR hash
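As a minimal illustration of content addressing, the sketch below derives an object key from the artifact's SHA-256 digest. The key layout (artifacts/<type>/<digest>) and the function name are assumptions; the draft does not prescribe an S3 key scheme.

```python
import hashlib


def artifact_key(data: bytes, artifact_type: str) -> tuple[str, str]:
    """Return (sha256 hex digest, hypothetical object key) for an artifact."""
    digest = hashlib.sha256(data).hexdigest()
    # Identical bytes always map to the same key, so re-uploads are idempotent.
    return digest, f"artifacts/{artifact_type}/{digest}"
```

Because the key is a function of the bytes alone, a verifier can re-hash a downloaded artifact and compare it with the digest stored in the run ledger.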
3. Trust & Safety Boundaries
3.1 Secrets Handling
- Secrets stay in AWS Secrets Manager.
- Agents receive short-lived credentials via IAM task role.
- LLM prompts must never contain secrets.
- Tool output that might contain secrets is redacted before model calls.
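The redaction step could be approximated with pattern-based scrubbing, as in the sketch below. The two patterns (AWS access key IDs and PEM private key blocks) are illustrative only and far from exhaustive; a real deployment would need a broader, maintained rule set, and the function name is hypothetical.

```python
import re

# Illustrative patterns only; this is not a complete secret detector.
_PATTERNS = [
    # AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # PEM-encoded private key block, including multi-line bodies.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]


def redact(text: str) -> str:
    """Replace anything matching a known secret pattern before a model call."""
    for pattern in _PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```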
3.2 Repo Write Permissions
- Use GitHub App / GitLab bot accounts with repo-scoped access.
- Split permissions:
  - read-bot for analysis
  - write-bot for branch + PR/MR only (no merge)
- Enforce: orchestrator cannot push to main.
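A client-side guard for these rules might be sketched as follows. It is a defense-in-depth check inside the git operations wrapper, assuming the bot names above are passed in as plain strings; it does not replace server-side branch protection on the Git host, and the function name is hypothetical.

```python
PROTECTED_BRANCHES = frozenset({"main"})


def check_push(bot: str, branch: str) -> None:
    """Raise PermissionError unless this bot may push to this branch."""
    if bot != "write-bot":
        # read-bot (and anything unknown) never pushes.
        raise PermissionError(f"{bot} has no write access")
    if branch in PROTECTED_BRANCHES:
        raise PermissionError(f"pushes to {branch} are not allowed")
```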
3.3 Deterministic Gates
Orchestrator must always run:
- locate-sdd spec lint
- locate-sdd build-ir
- locate-sdd verify
- locate-sdd watcher-check (when relevant)
These run before opening a PR/MR (or a failure report is attached).
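The gate sequence can be sketched as a small runner. The subcommand names come from the list above, but invoking them with no further arguments and treating exit code 0 as a pass are assumptions about the locate-sdd CLI. The runner is injectable so the logic can be exercised without the tool installed.

```python
import subprocess
from typing import Callable, Sequence

GATES = [
    ["locate-sdd", "spec", "lint"],
    ["locate-sdd", "build-ir"],
    ["locate-sdd", "verify"],
]
WATCHER_GATE = ["locate-sdd", "watcher-check"]


def _run(cmd: Sequence[str]) -> int:
    # Assumption: a zero exit code means the gate passed.
    return subprocess.run(list(cmd), check=False).returncode


def run_gates(include_watcher: bool,
              runner: Callable[[Sequence[str]], int] = _run) -> dict:
    """Run every gate (no early exit) and return a report for the PR/MR."""
    gates = GATES + ([WATCHER_GATE] if include_watcher else [])
    results = {" ".join(cmd): runner(cmd) for cmd in gates}
    return {"passed": all(rc == 0 for rc in results.values()), "results": results}
```

Running all gates even after a failure is a choice made in this sketch so that the attached failure report is complete; stopping at the first failure would be equally consistent with the rule above.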
4. Minimal Data Model (Run Ledger)
Run record fields:
- run_id
- run_type (spec_change, impl_change, conformance_update, release_prep, watcher_investigation)
- source (jira_issue_key, pr_url, mr_url)
- repo_scope (public, proprietary)
- inputs (links + hashes)
- policy_version
- model_calls[] (provider, model, purpose, pointers)
- artifacts[] (s3_uri, sha256, type)
- approvals[] (who/when/decision)
- status (queued/running/awaiting_human/failed/succeeded)
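One way to pin these fields down in code is a validated dataclass, sketched below. The field names and enumerated values are taken from the list above; the Python types, the defaults, and the validation in __post_init__ are assumptions, and the mapping to a DynamoDB item is left open.

```python
from dataclasses import dataclass, field

RUN_TYPES = {"spec_change", "impl_change", "conformance_update",
             "release_prep", "watcher_investigation"}
REPO_SCOPES = {"public", "proprietary"}
STATUSES = {"queued", "running", "awaiting_human", "failed", "succeeded"}


@dataclass
class RunRecord:
    run_id: str
    run_type: str
    source: dict           # e.g. {"jira_issue_key": "..."}
    repo_scope: str
    inputs: list           # links + hashes
    policy_version: str
    model_calls: list = field(default_factory=list)  # provider, model, purpose, pointers
    artifacts: list = field(default_factory=list)    # s3_uri, sha256, type
    approvals: list = field(default_factory=list)    # who, when, decision
    status: str = "queued"

    def __post_init__(self) -> None:
        # Reject values outside the enumerations listed in the data model.
        if self.run_type not in RUN_TYPES:
            raise ValueError(f"unknown run_type: {self.run_type}")
        if self.repo_scope not in REPO_SCOPES:
            raise ValueError(f"unknown repo_scope: {self.repo_scope}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
```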
5. Deployment Topology (Minimal)
- VPC with private subnets for ECS tasks
- NAT Gateway (or egress proxy) with allowlisted destinations
- API Gateway, public (or private)
- EventBridge + Step Functions
- S3 + DynamoDB + KMS
Observability:
- CloudWatch Logs + Metrics
- X-Ray tracing for API/workflows
- Alarms: run failure rate, provider errors, CI mismatch rate
6. Build-vs-Buy Guidance
Build (small):
- Event normalization
- Run ledger
- Routing policy engine
- Step Functions workflows
- Repo bot integration
- Deterministic tool integration (locate-sdd)
Use managed:
- Bedrock models where appropriate
- Managed Jira/Git providers
- CloudWatch
Avoid monolithic “agent platforms” that cannot guarantee:
- dual GitHub+GitLab
- tool-first determinism
- immutable run logging
- strict policy routing
7. Minimal MVP Plan (4–6 weeks)
Week 1–2:
- EventBridge + webhook ingestion (Jira + GitLab)
- Run ledger (DynamoDB) + artifacts (S3)
- Step Functions skeleton for SpecChangeRun
Week 3–4:
- ECS agent runner with locate-sdd
- Repo bot integration (branch + MR creation)
- Model router v1 (static policy file)
Week 5–6:
- Jira comment/status updates
- Approval gates (“awaiting human”)
- Extend to GitHub PRs and ImplementationRun
8. Interface to locate-sdd-core
The orchestrator treats locate-sdd-core as a black box:
- Inputs: repo checkout + conformance corpus version
- Outputs: pass/fail + artifacts + hashes
This separation preserves discipline and prevents vendor coupling.