Orchestrator Architecture

Last updated: 2026-02-27

This chapter describes a minimal, production-grade agent orchestrator for the Scintilla Locate Protocol Factory.

Design doctrine:

Specs are authority. Orchestrator proposes; locate-sdd-core + humans decide.
Deterministic transforms are tool-driven.
Model-agnostic routing is policy-based.
Every run is auditable (immutable run ledger).
Principle of least privilege (secrets never go to LLMs).

Reference architecture (source draft)

Scintilla Locate — AWS Agent Orchestrator Architecture (Minimal, Production-Grade)

Version: Draft v1
Date: 2026-02-14
Scope: AI-agent orchestration layer that produces proposals and artifacts under SDD gates.
Non-goal: replace CI, replace locate-sdd-core, or create a monolithic platform.

0. Design Doctrine

Specs are authority. Orchestrator proposes; locate-sdd-core + humans decide.
Deterministic transforms are tool-driven. Canonicalization, hashing, Merkle, IR build are deterministic tools.
Model-agnostic by default. Routing is policy-based; models are replaceable executors.
Every run is auditable. Immutable run log: prompts, tool inputs/outputs, artifacts, hashes, approvals.
Principle of least privilege. Repo write access is tightly scoped; secrets never go to LLMs.

1. High-Level Architecture

Last updated: 2026-03-01

AI Software Factory Orchestrator Architecture

The orchestrator coordinates Planner → Builder → Verifier work under policy control:

Planner produces a checkpointed plan and required artifacts.
Builder implements the plan on a feature branch and generates evidence artifacts.
Verifier runs deterministic checks (spec lint, vectors, tests) and produces a pass/fail outcome.
The orchestrator writes an append-only run ledger, uploads an evidence bundle, and opens a PR/MR with a human approval checklist (Spec / Plan / Risk / Release).

Agents do not merge. Humans decide finalization.

2. Core Components

2.1 Event Ingestion

Jira webhooks: issue created/updated, status transition, comment added, label changes.
GitLab/GitHub webhooks: PR/MR opened, check failures, pipeline completion, review approvals.

Implementation:

Amazon EventBridge bus as the canonical event backbone.
Lightweight Lambda normalizers that convert each vendor payload to a common event schema.
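
The normalizer step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual schema: the NormalizedEvent fields, the function names, and the handful of payload keys read from each vendor are assumptions chosen for the example, and real webhook payloads carry far more data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Mapping


@dataclass(frozen=True)
class NormalizedEvent:
    """Hypothetical common event schema; field names are illustrative."""
    source: str                # "jira" or "gitlab"
    kind: str                  # e.g. "issue_updated", "merge_request_open"
    subject: str               # issue key or MR URL
    raw: Mapping[str, Any]     # original payload, kept for the run ledger


def normalize_jira(payload: Mapping[str, Any]) -> NormalizedEvent:
    # Assumes the payload carries "webhookEvent" and "issue.key".
    kind = str(payload.get("webhookEvent", "unknown")).replace("jira:", "")
    subject = payload.get("issue", {}).get("key", "")
    return NormalizedEvent("jira", kind, subject, payload)


def normalize_gitlab(payload: Mapping[str, Any]) -> NormalizedEvent:
    # Assumes the payload carries "object_kind" and "object_attributes".
    attrs = payload.get("object_attributes", {})
    kind = f'{payload.get("object_kind", "unknown")}_{attrs.get("action", "unknown")}'
    return NormalizedEvent("gitlab", kind, attrs.get("url", ""), payload)


NORMALIZERS: Dict[str, Callable[[Mapping[str, Any]], NormalizedEvent]] = {
    "jira": normalize_jira,
    "gitlab": normalize_gitlab,
}


def normalize(source: str, payload: Mapping[str, Any]) -> NormalizedEvent:
    """Dispatch a vendor payload to its normalizer; unknown sources are rejected."""
    try:
        handler = NORMALIZERS[source]
    except KeyError:
        raise ValueError(f"no normalizer registered for source {source!r}") from None
    return handler(payload)
```

Keeping one small function per vendor means a new webhook source only adds an entry to the registry; downstream workflows see the common schema only.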

2.2 Orchestration API

Responsibilities:

Start runs (POST /runs)
Query run status (GET /runs/{id})
Retrieve artifacts (GET /runs/{id}/artifacts)
Apply human approvals (signed decisions)

Security:

Cognito for humans; IAM roles for services.
Fine-grained permissions by run type: spec, code, ops, docs, economics.

2.3 Workflow Engine

Minimal viable choice: AWS Step Functions.

Durable orchestration
Retries/backoff
Clear run state machine
Integrates well with ECS tasks and Lambda tool calls

Workflows (examples):

SpecChangeRun
ImplementationRun
ConformanceUpdateRun
ReleasePrepRun
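
As a rough illustration of what a SpecChangeRun state machine might look like, the sketch below builds an Amazon States Language definition as a Python dict and checks that every transition points at a defined state. The state names, the retry values, and the Lambda ARNs are placeholders invented for this example; agent steps running on ECS would need a different Resource plus Parameters, which are omitted here.

```python
import json
from typing import Any, Dict, Optional, Set

# Retry with exponential backoff; the numbers are illustrative, not tuned.
RETRY = [{
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 5,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}]


def task(resource: str, next_state: Optional[str] = None) -> Dict[str, Any]:
    """Build one Task state; a state without a successor ends the workflow."""
    state: Dict[str, Any] = {"Type": "Task", "Resource": resource, "Retry": RETRY}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


def spec_change_run() -> Dict[str, Any]:
    # Placeholder ARNs: REGION and ACCOUNT_ID are not real values.
    arn = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{}".format
    return {
        "Comment": "SpecChangeRun skeleton (illustrative)",
        "StartAt": "Plan",
        "States": {
            "Plan": task(arn("planner"), "Build"),
            "Build": task(arn("builder"), "Verify"),
            "Verify": task(arn("verifier"), "OpenPullRequest"),
            "OpenPullRequest": task(arn("open-pr")),
        },
    }


def dangling_targets(definition: Dict[str, Any]) -> Set[str]:
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    targets |= {s["Next"] for s in states.values() if "Next" in s}
    return targets - set(states)


if __name__ == "__main__":
    definition = spec_change_run()
    assert not dangling_targets(definition)
    print(json.dumps(definition, indent=2))
```

Generating the definition from code keeps the retry policy in one place and lets a unit test catch broken transitions before deployment.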

2.4 Agent Runtime

Prefer ECS Fargate:

Per-run isolation
No cluster ops overhead
Easy IAM task roles
Controlled egress

Each agent task:

Runs from a pinned container image
Mounts workspace (ephemeral EFS or S3 checkout)
Has a strict “tool contract”
Logs to CloudWatch + writes structured log entries to S3

2.5 Model Router

A small policy engine:

Chooses provider/model based on role + task + repo classification
Enforces:
- Max tokens
- Max cost
- Disallowed data classes (secrets, private keys)
- Approved providers per repo sensitivity

Recommended pattern:

A simple service (router) with rule file stored in Git:
- routing-policy.yaml
Deploy as ECS service behind internal ALB.
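
A minimal sketch of the routing decision is shown below, assuming routing-policy.yaml has already been parsed into a dict. The policy layout, provider and model names, limits, and the choose_route function are all hypothetical; the source only specifies what the router must enforce, not how the rule file is structured.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    max_tokens: int
    max_cost_usd: float


class RoutingError(Exception):
    """Raised when policy forbids the call or no rule matches."""


# Hypothetical parsed contents of routing-policy.yaml.
POLICY: Dict[str, Any] = {
    "approved_providers": {
        "public": {"provider-a", "provider-b"},
        "proprietary": {"provider-a"},
    },
    "disallowed_data_classes": {"secrets", "private_keys"},
    "routes": [
        {"role": "planner", "provider": "provider-b", "model": "model-large",
         "max_tokens": 8000, "max_cost_usd": 2.0},
        {"role": "planner", "provider": "provider-a", "model": "model-large",
         "max_tokens": 8000, "max_cost_usd": 3.0},
        {"role": "verifier", "provider": "provider-a", "model": "model-small",
         "max_tokens": 2000, "max_cost_usd": 0.5},
    ],
}


def choose_route(role: str, repo_scope: str, data_classes: Iterable[str],
                 policy: Dict[str, Any] = POLICY) -> Route:
    """Pick the first rule matching the role whose provider is approved for the repo."""
    blocked = set(data_classes) & policy["disallowed_data_classes"]
    if blocked:
        raise RoutingError(f"disallowed data classes in request: {sorted(blocked)}")
    approved = policy["approved_providers"].get(repo_scope, set())
    for rule in policy["routes"]:
        if rule["role"] == role and rule["provider"] in approved:
            return Route(rule["provider"], rule["model"],
                         rule["max_tokens"], rule["max_cost_usd"])
    raise RoutingError(f"no approved route for role={role!r} scope={repo_scope!r}")
```

Rule order doubles as preference order in this sketch, so changing the preferred model for a role is a reviewed change to the rule file rather than to router code.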

2.6 Tool Services (Deterministic + Repo Operations)

Key rule: LLMs do not canonicalize or hash. Tools do.

Tools exposed to agent containers via local binaries or service calls:

locate-sdd (spec lint, build-ir, verify, watcher-check, simulate-envelope)
git operations wrapper (read-only vs write-scoped)
diff summarizer and structured change classifier
optional: solc/forge/hardhat, cargo test, go test

2.7 Artifact Store + Run Ledger

S3: artifacts (patches, docs, test vectors, IR outputs, conformance reports)
DynamoDB: run metadata, state, pointers, approvals
KMS: encryption keys; per-run data keys
Optional: QLDB for tamper-evidence for governance-critical runs

Artifacts are content-addressed when possible:

store sha256 for each artifact
store spec commit hash + IR hash
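
Content addressing can be sketched with the standard library alone. The ArtifactRecord fields mirror the artifacts[] entry in the run ledger (s3_uri, sha256, type); the bucket name and key layout are assumptions made for the example, and the upload itself is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRecord:
    s3_uri: str
    sha256: str
    type: str


def record_artifact(data: bytes, artifact_type: str,
                    bucket: str = "example-artifacts") -> ArtifactRecord:
    """Derive the storage key from the content hash (hypothetical key layout)."""
    digest = hashlib.sha256(data).hexdigest()
    return ArtifactRecord(
        s3_uri=f"s3://{bucket}/sha256/{digest}",
        sha256=digest,
        type=artifact_type,
    )
```

Because the key is derived from the bytes, storing the same artifact twice yields the same URI, and a reader can verify an object by rehashing it and comparing against the ledger entry.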

3. Trust & Safety Boundaries

3.1 Secrets Handling

Secrets stay in AWS Secrets Manager.
Agents receive short-lived credentials via IAM task role.
LLM prompts must never contain secrets.
Tool output that might contain secrets is redacted before model calls.
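
The redaction step might look something like the following sketch. The three patterns (AWS access key IDs, PEM private key blocks, and simple key=value credentials) are illustrative assumptions and far from exhaustive; a real deployment would likely pair pattern matching with a dedicated secret scanner.

```python
import re

# Illustrative patterns only; not a complete secret taxonomy.
_PATTERNS = [
    # AWS access key ID: "AKIA" followed by 16 uppercase alphanumerics.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # PEM-encoded private key blocks, including the body.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    # Simple "password: value" or "token=value" assignments on one line.
    re.compile(r"(?i)\b(password|token|secret)[ \t]*[=:][ \t]*\S+"),
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything matching a known secret pattern before a model call."""
    for pattern in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Running every tool output through one function like this gives a single choke point to audit and test, rather than relying on each agent to remember to scrub its own prompts.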

3.2 Repo Write Permissions

Use GitHub App / GitLab bot accounts with repo-scoped access.
Split permissions:
- read-bot for analysis
- write-bot for branch + PR/MR only (no merge)
Enforce: orchestrator cannot push to main.
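
A client-side guard for the rule above could be sketched as follows. The agent/ branch prefix and the function name are assumptions for illustration; a check like this would complement, not replace, branch protection configured on the Git provider.

```python
PROTECTED_BRANCHES = frozenset({"main"})
WRITE_PREFIX = "agent/"  # hypothetical naming convention for orchestrator branches


def assert_push_allowed(bot: str, branch: str) -> None:
    """Raise PermissionError unless the write bot targets an orchestrator branch."""
    if bot != "write-bot":
        raise PermissionError(f"{bot} has no write access")
    if branch in PROTECTED_BRANCHES:
        raise PermissionError(f"push to protected branch {branch!r} is forbidden")
    if not branch.startswith(WRITE_PREFIX):
        raise PermissionError(f"branch {branch!r} is outside {WRITE_PREFIX!r}")
```

Calling the guard inside the git operations wrapper means a misbehaving agent fails fast locally, before the provider ever sees the push.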

3.3 Deterministic Gates

Before opening a PR/MR, the orchestrator must always run the following gates (or attach a failure report):

locate-sdd spec lint
locate-sdd build-ir
locate-sdd verify
locate-sdd watcher-check (when relevant)
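
The gate sequence can be sketched as a small runner. The command words are taken verbatim from the list above, but the real CLI may require further arguments; treating exit code 0 as a pass is also an assumption. The runner is injectable so the sequencing can be tested without the locate-sdd binary installed.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def subprocess_runner(argv: Sequence[str]) -> int:
    """Run a gate as a child process and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


@dataclass(frozen=True)
class GateResult:
    command: str
    exit_code: int

    @property
    def passed(self) -> bool:
        # Assumption: exit code 0 means the gate passed.
        return self.exit_code == 0


def run_gates(runner: Runner, watcher_relevant: bool = False) -> List[GateResult]:
    """Run every gate, even after a failure, so the report is complete."""
    commands = [
        ["locate-sdd", "spec", "lint"],
        ["locate-sdd", "build-ir"],
        ["locate-sdd", "verify"],
    ]
    if watcher_relevant:
        commands.append(["locate-sdd", "watcher-check"])
    return [GateResult(" ".join(cmd), runner(cmd)) for cmd in commands]


def failure_report(results: Sequence[GateResult]) -> List[str]:
    """List the commands that failed; an empty list means all gates passed."""
    return [r.command for r in results if not r.passed]
```

In production the orchestrator would pass subprocess_runner; if failure_report returns anything, that list is what gets attached to the PR/MR.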

4. Minimal Data Model (Run Ledger)

Run record fields:

run_id
run_type (spec_change, impl_change, conformance_update, release_prep, watcher_investigation)
source (jira_issue_key, pr_url, mr_url)
repo_scope (public, proprietary)
inputs (links + hashes)
policy_version
model_calls[] (provider, model, purpose, pointers)
artifacts[] (s3_uri, sha256, type)
approvals[] (who/when/decision)
status (queued/running/awaiting_human/failed/succeeded)
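
The run record maps naturally onto a dataclass. The field names below follow the list above; the Python types and the table of allowed status transitions are assumptions added for illustration, since the source names the statuses but not the legal moves between them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class Status(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    AWAITING_HUMAN = "awaiting_human"
    FAILED = "failed"
    SUCCEEDED = "succeeded"


# Hypothetical transition table; failed and succeeded are terminal.
ALLOWED = {
    Status.QUEUED: {Status.RUNNING, Status.FAILED},
    Status.RUNNING: {Status.AWAITING_HUMAN, Status.FAILED, Status.SUCCEEDED},
    Status.AWAITING_HUMAN: {Status.RUNNING, Status.FAILED, Status.SUCCEEDED},
    Status.FAILED: set(),
    Status.SUCCEEDED: set(),
}


@dataclass
class RunRecord:
    run_id: str
    run_type: str                  # e.g. "spec_change", "impl_change"
    source: Dict[str, str]         # jira_issue_key, pr_url, or mr_url
    repo_scope: str                # "public" or "proprietary"
    policy_version: str
    inputs: List[Dict[str, Any]] = field(default_factory=list)
    model_calls: List[Dict[str, Any]] = field(default_factory=list)
    artifacts: List[Dict[str, Any]] = field(default_factory=list)
    approvals: List[Dict[str, Any]] = field(default_factory=list)
    status: Status = Status.QUEUED

    def transition(self, new: Status) -> None:
        """Move to a new status, rejecting moves the table does not allow."""
        if new not in ALLOWED[self.status]:
            raise ValueError(f"illegal transition {self.status.value} -> {new.value}")
        self.status = new
```

Rejecting illegal transitions in one place keeps the ledger consistent with the workflow engine: a run that has reached a terminal status cannot be silently reopened.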

5. Deployment Topology (Minimal)

VPC with private subnets for ECS tasks
NAT Gateway (or egress proxy) with allowlisted destinations
API Gateway public (or private)
EventBridge + Step Functions
S3 + DynamoDB + KMS

Observability:

CloudWatch Logs + Metrics
X-Ray tracing for API/workflows
Alarms: run failure rate, provider errors, CI mismatch rate

6. Build-vs-Buy Guidance

Build (small):

Event normalization
Run ledger
Routing policy engine
Step Functions workflows
Repo bot integration
Deterministic tool integration (locate-sdd)

Use managed:

Bedrock models where appropriate
Managed Jira/Git providers
CloudWatch

Avoid monolithic “agent platforms” that cannot guarantee:

dual GitHub+GitLab
tool-first determinism
immutable run logging
strict policy routing

7. Minimal MVP Plan (4–6 weeks)

Week 1–2:

EventBridge + webhook ingestion (Jira + GitLab)
Run ledger (DynamoDB) + artifacts (S3)
Step Functions skeleton for SpecChangeRun

Week 3–4:

ECS agent runner with locate-sdd
Repo bot integration (branch + MR creation)
Model router v1 (static policy file)

Week 5–6:

Jira comment/status updates
Approval gates (“awaiting human”)
Extend to GitHub PRs and ImplementationRun

8. Interface to locate-sdd-core

The orchestrator treats locate-sdd-core as a black box:

Inputs: repo checkout + conformance corpus version
Outputs: pass/fail + artifacts + hashes

This separation preserves discipline and prevents vendor coupling.
