
ADR-0002 — Orchestrator Model: Phi-3-mini, with Device-Capability Matrix

Status

Accepted (2026-05-07). Source: plan.md §13, technical_spec.md §4 and §8.2.

Context

The orchestrator runs a LangGraph state machine on every tick that reaches the Deliberating state. It takes typed JSON tool outputs from the four agents, ranks candidate actions, decides whether to surface, and emits a Reasoning Trace. Within that flow the model's own job is small: short rationale prose, occasional natural-language slot suggestions, and occasional tool-call planning when no rule path resolves. It is not a chat partner.
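As a rough illustration of the rank / surface / trace flow described above, here is a minimal plain-Python sketch. It does not use LangGraph, and everything in it is an assumption made for illustration: the `agent`, `action`, and `score` fields, the `surface_threshold` parameter, and the shape of the trace are hypothetical and are not taken from plan.md or technical_spec.md.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    # Hypothetical shape of one agent's typed tool output.
    agent: str
    action: str
    score: float


def parse_tool_output(raw: str) -> Candidate:
    """Parse one JSON tool output into a typed candidate (assumed schema)."""
    data = json.loads(raw)
    return Candidate(str(data["agent"]), str(data["action"]), float(data["score"]))


def deliberate(raw_outputs: List[str], surface_threshold: float = 0.5) -> Dict:
    """Rank candidates, decide whether to surface, and return a trace dict."""
    ranked = sorted(
        (parse_tool_output(r) for r in raw_outputs),
        key=lambda c: c.score,
        reverse=True,
    )
    top = ranked[0] if ranked else None
    # Surface only when the best candidate clears the (assumed) threshold.
    surfaced = top is not None and top.score >= surface_threshold
    return {
        "ranked": [c.action for c in ranked],
        "chosen": top.action if surfaced else None,
        "surfaced": surfaced,
    }
```

In this sketch the rule path is fully deterministic; the model would only be consulted for the rationale prose attached to the trace, which matches the small role described above.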

Constraints:

Forces:

Decision

The orchestrator runs Phi-3-mini-4k-instruct, Q4 quantised, as the default model on all supported devices. Heavy fallback to Llama-3-8B Q4 is permitted only on Tier A devices when is_charging && battery > 50% && device_ram >= 8GB.
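The gating condition can be written as a single predicate. The sketch below is one possible reading: the `DeviceState` type and its field names are hypothetical, battery is assumed to be a percentage in the range 0 to 100, and RAM is assumed to be expressed in GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceState:
    # Hypothetical container; field names mirror the condition in the text.
    tier: str             # "A", "B", "C" or "D"
    is_charging: bool
    battery: float        # assumed: percentage, 0-100
    device_ram_gb: float  # assumed: physical RAM in GB


def heavy_fallback_allowed(state: DeviceState) -> bool:
    """True only when every gate for the Llama-3-8B Q4 fallback holds."""
    return (
        state.tier == "A"
        and state.is_charging
        and state.battery > 50       # strictly greater than 50%
        and state.device_ram_gb >= 8
    )
```

Because the comparison on battery is strict, a device at exactly 50% stays on Phi-3-mini under this reading.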

A device-capability matrix governs which model is loaded for which role:

| Tier | RAM | Orchestrator | Agent LLM | Heavy fallback |
|------|-----|--------------|-----------|----------------|
| A | ≥8 GB (iPhone 14 Pro, S22 Ultra, S23+, Pixel 8) | Llama-3-8B Q4 (gated) else Phi-3-mini | Gemma 2B Q4 + LoRA | Enabled |
| B | 6–8 GB (iPhone 14, S22 base, A55) | Phi-3-mini Q4 | Gemma 2B Q4 + LoRA (hot-swap) | Disabled |
| C | 4–6 GB (mid-range Galaxy A, older Pixel) | Gemma 2B Q4 | Rule + DistilBERT | Disabled |
| D | <4 GB | Unsupported | Block install | |
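The matrix above can also be expressed as a lookup table in code. This is a sketch only: `TierPolicy`, `UnsupportedDevice`, and `policy_for` are hypothetical names, Tier A is modelled as a Phi-3-mini default plus an optional heavy model, and Tier D is modelled as an error that the installer would translate into a blocked install.

```python
from typing import Dict, NamedTuple, Optional


class TierPolicy(NamedTuple):
    orchestrator: str             # model loaded by default for the orchestrator
    agent_llm: str                # model or rule path used by the agents
    heavy_model: Optional[str]    # gated heavy fallback, if the tier allows one


class UnsupportedDevice(Exception):
    """Raised for tiers the matrix marks as unsupported (Tier D)."""


# Assumed encoding of the capability matrix; strings copy the table entries.
MATRIX: Dict[str, TierPolicy] = {
    "A": TierPolicy("Phi-3-mini Q4", "Gemma 2B Q4 + LoRA", "Llama-3-8B Q4"),
    "B": TierPolicy("Phi-3-mini Q4", "Gemma 2B Q4 + LoRA (hot-swap)", None),
    "C": TierPolicy("Gemma 2B Q4", "Rule + DistilBERT", None),
}


def policy_for(tier: str) -> TierPolicy:
    """Return the policy for a tier, or raise if the tier is unsupported."""
    try:
        return MATRIX[tier]
    except KeyError:
        raise UnsupportedDevice(f"tier {tier!r} is not supported") from None
```

Keeping Tier D out of the dictionary means any unknown tier string fails the same way as an unsupported one, which is the conservative choice for an install gate.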

Tier detection runs at first launch, reading physical memory via ProcessInfo.processInfo.physicalMemory (iOS) or ActivityManager.getMemoryInfo() (Android). The detected tier is persisted to settings.
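The classification step itself is small. The sketch below is platform-neutral Python rather than the Swift or Kotlin that would actually call the OS APIs. It assumes that each band is inclusive at its lower bound and exclusive at its upper bound (the matrix does not say which tier owns exactly 6 GB or 8 GB), and that GB means GiB. `classify_tier`, `detect_and_persist`, and the `"device_tier"` key are hypothetical names.

```python
from typing import Callable, MutableMapping

GIB = 1024 ** 3  # assumed: thresholds are interpreted in GiB


def classify_tier(physical_memory_bytes: int) -> str:
    """Map reported physical memory to a tier letter (assumed boundaries)."""
    ram_gib = physical_memory_bytes / GIB
    if ram_gib >= 8:
        return "A"
    if ram_gib >= 6:
        return "B"
    if ram_gib >= 4:
        return "C"
    return "D"


def detect_and_persist(
    settings: MutableMapping[str, str],
    read_physical_memory: Callable[[], int],
) -> str:
    """Detect the tier once and reuse the persisted value afterwards."""
    if "device_tier" not in settings:
        settings["device_tier"] = classify_tier(read_physical_memory())
    return settings["device_tier"]
```

The OS may report somewhat less memory than a device's nominal figure, so the team may want to verify the thresholds against real readings before fixing them.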

On Tier B and below, orchestrator and agent LLM share the model store on disk only; in RAM, the orchestrator is unloaded while a Gemma + LoRA pass executes, then reloaded. Cost: ~600 ms model load on every orchestrator invocation (technical_spec.md §8.1, [TEAM TO VERIFY]).
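The unload / run / reload sequence maps naturally onto a context manager, which guarantees the orchestrator is reloaded even if the agent pass raises. This is a sketch with a stand-in runtime: `ModelRuntime`, `agent_pass`, and the event log are hypothetical and only record what would be loaded, without touching any real inference library.

```python
from contextlib import contextmanager
from typing import Iterator, List, Optional


class ModelRuntime:
    """Stand-in runtime that holds at most one model in RAM (hypothetical)."""

    def __init__(self) -> None:
        self.resident: Optional[str] = None
        self.events: List[str] = []

    def load(self, name: str) -> None:
        self.events.append(f"load:{name}")
        self.resident = name

    def unload(self) -> None:
        if self.resident is not None:
            self.events.append(f"unload:{self.resident}")
            self.resident = None


@contextmanager
def agent_pass(
    runtime: ModelRuntime, orchestrator: str, agent: str
) -> Iterator[ModelRuntime]:
    """Swap the orchestrator out for the agent model, then swap it back."""
    runtime.unload()           # free the orchestrator's RAM first
    runtime.load(agent)
    try:
        yield runtime
    finally:
        runtime.unload()       # drop the agent model
        runtime.load(orchestrator)  # this reload is where the load cost is paid
```

A caller would wrap each Gemma + LoRA pass in `with agent_pass(runtime, "phi-3-mini", "gemma-2b-lora"):` so that the reload happens exactly once per pass.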

Consequences

Positive:

Negative / costs:

Alternatives

End of ADR-0002.