ARCHITECTURE INFOGRAPHIC  ·  2026-04-25  ·  master @ fab6a2c1

somaCURA isn't an AI scribe. It's a clinical reasoning engine that happens to produce notes.

Stack: FastAPI · Python 3.12 async · SSE streaming · SQLite (WAL) · ChromaDB
Default model: claude-opus-4-6 · Anthropic + Google providers
Deployment: single-host · systemd · no Docker
Honest disposition: known gaps in § in flight

0 · LLM calls during fragment routing
~80% · Deterministic accumulation
<5ms · Router latency target
2 calls · LLM hops per finalized note
3 gates · Physician decision points
39 lenses · Problem→metric maps
28 calc · Embedded calculator methods
202 labs · Canonical lab definitions

The Reasoning-Artifact Principle

The clinical note is a reasoning artifact, not a transcription byproduct.

SCRIBE · what we are not

The ambient scribe hands the LLM the entire conversation and asks for a note; the doctor proofreads.

audio → transcript → LLM → prose → physician proofreads

somaCURA · what the architecture enforces

Physicians input structured fragments (text or voice). A deterministic ontology routes ~80% of those fragments to the right problem in <5ms with zero LLM calls. The model is invoked at exactly two stages of finalization: once for narrative problem detection (typed list, not prose), and once per problem for the A&P section. Both stages receive only pre-routed evidence. The hallucination surface is one problem's prose, never the full note. The doctor approves the problem list, edits each A&P inline, and signs.

fragment → router → problems → evidence → scoped LLM → physician approves → sign

§ 01  ·  the canonical flow

What happens when a physician generates a progress note

POST /api/v1/encounters/{id}/progress orchestrates the largest pipeline in the system — generators/progress_generator.py at 6,749 lines, with ProgressNoteGenerator.stream_generate() as the entry point. It is also the clearest embodiment of the reasoning-artifact principle: deterministic accumulation, scoped LLM bursts, physician decision points at three stages. The green path below is the deterministic majority. The amber arrows are the only places an LLM runs.

Progress note generation — request trace stream_generate · generators/progress_generator.py:2096 · SSE
Figure: somaCURA progress note generation flow.

Physician inputs structured clinical fragments (text or transcribed voice). HyperDrive NER parses the fragments into typed entities (labs, vitals, medications, timestamps). The Clinical Update Router maps known entities to existing problems via lab-and-vital ontology in under 5 milliseconds with no LLM. The Problem Detector Service runs an LLM call against narrative text to surface new problems. The physician approves the problem list (first decision point). The Evidence Graph Builder links labs, vitals, and medications to problems deterministically. RAG retrieval pulls relevant guidelines and protocols from ChromaDB. The Note Compiler then runs a scoped LLM call per problem, producing the assessment-and-plan section using only pre-routed evidence. The physician reviews each per-problem A&P inline (second decision point). The compiled note is rendered. The physician signs and finalizes (third and last decision point).

Phase 1 · deterministic accumulation · 0 LLM calls · ~80% of fragments
  • Physician input: structured fragment · text · voice · pasted EHR
  • HyperDrive NER: typed entity extraction · hyperdrive/ner.py · 837 lines
  • ClinicalUpdateRouter: lab/vital → problem in <5ms · clinical_update_router.py:328
  • Lab/Vital Ontology: 39 lenses · 202 canonical labs · problem_lenses.py · lab_mapping.py
  • Patient State: problems · evidence · longitudinal · SQLite WAL · per-encounter

Phase 2 · scoped finalization (on finalize request) · 2 LLM calls · ~2-10s
  • ProblemDetectorService: LLM #1 · narrative → problems · problem_detector_service.py:53
  • Physician approves: decision point #1 · accept · reject · merge
  • EvidenceGraphBuilder: labs/vitals/meds → problems · evidence_graph.py:398
  • RAG retrieval: guidelines + protocols · rag.py · ChromaDB · per-problem context
  • NoteCompiler (per problem): LLM #2 · A&P prose · scoped · note_compiler.py:549 · 925 lines
  • Physician reviews each A&P: decision point #2 · diff review · accept · reject · edit inline
  • Compiled note rendered: A&P history · provenance · SSE stream → note rail
  • Sign & Finalize: decision point #3 · PUT /api/v1/notes/{id}
  • A&P history appended → next progress note inherits longitudinal context

Legend: deterministic · LLM (scoped) · physician decision · knowledge / RAG
Total LLM calls per note: 2  ·  both scoped, neither generates a full note
§ 02  ·  the five pillars

What the philosophy looks like in code

The research position document — beyond-the-ai-scribe.html — defines five architectural pillars. Each one shows up in the codebase as a specific module. None is aspirational. Each cites the upstream research that motivated the design.

01 the structural shape

Problem-Oriented Clinical Course

An enumerated, evolving problem list is the chart's spine. Each problem carries status (active · improving · worsening · stable · resolved), a per-day A&P trail, and longitudinal evidence. Notes inherit prior-day A&P. The structure mirrors clinical reasoning, not conversation flow.

generators/progress_generator.py · services/problem_ap_sync_service.py · hyperdrive/problems.py

Rodman 2023 — POMR built as scientific framework; reasoning made visible. Kahn 2018 — structured templates improve accuracy and efficiency.
03 the knowledge layer

Knowledge-Augmented Compilation

RAG retrieval from versioned guidelines + hospital protocols at compile time. Per-problem evidence graph links data to problems before any prose is generated. The LLM never sees the full note; it sees one problem's pre-scoped evidence and writes one A&P section.

rag.py (ChromaDB vector store) · services/note_compiler.py:549 · knowledge/{guidelines,templates,variable}/

Chen 2025 — knowledge graphs reduce hallucinations 44% (5 medical datasets). He 2025 — Graph-RAG outperforms naive RAG; specialists rate it higher.
04 the human in the loop

Physician as Reasoning Engine

The doctor approves the problem list, feeds clinical observations, directs the assessment. The system never generates unsupervised clinical judgments. Physician review is the workflow itself — three discrete decision points before a note is signed.

static/js/census/note-generation.js · NoteGenerationManager · diff review at generators/progress_generator.py:2598

Hack 2025 — hybrid physician-AI notes: 79% unedited approval vs 23% AI-only (n=20, 10 blinded reviewers).
05 the computation layer

Embedded Clinical Computation

Math is computed deterministically, not hallucinated. 28 named calculator methods (AKI staging, eGFR CKD-EPI 2021, SOFA, APACHE II, MELD-Na, CHA₂DS₂-VASc, HAS-BLED, Wells PE/DVT, CURB-65, CIWA-Ar, NIHSS, GCS, Glasgow-Blatchford, HEART) plus a full ABG interpreter and a 5-component acuity model. The LLM never re-derives a calculation the system can do exactly.

clinical/calculators/clinical_calculator.py 3,455 lines · acid_base_analyzer.py 906 lines · services/priority_scorer.py 1,381 lines · clinical/domain/reference_data.py 1,442 lines / 202 canonical labs.

Verified at master @ fab6a2c1 — line counts measured by wc -l; calculator method count by grep -cE '^ def calculate_'.
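
To make "computed deterministically" concrete, here is a minimal sketch of one of the listed scores, CURB-65, using its standard published criteria. The class name, function name, and signature are assumptions for illustration; they are not taken from clinical_calculator.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Curb65Input:
    confusion: bool      # new-onset confusion
    bun_mg_dl: float     # blood urea nitrogen, mg/dL
    resp_rate: int       # breaths per minute
    sbp: int             # systolic blood pressure, mmHg
    dbp: int             # diastolic blood pressure, mmHg
    age: int             # years


def curb65(p: Curb65Input) -> int:
    """Return the CURB-65 score (0-5): one point per criterion met."""
    criteria = [
        p.confusion,
        p.bun_mg_dl > 19,            # urea > 7 mmol/L, roughly BUN > 19 mg/dL
        p.resp_rate >= 30,
        p.sbp < 90 or p.dbp <= 60,
        p.age >= 65,
    ]
    return sum(criteria)
```

The same inputs always produce the same score, so the prompt can carry the computed number instead of asking the model to derive it.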
§ 03  ·  the route atlas

Every endpoint, clustered by clinical workflow

Routes mapped from api/endpoints/*.py + somaNotes.py at master @ fab6a2c1. Clustered below by clinical workflow — the cluster shape, not the endpoint count, is the load-bearing observation. Every cluster corresponds to something a hospitalist actually does at the bedside.

somaCURA route atlas  ·  13 clinical clusters
GET POST PUT DELETE

Note Generation · 8 endpoints

progress · H&P · discharge · transfer · cheatsheet · 2-liner

POST /generate/progress
POST /generate/handp
POST /generate/discharge
POST /generate/aandp
POST /generate/cheatsheet
POST /generate/2liner
POST /generate/transfer
POST /generate/hpi

Streaming Generation · SSE

generation_stream.py · 89,119 bytes · the active path

POST /api/generate/stream
POST /api/progress-analysis/stream
POST /api/handp-analysis/stream
POST /api/progress-oneshot
POST /api/quick-discharge

Census & Patients · 25+

patients.py · hyperdrive/router.py (5,685 lines)

GET /api/v1/patients
POST /api/v1/patients
POST /api/v1/patients/create-flexible
GET /api/v1/patients/{id}/chart
GET /api/v1/patients/{id}/labs
GET /api/v1/patients/{id}/medications
GET /api/v1/patients/{id}/orders
GET /api/v1/patients/{id}/notes
GET /api/v1/patients/{id}/encounters
GET /api/v1/hyperdrive/...

Clinical Computation · scores

scores.py · acid_base.py · clinical_charts.py

POST /api/calculate/{calculator_name}
GET /api/calculators
POST /acid-base
POST /acid-base/from-text
POST /api/clinical/calculator/contingency
POST /api/clinical/calculator/monitoring
POST /api/clinical/allergies/check

Voice & Audio · 21

Deepgram Nova-3 Medical · iPhone QR pairing · 21 voice routes

POST /api/audio/transcribe
POST /api/voice/edit
POST /api/voice-edit-v2
POST /api/voice/navigate
POST /api/voice/parse
POST /api/voice/pair
POST /api/voice/telemetry
POST /api/voice/admissions

Transfer Center · 135 rules

transfer.py · transfer_voice.py · transfer_twilio.py · 135 routing rules

POST /api/v1/transfer/workspace
GET /api/v1/transfer/workspace/{id}
POST /api/v1/transfer/voice
POST /api/v1/transfer/twilio/...
POST /api/v1/transfer/extract

Fragments & Diffs · workspace

fragments.py · diffs.py · editor.py · the compile pipeline

POST /api/v1/fragments
GET /api/v1/fragments/{encounter_id}
PUT /api/v1/fragments/{id}
DELETE /api/v1/fragments/{id}
POST /api/v1/diffs
POST /api/v1/editor/...

Reasoning Tree · v3

reasoning_tree.py · D3 visualization · admin toggle

POST /api/v1/reasoning-tree/generate
GET /api/v1/reasoning-tree/{id}
GET /api/v1/reasoning-tree/stats

Timeline & Events · trends

event_timeline.py · timeline.py · timeseries.py · uPlot charts

GET /api/v1/timeline/{encounter}
GET /api/v1/event-timeline
GET /api/v1/timeseries/{metric}
GET /api/v3/charts/...

Chat & Decision Support · peer

chat.py · cuti_decision.py · peer-attending

POST /api/v1/chat
POST /api/v1/chat/stream
POST /api/cuti/decision

Diagnostics & AI Dashboard · obs

diagnostics.py · ai_dashboard.py · analytics.py

GET /api/diagnostics/health
GET /api/diagnostics/settings/models
POST /api/diagnostics/settings/models
GET /api/ai-dashboard/...
GET /api/analytics/...

Auth & Admin · 10+

auth.py · users.py · admin_sessions.py

POST /login
POST /signup
POST /logout
GET /me
GET /admin/sessions/dashboard
GET /admin/metrics

Admissions & Workspace · flow

admissions.py (52KB) · workspace.py · IR pipeline

POST /admissions
POST /admissions/voice
POST /api/v1/workspace/...
GET /api/v1/notes/{type}/latest
§ 04  ·  the per-patient data model

What somaCURA knows about a patient

The patient is the unit of state. Everything below hangs off patient_id and encounter_id in a single WAL-mode SQLite database. There are two canonical Problem models: core/models.py:Problem, with field problem_title, for the API surface, and hyperdrive/models.py:Problem, with field name, for HyperDrive storage. hyperdrive/adapters.py converts between them.
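
The two-model arrangement can be pictured with a small adapter sketch. Only the field names problem_title and name come from the description above; the class names, the shared status and ap_history fields, and both function names are hypothetical stand-ins, not the contents of hyperdrive/adapters.py.

```python
from dataclasses import dataclass, field


@dataclass
class ApiProblem:
    """Stand-in for core/models.py:Problem (API surface)."""
    problem_title: str
    status: str = "active"
    ap_history: list[str] = field(default_factory=list)


@dataclass
class StoredProblem:
    """Stand-in for hyperdrive/models.py:Problem (HyperDrive storage)."""
    name: str
    status: str = "active"
    ap_history: list[str] = field(default_factory=list)


def to_api(p: StoredProblem) -> ApiProblem:
    # Copy the list so the two models never share mutable state.
    return ApiProblem(problem_title=p.name, status=p.status, ap_history=list(p.ap_history))


def to_stored(p: ApiProblem) -> StoredProblem:
    return StoredProblem(name=p.problem_title, status=p.status, ap_history=list(p.ap_history))
```

A round trip through both adapters should return an equal object, which is a cheap invariant to test whenever either model gains a field.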

problem_list
core spine of the chart
  • problem_title: canonical
  • status: active · improving · worsening · stable · resolved
  • ap_history: per-day A&P · LIST
  • first_observed_day: hospital day
  • acuity: 5-component score
  • linked_evidence: graph IDs
evidence_graph
labs/vitals/meds → problems
  • EvidenceNode: prov_id stable
  • SourceType: LAB · VITAL · MED · MICRO · IMG
  • LinkQuality: ontology · keyword · LLM · physician
  • VITAL_PROBLEM_MAP: 39-lens
  • LAB_PROBLEM_MAP: 202-lab
  • build_graph(): graph builder
labs · canonical_labs
202 labs · 16 panel groups
  • CBC · BMP · CHEM · RENAL: core panels
  • LFT · COAGS · ABG · LIPIDS: extended
  • CARDIAC · ENDOCRINE: specialty
  • INFLAMMATORY · IRON: workup
  • UA · PANCREATIC · TOX · MICRO: aux
  • reference_range: low · high · critical
vitals
trend-aware
  • hr · sbp · dbp · rr · temp: core 5
  • spo2 · fio2 · weight: extended
  • urine_output: renal lens
  • trend_direction: up · down · stable
  • reference_range: per-vital
medications
indication-linked
  • name · dose · route: structured
  • indication: → problem
  • started_day: timeline
  • renal_adjustment: CrCl-aware
  • allergy_check: pre-order
cultures
organism + sensitivity
  • specimen_type: blood · urine · sputum · CSF
  • organism: identified
  • sensitivities: S · I · R
  • status: no growth · pending · positive
scores
28 calculators · auto-computed
  • aki_stage: KDIGO
  • egfr_ckd_epi_2021: renal
  • sofa · apache2 · curb65: acuity
  • meld_na · child_pugh: hepatic
  • cha2ds2_vasc · has_bled: afib
  • wells_pe · wells_dvt: VTE
  • heart · gblatchford: cardiac · GI
  • acid_base_full: 15-fn analyzer
notes
version history · diff review
  • progress · handp · discharge: core types
  • transfer · cheatsheet: ops
  • 2liner · hpi: brief
  • change_type: first · update · ehr_reconciliation
  • prior_versions: JSONL
§ 05  ·  how intelligence stays scoped

The LLM is invoked at two stages. Both are bounded. Neither generates a full note.

An ambient scribe feeds the entire conversation to an LLM and asks for a note. The hallucination surface is the entire output. The doctor becomes a proofreader. The privacy model is "microphone in the exam room." Determinism: zero.

somaCURA's pipeline never lets the LLM see the full note. The first call — ProblemDetectorService.analyze_clinical_update_stream at services/problem_detector_service.py:181 — receives narrative text and returns a typed list of problem candidates. The physician approves them. The second call — per-problem prose generation in services/note_compiler.py:_build_problem_prompt at line 279 — receives one problem's pre-routed evidence (the labs the ontology mapped, the vitals from its lens, the relevant prior A&P, the RAG-retrieved guidelines) and writes one A&P section. Each problem is its own attack surface, and each surface is small.

Provider routing in services/llm_service.py: default claude-opus-4-6; reasoning tree, voice tier-2, IR extraction all on claude-sonnet-4-6; Gemini supported through the same service for chat and BHC synthesis. Haiku is banned by user direction (2026-04-24).

A · DET · fragment in → router maps to problem in <5ms (no LLM)
B · DET · HyperDrive NER extracts typed entities deterministically
C · LLM 1 · narrative → problem candidates (typed list, not prose)
D · DOC · physician accepts/rejects/merges candidates
E · DET · evidence graph + RAG per problem
F · LLM 2 · per-problem A&P prose · scoped context only
G · DOC · physician reviews each A&P · accept · edit · reject
H · DOC · sign & finalize
§ 06  ·  the diff review workflow

How the physician stays in control

Every progress note has three discrete physician decision points before a signature. Each one is a rendered diff against the previous state. The physician is never asked to proofread a wall of generated text — they are asked to accept, reject, or edit a small targeted change.

decision point #1 · problem detection

Approve the problem list

The LLM returns a typed list of problem candidates from narrative input. The physician sees each candidate, its evidence trail, and an inline action: accept · reject · merge into existing. No prose has been generated yet. The doctor is shaping the structural skeleton of the note before any sentence is written.

Implementation: analyze_clinical_update_stream SSE → frontend cards → NoteGenerationManager._handleProblemDetectionResult.

LLM call #1 · Latency ~2-4s · Output typed JSON
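
A rough sketch of how a typed candidate list might be validated and then filtered by physician decisions. The JSON shape (title, evidence), the decision strings, and both function names are assumptions made for illustration; the actual SSE payload and frontend handlers are not shown in this document.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemCandidate:
    title: str
    evidence: tuple[str, ...]


def parse_candidates(raw: str) -> list[ProblemCandidate]:
    """Parse model output; reject anything that is not a typed list of candidates."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of problem candidates")
    out = []
    for item in data:
        if not isinstance(item, dict) or not isinstance(item.get("title"), str):
            raise ValueError(f"malformed candidate: {item!r}")
        evidence = item.get("evidence", [])
        if not isinstance(evidence, list) or not all(isinstance(e, str) for e in evidence):
            raise ValueError(f"malformed evidence for {item['title']!r}")
        out.append(ProblemCandidate(item["title"].strip(), tuple(evidence)))
    return out


def apply_decisions(existing: list[str], candidates: list[ProblemCandidate],
                    decisions: dict[str, str]) -> list[str]:
    """Return the problem list after the physician's accept / reject / merge:<target> choices."""
    problems = list(existing)
    for c in candidates:
        decision = decisions.get(c.title, "reject")   # no decision means not accepted
        if decision == "accept":
            if c.title not in problems:
                problems.append(c.title)
        elif decision.startswith("merge:"):
            target = decision.removeprefix("merge:")
            if target not in problems:
                raise ValueError(f"cannot merge into unknown problem {target!r}")
        elif decision != "reject":
            raise ValueError(f"unknown decision {decision!r}")
    return problems
```

Defaulting an undecided candidate to "reject" mirrors the principle that nothing enters the problem list without an explicit physician action.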
decision point #2 · per-problem A&P

Approve each assessment & plan

For each approved problem, the compiler produces an A&P diff (proposed text vs prior-day text). The physician reviews each problem's diff individually — accept the new prose, edit inline, or reject and keep yesterday's. The hallucination surface is one problem's prose, not the whole note.

Implementation: _stream_diff_proposals at generators/progress_generator.py:2598 → SSE proposed_change events → frontend per-problem diff cards.

LLM calls N (one per problem) · Latency ~1-2s/problem · Output diff against prior
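
One way to picture a per-problem diff event is a unified diff wrapped in a Server-Sent Events frame. This is a sketch, not the body of _stream_diff_proposals: the helper names and the payload keys (problem, diff) are hypothetical, and only the event name proposed_change comes from the description above.

```python
import difflib
import json


def ap_diff(prior: str, proposed: str) -> str:
    """Unified diff of the proposed A&P text against the prior-day text."""
    lines = difflib.unified_diff(
        prior.splitlines(), proposed.splitlines(),
        fromfile="prior_day", tofile="proposed", lineterm="",
    )
    return "\n".join(lines)


def sse_event(event: str, payload: dict) -> str:
    """Format one SSE frame: event name, a single JSON data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


prior_ap = "AKI, stable.\nContinue IV fluids."
proposed_ap = "AKI, improving.\nContinue IV fluids."
frame = sse_event("proposed_change", {"problem": "AKI", "diff": ap_diff(prior_ap, proposed_ap)})
```

Because json.dumps escapes newlines, the multi-line diff still travels as one data line, which keeps the frame valid under the SSE format.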
decision point #3 · sign & finalize

Lock the note & append A&P history

Once every section has been approved, the physician signs. The client sends PUT /api/v1/notes/{id} with change_type: "first" | "update" | "ehr_reconciliation". A pre-submit identity guard in EhrFinalModal.handleSubmit verifies this.patientId === CensusState.get('selectedPatientId') and bails on mismatch, so the patient-isolation contract is enforced at the last possible second. On success, the note is locked, version history is preserved (JSONL), and ap_history is appended per problem so the next progress note inherits longitudinal context.

LLM calls 0 · Latency network only · Output locked note + appended history
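
The finalize step can be sketched as three deterministic operations: check identity, append a JSONL version record, append to ap_history. The guard described above lives in the browser (EhrFinalModal.handleSubmit); the version below is a hypothetical Python analogue, and the function name, note dictionary keys, and file layout are all assumptions.

```python
import json
import tempfile
from pathlib import Path

ALLOWED_CHANGE_TYPES = {"first", "update", "ehr_reconciliation"}


def finalize_note(note: dict, selected_patient_id: str, change_type: str,
                  ap_history: dict[str, list[str]], versions_file: Path) -> dict:
    # Identity guard: refuse to sign if the note belongs to a different patient.
    if note["patient_id"] != selected_patient_id:
        raise PermissionError("patient mismatch: refusing to finalize")
    if change_type not in ALLOWED_CHANGE_TYPES:
        raise ValueError(f"unknown change_type: {change_type!r}")
    # Preserve the version as one JSON object per line (JSONL).
    with versions_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"change_type": change_type, "note": note}) + "\n")
    # Append each approved A&P so the next progress note can inherit it.
    for problem, text in note["ap_sections"].items():
        ap_history.setdefault(problem, []).append(text)
    return {**note, "locked": True}


# Usage: sign a day-2 note and inspect what was persisted.
with tempfile.TemporaryDirectory() as tmp:
    history = {"AKI": ["Day 1: hold nephrotoxins."]}
    draft = {"patient_id": "p1", "ap_sections": {"AKI": "Day 2: creatinine improving."}}
    signed = finalize_note(draft, "p1", "update", history, Path(tmp) / "versions.jsonl")
    saved_versions = (Path(tmp) / "versions.jsonl").read_text(encoding="utf-8").splitlines()
```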
§ 07  ·  behind the routes — services

The libraries every note generation step calls

Routes are thin. Most clinical reasoning lives in services/ and generators/. Each entry below is a real file at the line number cited, verified at master @ fab6a2c1.

generators/progress_generator.py

6,749 lines · class ProgressNoteGenerator at line 44

The orchestration heart. Inherits ClinicalContextMixin and BaseNoteGenerator. stream_generate() at line 2096 is the SSE entry point. _stream_diff_proposals() at line 2598 produces per-problem diff events. Holds the longitudinal logic that no rewrite has yet improved on.

Entry stream_generate · SSE proposed_change · Inherits ClinicalContextMixin

services/problem_detector_service.py

852 lines · class ProblemDetectorService at line 28

Two-phase problem detection. analyze_clinical_update_stream() at line 181 streams the LLM's typed problem list as SSE. _suggest_consolidations() at line 599 deterministically merges duplicate-looking problems before the physician sees them — Phase 2 is not LLM, it's set logic over canonicalized names.

LLM Phase 1 only · Phase 2 deterministic · Output typed candidates
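
"Set logic over canonicalized names" could look like the sketch below. The canonicalization rules (lowercase, strip punctuation, collapse whitespace) and the exact-match grouping are assumptions; the real _suggest_consolidations may use different rules.

```python
import re
from collections import defaultdict


def canonicalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", name.lower())
    return " ".join(cleaned.split())


def suggest_consolidations(names: list[str]) -> list[list[str]]:
    """Group problem names whose canonical forms are identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for n in names:
        groups[canonicalize(n)].append(n)
    # Only groups with more than one member are merge suggestions.
    return [g for g in groups.values() if len(g) > 1]
```

The output is a suggestion list only; in the workflow described above the physician still decides whether to merge.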

services/clinical_update_router.py

839 lines · class ClinicalUpdateRouter at line 328

The deterministic-only routing core. Header comment is the doctrine: "Do NOT import any LLM services here. This file is deterministic-only." Pulls LAB_PROBLEM_MAP + VITAL_PROBLEM_MAP from services/lab_problem_ontology.py and PROBLEM_LAB_MAPPING + LAB_ALIASES from hyperdrive/lab_mapping.py. Target latency <5ms.

LLM none · Latency <5ms · Coverage ~80% of fragments
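
A minimal sketch of deterministic routing as pure dictionary lookups, which is why a no-I/O path can plausibly stay in the low-millisecond range. The map names mirror the ones cited above, but their contents here are invented examples, and route_labs is a hypothetical function, not the ClinicalUpdateRouter API.

```python
# Hypothetical excerpts; the real maps live in the files named above.
LAB_ALIASES = {"k": "potassium", "cr": "creatinine", "creat": "creatinine", "na": "sodium"}
LAB_PROBLEM_MAP = {
    "potassium": ["Hyperkalemia", "Hypokalemia"],
    "creatinine": ["Acute kidney injury", "Chronic kidney disease"],
    "sodium": ["Hyponatremia"],
}


def route_labs(entities: list[tuple[str, float]],
               active_problems: list[str]) -> dict[str, list[str]]:
    """Attach each recognised lab to the active problems it informs.

    Labs with no ontology entry are left unrouted; no LLM is consulted here.
    """
    routed: dict[str, list[str]] = {}
    active = set(active_problems)
    for name, value in entities:
        canonical = LAB_ALIASES.get(name.lower(), name.lower())
        for problem in LAB_PROBLEM_MAP.get(canonical, []):
            if problem in active:
                routed.setdefault(problem, []).append(f"{canonical}={value}")
    return routed
```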

services/evidence_graph.py

724 lines · class EvidenceGraphBuilder at line 398

Builds per-problem evidence graphs. EvidenceNode with SourceType (LAB · VITAL · MED · MICRO · IMG · FRAGMENT) and LinkQuality (ontology · keyword · LLM · physician). stable_prov_id() generates deterministic provenance IDs so the same lab→problem link gets the same ID across reruns — the chart's audit trail is reproducible.

Stable IDs · Sources 6 types · Quality 4 levels
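
Reproducible provenance IDs are usually obtained by hashing the content of the link rather than generating a random identifier. The sketch below illustrates that idea; the argument list, normalization, hash choice, and ID format are assumptions, not the actual signature of stable_prov_id().

```python
import hashlib


def stable_prov_id(source_type: str, source_key: str, problem: str) -> str:
    """Derive a provenance ID purely from the link's content, so reruns agree."""
    canonical = "|".join(part.strip().lower() for part in (source_type, source_key, problem))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"prov_{digest[:16]}"
```

Because nothing in the ID depends on time, process, or insertion order, rebuilding the graph from the same inputs yields the same audit trail.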

services/note_compiler.py

925 lines

compile_progress_note() at line 549 is the deterministic assembly path. compile_progress_note_with_prose() at line 682 is the LLM-augmented path; both share the same PatientStateSnapshot input and NoteDraft output. _build_problem_prompt() at line 279 enforces per-problem scoping — the LLM sees one problem's evidence, never the chart.

Snapshot hashed · Prose per-problem only · Coverage reported
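
Per-problem scoping can be sketched as a prompt builder whose only input is one problem's evidence bundle, so data routed to other problems cannot leak in. The ProblemEvidence fields follow the evidence types listed in this document; the class itself, the prompt wording, and the function names are hypothetical and do not reproduce _build_problem_prompt().

```python
from dataclasses import dataclass, field


@dataclass
class ProblemEvidence:
    problem: str
    labs: list[str] = field(default_factory=list)
    vitals: list[str] = field(default_factory=list)
    prior_ap: str = ""
    guidelines: list[str] = field(default_factory=list)


def build_problem_prompt(ev: ProblemEvidence) -> str:
    """Build a prompt from one problem's evidence only; the chart is never passed in."""
    sections = [
        f"Problem: {ev.problem}",
        "Labs: " + ("; ".join(ev.labs) or "none"),
        "Vitals: " + ("; ".join(ev.vitals) or "none"),
        "Prior A&P: " + (ev.prior_ap or "none"),
        "Guidelines: " + ("; ".join(ev.guidelines) or "none"),
        "Write the assessment and plan for this problem using only the evidence above.",
    ]
    return "\n".join(sections)


def build_all_prompts(chart: list[ProblemEvidence]) -> dict[str, str]:
    # One independent prompt per problem; each is a separate LLM call.
    return {ev.problem: build_problem_prompt(ev) for ev in chart}
```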

services/llm_service.py

862 lines

Provider router + retry + circuit breaker. Singleton client pattern: get_anthropic_client() returns a 240s-timeout shared client. schedule_background() wraps fire-and-forget tasks with strong-ref + auto-discard + shutdown-awaited. Generators use the Anthropic client directly; Gemini routes through llm_service only (chat, BHC synthesis).

Default claude-opus-4-6 · Sonnet claude-sonnet-4-6 · Haiku banned

clinical/calculators/clinical_calculator.py

3,455 lines · 28 calculate_* methods

Embedded clinical math. calculate_aki_stage · calculate_egfr_ckd_epi_2021 · calculate_sofa_score · calculate_apache2 · calculate_meld_na · calculate_cha2ds2_vasc · calculate_has_bled · calculate_wells_pe · calculate_wells_dvt · calculate_curb65 · calculate_ciwa_ar · calculate_nihss · calculate_gcs · calculate_glasgow_blatchford · calculate_heart_score · calculate_anion_gap · calculate_delta_gap · calculate_corrected_calcium · calculate_osmolar_gap · calculate_fena · trend analytics · severity scoring.

Pure deterministic · Trend-aware · Time-aware

acid_base_analyzer.py

906 lines · 15 internal methods

Full ABG interpretation. analyze() resolves primary disorder, anion gap (with albumin correction), delta-delta, compensation rules, osmolal gap (when AG warrants), urine anion gap (in metabolic acidosis), corrected chloride. Generates differentials and recommendations from disorders, never from the LLM.

VBG correction · Compensation · Differentials rule-based

rag.py + knowledge/

ChromaDB vector store · 70 knowledge files

Three knowledge corpora. knowledge/guidelines/ — clinical documentation standards. knowledge/templates/ — note structure templates. knowledge/variable/ — hospital-specific protocols (averaProtocols, averaUnitSpecificTx, conditions, medications, references, research). Retrieval is per-problem at compile time, never per-token during generation.

Indexer ChromaDB · Layered 3 corpora · Per-problem

hyperdrive/ — clinical NER + ontology

16,400 lines · router 5,685 · ontology 1,760 · ner 837

The clinical data extraction subsystem. hyperdrive/ner.py parses lab values, vital signs, medications, timestamps. hyperdrive/ontology.py normalizes lab names, maps lab→problem, vital→problem. hyperdrive/consolidation.py deduplicates labs across systems. hyperdrive/router.py exposes 25+ census/patient endpoints. A file-backed cache stores parsed results across requests for sub-millisecond reads.

NER deterministic · Ontology 1,760 lines · Cache file-backed

clinical/domain/problem_lenses.py

649 lines · 39 lens entries

Maps problem names to clinically relevant metrics. Each lens defines hero (primary metric, lab or vital), labs[] (relevant lab keys, all canonical), vitals[] (relevant vital keys). Substring match against canonicalized problem name. Fallback "*" entry returns generic labs/vitals when no pattern matches. Mirrored in static/js/census/config/problem-lenses.js with parity test at tests/test_problem_lenses_parity.py.

Server lens Python · Client lens JS mirror · Parity tested
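
The lookup described above (substring match on a canonicalized name, with a "*" fallback) can be sketched in a few lines. The lens entries below are invented examples and the first-match-wins ordering is an assumption; only the hero / labs / vitals shape and the "*" fallback come from the description.

```python
# Hypothetical lens entries; the real file defines 39.
PROBLEM_LENSES = {
    "kidney": {"hero": "creatinine", "labs": ["creatinine", "bun", "potassium"], "vitals": ["urine_output"]},
    "sepsis": {"hero": "lactate", "labs": ["lactate", "wbc"], "vitals": ["temp", "hr", "sbp"]},
    "*": {"hero": None, "labs": ["sodium", "potassium", "creatinine"], "vitals": ["hr", "sbp"]},
}


def lens_for(problem_name: str) -> dict:
    """Return the first lens whose pattern is a substring of the canonical name."""
    canonical = " ".join(problem_name.lower().split())
    for pattern, lens in PROBLEM_LENSES.items():
        if pattern != "*" and pattern in canonical:
            return lens
    return PROBLEM_LENSES["*"]   # generic fallback when nothing matches
```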

services/priority_scorer.py

1,381 lines · 5-component acuity

Scores patient acuity for census sorting. Five components — clinical severity (vitals · scores · trajectory), evidence pressure (labs out of range), problem activity (new · worsening), workflow urgency (orders pending · consults), longitudinal drift (trend changes). Output is per-patient acuity used by the census view to surface the sickest patients first.

Components 5 · Used by census sort · Deterministic
§ 08  ·  head-to-head

What changes in the architecture when you choose reasoning over transcription

The comparison from the research position document, with code references for every row. Each somaCURA cell points to the file that implements the contrasted behavior, verified at master @ fab6a2c1.

§ the decision rule

Knowledge belongs on the doctor's screen, not in LLM-generated prose. Every sentence in a finalized note must trace to an input — a lab, a vital, a medication, a clinical observation, a prior A&P. If a sentence cannot be traced, it shouldn't be there. When in doubt, generate less.

Audit yourself: does this change make the note more transcribed, or more reasoned? Only the second is legitimate. Adding more knowledge sources to the LLM prompt is the wrong instinct — more input means more material to hallucinate with. The right answer is almost always to surface the knowledge on the doctor's screen, not in the LLM output.

§ in flight · known gaps

This artifact was cut against master @ fab6a2c1 on 2026-04-25. Several load-bearing surfaces are mid-build or have known calibration gaps. A reader making architectural decisions against this doc should know:

  • CCE Phase 1 — fuzzy matcher can't bridge HyperDrive→note headers. Cross-Course Evolution shipped behind ENABLE_CCE=true in production with sub-millisecond classification latency, but every problem currently classifies as NEW because fuzzy matching cannot bridge HyperDrive canonical names to note A&P problem headers. Fix is documented: read ap_history[-1] directly instead of fuzzy-matching against HyperDrive labels. Not yet shipped.
  • Phase B (organizational thinking surface) is not built. Fragment capture (Phase A) shipped at commits f3ccbe42 + e7b324e8 — fragments accumulate, evidence graph builds, LLM compiles per-problem prose. The middle phase — where the doctor organizes fragments on a Problem Board before compiling — is the next milestone and has zero code yet. docs/plans/WORKING-CANVAS-BLUEPRINT.md.
  • Problem Board is not the default census view. 8,900+ lines of spatial workspace exist (problem-space.js, problem-card.js, kanban, spatial persistence) but it's not surfaced as the primary view a hospitalist sees on login. The list-style census remains the default. Migration is queued behind Phase B.
  • Formulary infrastructure (27K+ lines) is hidden during composition. Guidelines, calculators, hospital protocols all exist as services and as visible admin surfaces — but they are not rendered on the doctor's screen during note composition. The fix is product, not engineering — the data is already routable per-problem; the UI to surface it during composition has not been built.
  • Concurrency audit Tier S #5 is sudo-blocked. Patches at ops/somanotes.service.proposed-2026-04-23 would harden the systemd unit but require user-mediated sudo to apply. 18 commits across two sessions shipped Tier S #1-#4, Tier A, Tier B #6/7/10/13, plus the shutdown hook. Observability catch-all (heartbeat queue depth, pool checkout wait, breaker trips, SSE active-task count) still pending.
  • 2D problem-space scorer rewrite uncommitted. 9-agent Opus 4.7 sprint produced the code; baseline 58% dead-center → target <20%. Six bisectable commits pre-drafted, Python + Playwright tests pending. Resume doc: docs/plans/2026-04-22-001-problem-space-fix/SESSION-HANDOFF.md.
  • Auto-loading overnight data not wired. Census refresh works on demand but does not automatically pull overnight events (new labs · new vitals · new orders) into the doctor's morning view. Workflow gap, not data gap.

§ limits & counted-with

  • "~80% deterministic" is the design target stated in the file header of services/clinical_update_router.py — not yet measured in production telemetry. The router itself is deterministic-only; the proportion of fragments handled without an LLM in real practice is the open metric.
  • "<5ms router latency" is also a design target, not a measurement. The deterministic path has no I/O — match against in-memory ontology dicts — so the order of magnitude is right; the precise distribution is unmeasured.
  • Problem-detection F1=0.90 is upstream literature (Zhang 2025 · n=5,118 utterances · Omaha System framework), cited as the architectural target somaCURA's two-phase detection is designed against, not as a measurement of this codebase.
  • Hybrid-note approval rate 79% is upstream literature (Hack 2025 · n=20 · 10 blinded reviewers · 8-domain scoring). It motivates the physician-in-the-loop architecture, not internal data.
  • Calculator count: 28 named calculate_* methods in clinical/calculators/clinical_calculator.py; the broader "30+" framing in the research doc includes the acid_base_analyzer.py internal methods, the 5-component priority_scorer.py, and per-generator scoring helpers.
  • Problem lens count: 39, verified by the file's own header comment in clinical/domain/problem_lenses.py ("39 entries, mirrors problem-lenses.js exactly").
  • Canonical lab definitions: 202, counted via grep -cE '^\s+"[A-Z][A-Za-z0-9_]*":\s*_lab\(' clinical/domain/reference_data.py. LAB_REFERENCE_RANGES is a tighter subset; the research doc's "150+" is conservative.
  • Default model is claude-opus-4-6 (verified in core/settings.py:224); reasoning-tree, voice tier-2, and IR all ride claude-sonnet-4-6; Haiku is banned by user direction (2026-04-24).
  • Single-host deployment. No Docker, no Kubernetes, no orchestrator. One systemd unit, one process, one WAL-mode SQLite database, one ChromaDB on disk. Operational simplicity is part of the architectural argument — fewer moving parts means fewer places for the reasoning to drift.
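
For readers unfamiliar with the single-file setup, here is a minimal sketch of opening a SQLite database in WAL mode with Python's standard sqlite3 module. The helper name, table, and columns are illustrative assumptions; only the use of WAL mode and the patient_id / encounter_id keys come from this document.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_db(path: Path) -> sqlite3.Connection:
    """Open (or create) a database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"could not enable WAL, journal_mode is {mode!r}")
    return conn


# Usage: WAL needs a real file, so the demo writes to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_wal_db(Path(tmp) / "demo.db")
    conn.execute("CREATE TABLE problem (patient_id TEXT, encounter_id TEXT, problem_title TEXT)")
    conn.execute("INSERT INTO problem VALUES (?, ?, ?)", ("p1", "e1", "Acute kidney injury"))
    conn.commit()
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    rows = conn.execute(
        "SELECT problem_title FROM problem WHERE patient_id = ?", ("p1",)
    ).fetchall()
    conn.close()
```

WAL lets readers proceed while a single writer appends, which suits a one-process deployment where SSE streams read patient state during writes.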
Scope
All @router.* + @app.* decorators in api/endpoints/*.py and somaNotes.py. Service files mapped from services/ · generators/ · clinical/ · hyperdrive/. Pillars cross-referenced against https://somacura.icu/static/infographics/beyond-the-ai-scribe.html.
Method
Codebase-grounded build. Every file path, line number, function name, and code snippet was verified by reading the file at master @ fab6a2c1 before being placed in the document. Where research doc claims (35+ calculators, 150+ ranges) differed from measured counts, the measured count is shown with the qualifier.
As-of · source
2026-04-25 · master @ fab6a2c1 · 761 indexed Python files · 246 JS files · 70 knowledge files · counts verified against git HEAD and live grep at build time.
Generated
By Claude Opus 4.7 following the infographic-gen skill, sibling to the feb8 architecture infographic. Single-file HTML · zero-build · zero dependencies · print-safe · reduced-motion aware. Drop on any web server.
Audit
compression: passed (v2 audit, 2026-04-25) · 14 of 15 redline items applied; deferred items logged in COMPRESSION DEFERRED block at end of source.
somaCURA active routes by clinical domain
Cluster | Endpoint count | Primary file | Role
Note Generation | 8 | generators/*.py | One endpoint per note type · all POST
Streaming Generation | 5 | api/endpoints/generation_stream.py | SSE · the active path
Census & Patients | 25+ | api/endpoints/patients.py + hyperdrive/router.py | Patient CRUD · chart · labs · meds · orders
Clinical Computation | scores · acid-base · contingency | scores.py · acid_base.py · clinical_charts.py | 28 calculators wrapped as endpoints
Voice & Audio | 21 | voice_*.py · audio_transcription.py | Deepgram · iPhone QR · companion pairing
Transfer Center | workspace + voice + twilio | transfer*.py | 135-rule routing advisor
Fragments & Diffs | workspace | fragments.py · diffs.py · editor.py | Fragment CRUD · evidence graph · compile
Reasoning Tree | 3 | reasoning_tree.py | D3 visualization · v3 generator
Timeline & Events | 4 | event_timeline.py · timeline.py · timeseries.py | uPlot trends · acuity heatmap
Chat & Decision Support | 3 | chat.py · cuti_decision.py | Peer-attending chat · CUTI
Diagnostics & AI Dashboard | 15+ | diagnostics.py · ai_dashboard.py · analytics.py | Health · model config · usage
Auth & Admin | 10+ | auth.py · users.py · admin_sessions.py | Login · session admin · metrics
Admissions & Workspace | flow | admissions.py · workspace.py | IR pipeline · note storage
Total | 239 active route registrations | 42 endpoint files | Counted with grep at master @ fab6a2c1
somaCURA five pillars and their code references
Pillar | Name | Implementation | Research grounding
01 | Problem-Oriented Clinical Course | generators/progress_generator.py · services/problem_ap_sync_service.py · hyperdrive/problems.py | Rodman 2023 (POMR as scientific framework) · Kahn 2018 (structured templates)
02 | Deterministic Evidence Routing | services/clinical_update_router.py:328 · clinical/domain/problem_lenses.py (39 lenses) · hyperdrive/lab_mapping.py | Zhang 2025 (F1=0.90, n=5,118 utterances)
03 | Knowledge-Augmented Compilation | rag.py (ChromaDB) · services/note_compiler.py:549 · knowledge/{guidelines,templates,variable}/ | Chen 2025 (44% hallucination reduction) · He 2025 (Graph-RAG advantage)
04 | Physician as Reasoning Engine | static/js/census/note-generation.js (NoteGenerationManager) · diff review at progress_generator.py:2598 | Hack 2025 (79% hybrid vs 23% AI-only approval)
05 | Embedded Clinical Computation | clinical/calculators/clinical_calculator.py (3,455 lines) · acid_base_analyzer.py (906 lines) · services/priority_scorer.py (1,381 lines) · clinical/domain/reference_data.py (1,442 lines) | Verified file/line counts at master @ fab6a2c1
Architecture-level comparison with code references
Dimension | Ambient AI Scribe | somaCURA | Implementation
Input source | Conversation audio | Structured fragments | services/clinical_update_router.py
When LLM runs | Continuously | At finalization, scoped per-problem | note_compiler.py:_build_problem_prompt:279
Problem awareness | None | Full problem list with status evolution | core/models.py:Problem + ap_history
Clinical context | Current conversation | Longitudinal · prior A&P · trajectories | generators/progress_generator.py
Evidence provenance | None | Per-problem evidence graph · stable IDs | services/evidence_graph.py:108 (stable_prov_id)
Knowledge base | LLM training data | Versioned guidelines + protocols (RAG) | rag.py · knowledge/{guidelines,variable}/
Clinical calculations | None (LLM may guess) | 28 calculators + acid-base + 5-component acuity | clinical/calculators/clinical_calculator.py
Doctor's role | Proofreader | Reasoning director · 3 decision points | NoteGenerationManager + EhrFinalModal
Hallucination surface | Full note | Per-problem prose only | note_compiler.py:_format_evidence_by_source
Privacy model | Ambient microphone | Intentional input · zero default audio | services/audio_transcription_service.py (opt-in)
Determinism | 0% | ~80% deterministic routing | services/clinical_update_router.py (target)