Server Architecture Notes¶
Managed Extraction Context by Recording Granularity¶
The server’s extraction accuracy strategy is to keep context narrow by processing recordings at section-level granularity.
How context is established:
The user records under a selected section label in the client, and/or
The user states section context in audio.
Each uploaded recording is tied to its section context and processed in that scope, rather than feeding one large mixed transcript through extraction.
Tradeoff:
Backend load and API traffic are higher (typically one upload call per recording, plus manifest/submit processing calls).
In return, extraction precision is significantly higher and user time-on-task is lower due to reduced post-processing corrections.
Operational impact:
More artifacts to track (recording_id, per-section transcript cache).
Better deterministic merges into section-specific form structures.
Cleaner review loops when users iterate with additional targeted recordings.
Round Reconciliation Boundary¶
Phase 1 timeout/retry hardening keeps the current synchronous workflow model, but makes round reconciliation more explicit.
The design rule is:
the round is the reconciliation unit
That means:
upload identity remains stable-ID based (recording_id, image_id)
accepted artifact state is read from DB-backed round metadata
client_revision_id is persisted as round state rather than surviving only inside derived review/final payloads
clients may use the round read surface to reconcile ambiguous timeout or interruption outcomes
Current client-facing round recovery route:
GET /v1/jobs/{job_id}/rounds/{round_id}
This route is intended to answer:
what round state is authoritative now
which artifact IDs the server has accepted for that round
which client/server revision identifiers are current
whether the round is accepted, processing, completed, or failed
The important boundary is that this does not introduce a server-side queue or a generic async operation resource. It clarifies reconciliation around the existing round lifecycle.
Standalone Tree Identification¶
Tree identification is intentionally outside the job/round/review/final workflow.
The boundary is:
route layer in
app/api/tree_identification_routes.pyservice layer in
app/services/tree_identification_service.py
Responsibilities:
the route accepts authenticated multipart image uploads and translates service exceptions to HTTP
the service validates input, calls the upstream Pl@ntNet API, and normalizes the response to the server contract
This keeps tree identification as a standalone utility capability rather than a hidden side effect of review or finalization flows.