Database Workflow¶
Purpose¶
This note describes how imported legacy data should be queried, edited, pruned, and reconsolidated into reporting outputs as the server moves from filesystem metadata to PostgreSQL.
Read model¶
Use PostgreSQL as the system of record for metadata and workflow state.
Read paths should converge on:
customersandbilling_profilesfor reusable operational identity dataoperatorsfor reusable assessor identityjobsfor current job identity and statusjob_roundsfor working review historyround_recordingsandround_imagesfor uploaded media metadatajob_finalsfor archived final and correction snapshotsartifactsfor file referencesjob_eventsfor audit/history
Write model¶
Do not edit archived output artifacts directly.
Editing rules:
active work updates database rows
uploaded media files remain immutable artifacts on disk
archived final rows are immutable
correction rows may be overwritten as the current correction copy
generated PDFs, DOCX, and GeoJSON are replaced by regenerating them from the current final/correction payload
In practice:
metadata is edited in PostgreSQL
artifacts are regenerated or replaced at known paths
the database stores the authoritative payload and references the artifact files
Pruning model¶
Pruning should happen only after a job is archived and the final/correction snapshots are intact.
Keep permanently:
jobsjob_finalsartifactsfor retained outputsjob_eventsany assignment/auth history needed for audit
Prune candidates after archive:
job_roundsnot referenced by the retained final/correction snapshotsround_recordingsandround_imagestied only to pruned roundstransient manifest/review artifacts tied only to pruned rounds
The current schema keeps working rounds so pruning policy can be tested before any deletion logic is automated.
Reporting model¶
The reporting structure should be driven from job_finals.payload.
Why:
archived final/correction snapshots are the legal/reporting record
current tests already show that
job_record.jsonand archived final lineage can disagreefinal/correction payloads contain the consolidated form/transcript/report data
That means:
reporting exports should read from
job_finalscustomer/billing/admin lookup should read from normalized operational tables
round tables support provenance, troubleshooting, and edit history
they are not the long-term reporting source
Read-only query tool¶
Use:
tools/query_imported_jobs.py
Current query presets:
summaryarchived-finalsround-mismatchesmedia-by-jobpruning-candidatesreport-projectionnormalized-entities
These queries exist to test the schema against real imported jobs before live runtime code is moved to PostgreSQL.