Legacy Import¶
Purpose¶
The legacy importer is the first non-runtime validation step for the PostgreSQL migration. It allows the schema to be exercised against real job data without changing the live server workflow.
Tool¶
tools/import_legacy_jobs.py
Current import scope¶
The importer reads the legacy job layout and loads it into PostgreSQL:
job_record.json
rounds/*/manifest.json
rounds/*/review.json
recording *.meta.json
image *.meta.json
final.json
final_correction.json
artifact paths for uploaded media and generated outputs
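As a rough illustration of that scope, the sketch below enumerates the files for a single legacy job directory. It is not the code in tools/import_legacy_jobs.py. The function name scan_legacy_job is hypothetical, and it assumes that job_record.json, final.json, and final_correction.json sit at the top of the job directory and that the recording and image *.meta.json files live somewhere beneath it.

```python
from pathlib import Path


def scan_legacy_job(job_dir: Path) -> dict[str, list[Path]]:
    """Collect the legacy files found under one job directory.

    Hypothetical helper: the directory layout is assumed, not taken
    from the real importer.
    """

    def found(pattern: str) -> list[Path]:
        # Sorted so repeated scans return files in a stable order.
        return sorted(job_dir.glob(pattern))

    return {
        "job_record": found("job_record.json"),
        "round_manifests": found("rounds/*/manifest.json"),
        "round_reviews": found("rounds/*/review.json"),
        # Assumption: recording and image metadata are matched only by
        # their *.meta.json suffix, anywhere below the job directory.
        "section_meta": sorted(job_dir.rglob("*.meta.json")),
        "final": found("final.json"),
        "final_correction": found("final_correction.json"),
    }
```

An empty list for a key means the file is absent, which is expected for jobs that were never finalized or corrected.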
What this phase is for¶
This phase is not yet the runtime migration. It is for:
validating the schema against real jobs
discovering missing columns or weak table boundaries
testing import logic for jobs, rounds, finals, and artifact indexing
building confidence before replacing any file-backed runtime metadata path
Initial success conditions¶
The first phase is successful when all of the following are true:
Schema bootstrap works - PostgreSQL tables can be created from the current SQLAlchemy models.
Real jobs import cleanly - legacy jobs under the legacy storage root import without fatal errors.
Imported counts match filesystem reality - job count in PostgreSQL matches imported job directories; round count matches rounds/* directories; recording/image counts match section metadata files; final/correction counts match the archived files on disk.
Archived snapshots are queryable - final.json and final_correction.json land in the database as retained snapshots.
Artifact indexing is usable - uploaded audio, transcript text, uploaded images, report images, PDFs, DOCX, and GeoJSON are represented as artifact path records.
No runtime behavior changes - the live server runtime remains independent from the importer while the import path is being developed and validated.
Recommended verification queries¶
After import, verify:
jobs by status
archived jobs with finals
jobs with correction snapshots
rounds per job
recordings per section
artifacts per job/final/round
job number uniqueness
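A few of these checks are sketched below as plain SQL held in Python strings. The table and column names (jobs.id, jobs.job_number, jobs.status, rounds.job_id) are hypothetical and will need adjusting to the actual SQLAlchemy models. The sketch uses an in-memory sqlite3 database with made-up sample rows only so that it runs standalone; the SELECT statements stick to standard SQL that PostgreSQL also accepts.

```python
import sqlite3

# Hypothetical schema: jobs(id, job_number, status), rounds(id, job_id).
JOBS_BY_STATUS = (
    "SELECT status, COUNT(*) FROM jobs GROUP BY status ORDER BY status"
)
ROUNDS_PER_JOB = (
    "SELECT j.id, COUNT(r.id) FROM jobs j "
    "LEFT JOIN rounds r ON r.job_id = j.id "
    "GROUP BY j.id ORDER BY j.id"
)
# Any row returned here is a job number that appears more than once.
DUPLICATE_JOB_NUMBERS = (
    "SELECT job_number, COUNT(*) FROM jobs "
    "GROUP BY job_number HAVING COUNT(*) > 1"
)


def run_checks(conn: sqlite3.Connection) -> dict[str, list[tuple]]:
    """Run each verification query and return its rows keyed by check name."""
    return {
        "jobs_by_status": conn.execute(JOBS_BY_STATUS).fetchall(),
        "rounds_per_job": conn.execute(ROUNDS_PER_JOB).fetchall(),
        "duplicate_job_numbers": conn.execute(DUPLICATE_JOB_NUMBERS).fetchall(),
    }


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with made-up rows for trying the queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, job_number TEXT, status TEXT)")
    conn.execute("CREATE TABLE rounds (id INTEGER PRIMARY KEY, job_id INTEGER)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [(1, "J-001", "archived"), (2, "J-002", "open"), (3, "J-002", "open")],
    )
    conn.executemany("INSERT INTO rounds VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
    return conn
```

The LEFT JOIN in the rounds query keeps jobs with zero rounds in the result, which makes jobs whose rounds failed to import easier to spot.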
Next phase after success¶
Once the importer and schema are stable, begin replacing runtime metadata areas incrementally:
device auth and tokens
job metadata and assignments
round metadata and manifests
media metadata
final and correction metadata