Legacy Import
=============

Purpose
-------

The legacy importer is the first non-runtime validation step for the PostgreSQL
migration. It allows the schema to be exercised against real job data without
changing the live server workflow.

Tool
----

- ``tools/import_legacy_jobs.py``

Current import scope
--------------------

The importer reads the legacy job layout and loads it into PostgreSQL:

- ``job_record.json``
- ``rounds/*/manifest.json``
- ``rounds/*/review.json``
- recording ``*.meta.json``
- image ``*.meta.json``
- ``final.json``
- ``final_correction.json``
- artifact paths for uploaded media and generated outputs

What this phase is for
----------------------

This phase is not yet the runtime migration. It is for:

- validating the schema against real jobs
- discovering missing columns or weak table boundaries
- testing import logic for jobs, rounds, finals, and artifact indexing
- building confidence before replacing any file-backed runtime metadata path

Initial success conditions
--------------------------

The first phase is successful when all of the following are true:

1. Schema bootstrap works
   - PostgreSQL tables can be created from the current SQLAlchemy models.

2. Real jobs import cleanly
   - legacy jobs under the legacy storage root import without fatal errors.

3. Imported counts match filesystem reality
   - job count in PostgreSQL matches imported job directories
   - round count matches ``rounds/*`` directories
   - recording/image counts match section metadata files
   - final/correction counts match the archived files on disk

4. Archived snapshots are queryable

   - ``final.json`` and ``final_correction.json`` land in the database as
     retained snapshots.

5. Artifact indexing is usable

   - uploaded audio, transcript text, uploaded images, report images, PDFs,
     DOCX, and GeoJSON are represented as artifact path records.

6. No runtime behavior changes

   - the live server runtime remains independent from the importer while the
     import path is being developed and validated.

Recommended verification queries
--------------------------------

After import, verify:

- jobs by status
- archived jobs with finals
- jobs with correction snapshots
- rounds per job
- recordings per section
- artifacts per job/final/round
- job number uniqueness

Next phase after success
------------------------

Once the importer and schema are stable, begin replacing runtime metadata areas
incrementally:

1. device auth and tokens
2. job metadata and assignments
3. round metadata and manifests
4. media metadata
5. final and correction metadata