Archive Retention

Purpose

This note defines the current archive retention policy for a job's final round and, if present, its correction round.

Policy

When a job is archived:

Keep

  • immutable original final snapshot

  • current correction snapshot, if present

  • transcript text for the final round

  • transcript text for the correction round, if present

  • generated outputs:

    • final JSON

    • final/correction TRAQ PDF

    • final/correction report PDF

    • final/correction report DOCX

    • final/correction GeoJSON

  • retained report images and image artifacts referenced by final/correction

  • audit/event history

Prune

  • raw audio files from archived rounds

  • correction audio, once the correction transcript exists

  • review JSON for archived rounds

  • working rounds not referenced by the retained final/correction provenance
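The keep/prune rules above can be sketched as a single classification function. This is a minimal illustration, not the contents of app/archive_policy.py: the artifact kind names, the Artifact and Decision types, and the classify signature are all hypothetical, and raising an error on an unknown kind is an assumed design choice rather than stated policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    KEEP = "keep"
    PRUNE = "prune"


# Hypothetical kind names; the real identifiers may differ.
KEEP_KINDS = {
    "final_snapshot",
    "correction_snapshot",
    "final_transcript",
    "correction_transcript",
    "generated_output",
    "report_image",
    "audit_history",
}
PRUNE_KINDS = {"raw_audio", "review_json"}


@dataclass(frozen=True)
class Artifact:
    kind: str
    round_id: Optional[str]  # None for job-level artifacts such as audit history


def classify(artifact, retained_rounds, correction_transcript_exists):
    """Return the retention decision for one artifact of an archived job."""
    # Working rounds outside the retained final/correction provenance are pruned.
    if artifact.round_id is not None and artifact.round_id not in retained_rounds:
        return Decision.PRUNE
    # Raw audio and review JSON are processing inputs, not part of the record.
    if artifact.kind in PRUNE_KINDS:
        return Decision.PRUNE
    # Correction audio is only prunable once its transcript exists.
    if artifact.kind == "correction_audio":
        return Decision.PRUNE if correction_transcript_exists else Decision.KEEP
    if artifact.kind in KEEP_KINDS:
        return Decision.KEEP
    # Assumption: fail loudly instead of guessing for kinds the policy does not name.
    raise ValueError(f"no retention rule for artifact kind {artifact.kind!r}")
```

For example, under these assumptions, classify(Artifact("correction_audio", "r2"), {"r1", "r2"}, False) keeps the audio, and the same call with True prunes it.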

Why

The reporting and legal record is the retained final/correction snapshot plus its transcript and generated outputs. Raw audio is processing input, not part of the retained archive.

Implementation boundary

The current implementation is descriptive and testable, but not yet destructive: it reports what the policy would keep and prune, and deletes nothing.
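One way to picture a descriptive, non-destructive boundary is a function that only builds a plan and never touches storage. The sketch below is an assumption about the general shape, not the actual code: RetentionPlan, describe_retention, the should_prune callback, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RetentionPlan:
    keep: list = field(default_factory=list)
    prune: list = field(default_factory=list)


def describe_retention(artifact_names, should_prune):
    """Sort names into keep/prune lists. Performs no filesystem operations."""
    plan = RetentionPlan()
    for name in artifact_names:
        target = plan.prune if should_prune(name) else plan.keep
        target.append(name)
    return plan


# Hypothetical file names and rule, for illustration only.
plan = describe_retention(
    ["final.json", "round1.wav", "review.json"],
    should_prune=lambda name: name.endswith(".wav") or name == "review.json",
)
```

Because the plan is plain data, a test can assert on plan.keep and plan.prune directly, and a later destructive step could consume the same plan without changing the policy logic.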

Code: app/archive_policy.py

Tests: tests/test_archive_policy.py

Query support: tools/query_imported_jobs.py archive-retention