Module: app.extractors.registry

Extractor registry and section dispatch for structured extraction.

Authors:

Roger Erismann (https://hammerdirt.solutions), OpenAI Codex

Purpose:

Centralize the wiring from each section to its model class and prompt file, and expose a single canonical entrypoint, run_extraction(section_id, transcript), used by the server runtime.

Design:
  • EXTRACTOR_CONFIG maps section ids to model classes and prompt files.

  • A shared system prompt (system_common.txt) is used by default.

  • Dispatch delegates execution to common.run_outlines_extraction(…).
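The three design points above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the module's actual code: the section id "site_summary", the SiteSummary class (a stand-in for a Pydantic model), the fake_runner function, and its argument order are all hypothetical, since the real registry contents and the signature of common.run_outlines_extraction are not shown in this reference.

```python
from dataclasses import dataclass
from typing import Dict, Optional

DEFAULT_SYSTEM_PROMPT = "system_common.txt"  # shared default named in the design notes


@dataclass(frozen=True)
class ExtractorConfig:
    section_id: str
    model_cls: type
    section_prompt: str
    system_prompt: Optional[str] = None


class SiteSummary:
    """Hypothetical stand-in for a Pydantic model class."""

    def __init__(self, text: str) -> None:
        self.text = text


# Hypothetical registry content: one row per section id.
EXTRACTOR_CONFIG: Dict[str, ExtractorConfig] = {
    "site_summary": ExtractorConfig("site_summary", SiteSummary, "site_summary.txt"),
}


def fake_runner(model_cls, system_prompt, section_prompt, transcript):
    """Assumed stand-in for common.run_outlines_extraction (real signature unknown)."""
    if not transcript:
        raise ValueError("empty transcript")
    return model_cls(transcript)


def run_extraction(section_id: str, transcript: str, runner=fake_runner):
    config = EXTRACTOR_CONFIG[section_id]  # raises KeyError for unregistered ids
    # Fall back to the shared system prompt when the row does not override it.
    system_prompt = (
        config.system_prompt
        if config.system_prompt is not None
        else DEFAULT_SYSTEM_PROMPT
    )
    return runner(config.model_cls, system_prompt, config.section_prompt, transcript)
```

Keeping the registry as plain data means that adding a section is a one-row change, while the execution path stays in a single shared runner.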

class app.extractors.registry.ExtractorConfig(section_id, model_cls, section_prompt, system_prompt=None)

Bases: object

Static registry row describing the extraction contract for one section.

Parameters:
  • section_id (str)

  • model_cls (Type[BaseModel])

  • section_prompt (str)

  • system_prompt (str | None)

section_id

Canonical section key.

Type:

str

model_cls

Pydantic model class expected from extraction.

Type:

Type[pydantic.main.BaseModel]

section_prompt

Prompt filename under the section directory.

Type:

str

system_prompt

Optional section-specific system prompt filename. If None, the registry uses system_common.txt.

Type:

str | None
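As an illustration of the attributes above, one plausible shape for such a row is a frozen dataclass. Whether the real class is a dataclass is an assumption, and the helper effective_system_prompt, the section id "hazards", and the filenames other than system_common.txt are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a registry row should not change after import
class ExtractorConfig:
    section_id: str
    model_cls: type  # Type[BaseModel] in the documented signature
    section_prompt: str
    system_prompt: Optional[str] = None


def effective_system_prompt(config: ExtractorConfig) -> str:
    """Hypothetical helper: apply the documented None fallback."""
    if config.system_prompt is None:
        return "system_common.txt"
    return config.system_prompt


# dict is used here only as a placeholder for a model class.
default_row = ExtractorConfig("hazards", dict, "hazards.txt")
custom_row = ExtractorConfig("hazards", dict, "hazards.txt", "system_hazards.txt")
```

With these rows, effective_system_prompt(default_row) resolves to the shared prompt and effective_system_prompt(custom_row) resolves to the section-specific override.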

app.extractors.registry.run_extraction(section_id, transcript)

Dispatch a section transcript to the extractor configured for that section.

Parameters:
  • section_id (str) – Section key to resolve in EXTRACTOR_CONFIG.

  • transcript (str) – Transcript text for that section.

Returns:

Parsed Pydantic model instance for the configured section.

Raises:
  • KeyError – Unknown/unregistered section id.

  • ValueError – Empty transcript (raised by common runtime).

  • RuntimeError – Missing API key or model configuration.

Return type:

BaseModel
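A caller might handle the three documented exceptions separately, for example to turn them into distinct error messages. The sketch below is self-contained, so run_extraction is replaced by a stub that only mimics the documented error behaviour. The stub body, the section id "site_summary", and the wrapper extract_or_explain are hypothetical; in the real application the import would presumably come from app.extractors.registry.

```python
from typing import Optional, Tuple


def run_extraction(section_id: str, transcript: str):
    """Stub mimicking the documented errors; not the real implementation."""
    registered = {"site_summary"}  # hypothetical registered section id
    if section_id not in registered:
        raise KeyError(section_id)
    if not transcript:
        raise ValueError("empty transcript")
    # The real function returns a parsed Pydantic model instance.
    return {"section_id": section_id, "text": transcript}


def extract_or_explain(
    section_id: str, transcript: str
) -> Tuple[Optional[object], Optional[str]]:
    """Return (result, None) on success or (None, message) on a documented error."""
    try:
        return run_extraction(section_id, transcript), None
    except KeyError:
        return None, f"unknown section id: {section_id!r}"
    except ValueError as exc:
        return None, f"invalid transcript: {exc}"
    except RuntimeError as exc:
        return None, f"configuration problem: {exc}"
```

Separating the cases lets a server map an unknown section or an empty transcript to a client error, and a missing API key or model configuration to a server-side error.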