Multimodal Figure Flow (Full Size)

Use browser zoom if needed for presentation or screenshots.

flowchart LR
    A["PDF figure or image detected"] --> B["Extract image region and metadata (page, figure ID)"]
    B --> C["Generate image caption"]
    C --> D["Create caption embedding"]
    D --> E[(PostgreSQL plus pgvector)]
    F["User asks about a chart or figure"] --> G["Question embedding"]
    G --> E
    E --> H["Retrieve matching figure captions"]
    H --> I["LLM answers with figure-grounded context"]
    E --> J["Admin reviews figure captions"]
    J --> K["Edit captions and hide nonsensical figures"]
    K --> E
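
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory dictionary stands in for PostgreSQL plus pgvector, and `embed` is a toy bag-of-words vector rather than a real embedding model. The names `FigureStore`, `FigureRecord`, `embed`, and `cosine` are hypothetical and do not come from the diagram.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class FigureRecord:
    figure_id: str
    page: int
    caption: str
    embedding: Counter = field(default_factory=Counter)
    hidden: bool = False  # set during admin review


class FigureStore:
    """In-memory stand-in for the caption table (assumption, not pgvector)."""

    def __init__(self) -> None:
        self.records: dict[str, FigureRecord] = {}

    def add(self, figure_id: str, page: int, caption: str) -> None:
        # Ingestion path: caption -> caption embedding -> store.
        self.records[figure_id] = FigureRecord(
            figure_id, page, caption, embed(caption)
        )

    def edit_caption(self, figure_id: str, caption: str) -> None:
        # Admin path: an edited caption is re-embedded before it is stored.
        record = self.records[figure_id]
        record.caption = caption
        record.embedding = embed(caption)

    def hide(self, figure_id: str) -> None:
        # Admin path: hidden figures are excluded from retrieval.
        self.records[figure_id].hidden = True

    def search(self, question: str, k: int = 2) -> list[FigureRecord]:
        # Query path: question embedding -> nearest visible captions.
        query = embed(question)
        visible = [r for r in self.records.values() if not r.hidden]
        visible.sort(key=lambda r: cosine(query, r.embedding), reverse=True)
        return visible[:k]
```

The captions returned by `search` would then be passed to the LLM as context. Storing the embedding alongside the caption, and recomputing it in `edit_caption`, keeps admin edits and retrieval in sync.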