Multimodal Figure Flow (Full Size)
Use browser zoom if needed for presentation or screenshots.
flowchart LR
A["PDF figure or image detected"] --> B["Extract image region with page and figure ID metadata"]
B --> C["Generate image caption"]
C --> D["Create caption embedding"]
D --> E[(PostgreSQL plus pgvector)]
F["User asks about a chart or figure"] --> G["Question embedding"]
G --> E
E --> H["Retrieve matching figure captions"]
H --> I["LLM answers with figure-grounded context"]
E --> J["Admin reviews figure captions"]
J --> K["Edit captions and hide nonsensical figures"]
K --> E
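
The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `FigureStore` is a hypothetical in-memory substitute for the PostgreSQL plus pgvector table, and the names `FigureRecord`, `edit_caption`, `hide`, and `search` are invented for this example. Caption generation and the final LLM call are left out.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class FigureRecord:
    figure_id: str
    page: int
    caption: str
    hidden: bool = False  # set by an admin for nonsensical figures


class FigureStore:
    """Hypothetical in-memory stand-in for the PostgreSQL plus pgvector table."""

    def __init__(self) -> None:
        self._rows = {}  # figure_id -> (FigureRecord, caption vector)

    def upsert(self, record: FigureRecord) -> None:
        # Re-embed on every write so edited captions stay searchable.
        self._rows[record.figure_id] = (record, embed(record.caption))

    def edit_caption(self, figure_id: str, caption: str) -> None:
        record, _ = self._rows[figure_id]
        record.caption = caption
        self.upsert(record)

    def hide(self, figure_id: str) -> None:
        self._rows[figure_id][0].hidden = True

    def search(self, question: str, k: int = 3) -> list[FigureRecord]:
        # Embed the question, skip hidden figures, keep the k best matches.
        q = embed(question)
        scored = [
            (cosine(q, vec), rec)
            for rec, vec in self._rows.values()
            if not rec.hidden
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [rec for score, rec in scored[:k] if score > 0]


if __name__ == "__main__":
    store = FigureStore()
    store.upsert(FigureRecord("fig-1", 3, "Bar chart of quarterly revenue by region"))
    store.upsert(FigureRecord("fig-2", 7, "Scatter plot of latency against load"))
    for hit in store.search("What does the revenue chart show?"):
        print(hit.figure_id, hit.page, hit.caption)
```

The retrieved captions would then be passed to the LLM as grounding context. Re-embedding inside `upsert` mirrors the `K --> E` edge: an admin edit writes back to the same store that serves retrieval, so corrected captions take effect on the next question.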