🔥 Repo Roaster

Paste any GitHub URL. The framework fetches the repo data, indexes it into the local Orama vector DB, and streams a roast in the voice of a senior architect, showcasing Local RAG and Speculative Streaming.

Local RAG · Speculative Stream
🔗 GitHub Repository URL
Draft Stream   Speculative Stream
🔥 Waiting for a GitHub URL…
🔍 RAG Context Retrieved
📚 Vector search results will appear here
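
The fetch, index, and roast flow described above can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: `parseGitHubUrl`, `chunkText`, and `buildRoastPrompt` are hypothetical helper names, the chunk sizes are arbitrary, and the network fetch and embedding steps are left out so the sketch stays self-contained.

```typescript
// Hypothetical helpers for illustration; none of these names come from the demo.

interface RepoRef {
  owner: string;
  repo: string;
}

// Extract the owner and repository name from a pasted GitHub URL.
function parseGitHubUrl(input: string): RepoRef | null {
  const m = /^https?:\/\/(?:www\.)?github\.com\/([^\/\s?#]+)\/([^\/\s?#]+)/.exec(input.trim());
  if (m === null) return null;
  return { owner: m[1], repo: m[2].replace(/\.git$/, "") };
}

// Split text into overlapping fixed-size chunks so each one can be embedded
// and indexed as its own document.
function chunkText(text: string, size = 200, overlap = 40): string[] {
  if (size <= overlap) throw new Error("size must exceed overlap");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Assemble the prompt for the local model from the chunks that retrieval returned.
function buildRoastPrompt(repo: RepoRef, retrieved: string[]): string {
  const context = retrieved.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return (
    `You are a senior architect reviewing ${repo.owner}/${repo.repo}.\n` +
    `Context:\n${context}\n\nWrite a short, pointed critique.`
  );
}
```

In a real pipeline the retrieved chunks would come from a vector search over the indexed repo data rather than being passed in by hand.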

🧠 Knowledge Nexus

Index text or URLs into the live Orama vector database and watch chunks appear as nodes on the 2-D local retrieval map. Query to see similarity lines.

Vector DB · Lite Embeddings
✍️ Index Text
🌐 Index URL
πŸ” Hybrid Query
πŸ“š Indexed Sources
πŸ“­ No documents indexed yet
Nodes: 0
Last embed: β€”
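
To make the indexing and hybrid query idea concrete, here is a toy in-memory index. It is a stand-in under stated assumptions, not Orama's API: `embed` is a crude hashed bag-of-words vector rather than a real embedding model, `TinyVectorIndex` is a hypothetical class, and the 0.7 / 0.3 weighting between vector similarity and keyword overlap is an arbitrary choice.

```typescript
// Toy stand-in for an embedding model plus vector store; not Orama's API.

type Vec = number[];

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Hash each token into a fixed-size bag-of-words vector (a crude "embedding").
function embed(text: string, dims = 64): Vec {
  const v: Vec = [];
  for (let i = 0; i < dims; i++) v.push(0);
  for (const word of tokenize(text)) {
    let h = 0;
    for (let i = 0; i < word.length; i++) h = (h * 31 + word.charCodeAt(i)) >>> 0;
    v[h % dims] += 1;
  }
  return v;
}

function cosine(a: Vec, b: Vec): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

interface Hit {
  id: number;
  text: string;
  score: number;
}

class TinyVectorIndex {
  private docs: { text: string; vec: Vec }[] = [];

  // Index one chunk and return its node id.
  insert(text: string): number {
    this.docs.push({ text, vec: embed(text) });
    return this.docs.length - 1;
  }

  // Hybrid score: weighted mix of vector similarity and the fraction of
  // query tokens that literally appear in the document.
  search(query: string, k = 3, vectorWeight = 0.7): Hit[] {
    const qVec = embed(query);
    const qWords = tokenize(query);
    return this.docs
      .map((d, id) => {
        const words = tokenize(d.text);
        const shared = qWords.filter((w) => words.indexOf(w) !== -1).length;
        const keyword = qWords.length === 0 ? 0 : shared / qWords.length;
        const score = vectorWeight * cosine(qVec, d.vec) + (1 - vectorWeight) * keyword;
        return { id, text: d.text, score };
      })
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }
}
```

Each returned `score` could drive the thickness of a similarity line between the query and a node on the retrieval map.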

💬 Sovereign Chat

Chat with a local-only LLM. Toggle "Technical Overlay" to reveal which tokens came from the Draft model vs the Speculative target.

100% Local · Token Transparency
🔒 Local Inference: Zero Data Egress
Draft: 0 Spec: 0 Local: 100%
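
The Draft and Spec counters can be understood through a small sketch of greedy speculative decoding: a cheap draft model proposes a few tokens ahead, the target model checks them, and each emitted token is tagged with where it came from. This is an illustration under assumptions, not the demo's engine. `speculativeDecode`, `fromScript`, and the `"draft"` / `"spec"` tags are hypothetical names, and the two models are simulated by fixed token scripts.

```typescript
// A "model" here is any function returning the next token, or null at end of sequence.
type NextToken = (context: string[]) => string | null;

interface TaggedToken {
  text: string;
  source: "draft" | "spec";
}

function speculativeDecode(
  draft: NextToken,
  target: NextToken,
  prompt: string[],
  lookahead = 4,
  maxTokens = 32
): TaggedToken[] {
  const out: TaggedToken[] = [];
  const context = prompt.slice();
  while (out.length < maxTokens) {
    // 1. The draft model cheaply proposes up to `lookahead` tokens.
    const proposed: string[] = [];
    const draftCtx = context.slice();
    for (let i = 0; i < lookahead; i++) {
      const t = draft(draftCtx);
      if (t === null) break;
      proposed.push(t);
      draftCtx.push(t);
    }
    // 2. The target checks proposals in order. (A real engine would score
    //    them in one batched forward pass; here it is called per token.)
    let rejected = false;
    for (const p of proposed) {
      if (out.length >= maxTokens) return out;
      const expected = target(context);
      if (expected === null) return out;
      if (expected === p) {
        out.push({ text: p, source: "draft" }); // proposal accepted
        context.push(p);
      } else {
        out.push({ text: expected, source: "spec" }); // target's correction
        context.push(expected);
        rejected = true;
        break;
      }
    }
    // 3. If nothing was rejected, the target contributes one token itself.
    if (!rejected) {
      if (out.length >= maxTokens) return out;
      const t = target(context);
      if (t === null) return out;
      out.push({ text: t, source: "spec" });
      context.push(t);
    }
  }
  return out;
}

// Simulated models that replay a fixed script, indexed by context length.
const fromScript = (script: string[]): NextToken => (ctx) =>
  ctx.length < script.length ? script[ctx.length] : null;

const tagged = speculativeDecode(
  fromScript(["the", "repo", "needs", "no", "tests"]), // draft guesses "needs"
  fromScript(["the", "repo", "has", "no", "tests"]), // target wants "has"
  []
);
const draftCount = tagged.filter((t) => t.source === "draft").length; // 4
const specCount = tagged.filter((t) => t.source === "spec").length; // 1
```

With greedy verification the joined output always matches what the target alone would have produced; the tags only record which model supplied each token first.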

⚡ Multi-Tab Stress Test

Launch 5 browser windows, each running simultaneous inference. The VRAM meter stays flat, showing that all tabs share one SharedWorker instance.

VRAM Dedup · SharedWorker
🖥️
Global VRAM Deduplication
5 windows. 1 ONNX model instance. The Multiplexer routes all requests through a single SharedWorker, so the model never loads twice.
⬜ Tab 1
⬜ Tab 2
⬜ Tab 3
⬜ Tab 4
⬜ Tab 5
📊 Live Metrics 0
📋 Verification Artifact
🔬 Run the stress test to generate a verification report
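
The single-instance behaviour can be sketched as a small multiplexer that loads the model lazily and shares one load promise across every connected tab. This is an assumption-laden sketch, not the framework's Multiplexer: the `Multiplexer` class below, `PortLike`, `Model`, and `loadCount` are hypothetical names, and the transport is abstracted so the logic can run outside a browser.

```typescript
// Minimal shape of a MessagePort; browser ports satisfy it structurally.
interface PortLike {
  postMessage(message: unknown): void;
}

interface InferenceRequest {
  id: number;
  prompt: string;
}

interface Model {
  generate(prompt: string): Promise<string>;
}

class Multiplexer {
  private modelPromise: Promise<Model> | null = null;
  private loadModel: () => Promise<Model>;
  loadCount = 0; // how many times the loader actually ran

  constructor(loadModel: () => Promise<Model>) {
    this.loadModel = loadModel;
  }

  // Load the model on first use; concurrent callers share the same promise,
  // so the loader runs once no matter how many tabs connect.
  private getModel(): Promise<Model> {
    if (this.modelPromise === null) {
      this.loadCount++;
      this.modelPromise = this.loadModel();
    }
    return this.modelPromise;
  }

  // Handle one request from one tab and reply on that tab's own port.
  async handle(port: PortLike, req: InferenceRequest): Promise<void> {
    const model = await this.getModel();
    const text = await model.generate(req.prompt);
    port.postMessage({ id: req.id, text });
  }
}

// Inside a SharedWorker script this could be wired up roughly as follows
// (loadOnnxModel is a hypothetical loader):
//   const mux = new Multiplexer(loadOnnxModel);
//   self.onconnect = (e: MessageEvent) => {
//     const port = e.ports[0];
//     port.onmessage = (m) => mux.handle(port, m.data);
//   };
```

Each tab would then create `new SharedWorker(url)` with the same script URL and talk to it through `worker.port`, which is what lets the browser route all five windows to one worker.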