RAGGR™ LLM Benchmark
BodhanAI ✍
Compare answer quality across AI models — same documents, same retrieval
Pool: genai_rag
Select Models
Choose which models to benchmark — results use identical FAISS retrieval
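A minimal sketch of what "identical retrieval" can mean in practice: retrieve once per question, then hand the same passages to every model, so that differences in the answers come from the models rather than from the retrieved context. Every name here (`run_benchmark`, `toy_retrieve`, `model_a`, `model_b`) is hypothetical and not taken from RAGGR, and the keyword matcher is only a stand-in for the FAISS index the page mentions.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases for illustration; not part of any RAGGR API.
Retriever = Callable[[str], List[str]]
Model = Callable[[str, List[str]], str]


def run_benchmark(
    questions: List[str],
    retrieve: Retriever,
    models: Dict[str, Model],
) -> Dict[str, List[dict]]:
    """Run every model on every question using one shared retrieval per question."""
    results: Dict[str, List[dict]] = {name: [] for name in models}
    for question in questions:
        # Retrieve exactly once, so all models see the same context.
        passages = retrieve(question)
        for name, model in models.items():
            # Pass a copy so one model cannot mutate another model's input.
            answer = model(question, list(passages))
            results[name].append(
                {"question": question, "passages": passages, "answer": answer}
            )
    return results


# Toy corpus and keyword retriever standing in for a real vector index.
DOCS = [
    "FAISS is a library for similarity search.",
    "RAG combines retrieval with generation.",
]
retrieval_calls: List[str] = []


def toy_retrieve(question: str) -> List[str]:
    retrieval_calls.append(question)  # record calls to show retrieval is shared
    words = set(question.lower().split())
    return [d for d in DOCS if words & set(d.lower().rstrip(".").split())]


def model_a(question: str, passages: List[str]) -> str:
    # Stand-in "model": answers with the first passage only.
    return passages[0] if passages else "I don't know."


def model_b(question: str, passages: List[str]) -> str:
    # Stand-in "model": answers with all passages joined together.
    return " ".join(passages) if passages else "I don't know."


report = run_benchmark(
    ["what is faiss"], toy_retrieve, {"model_a": model_a, "model_b": model_b}
)
```

With this structure the retriever runs once per question no matter how many models are selected, which keeps the comparison fair and avoids repeated index lookups.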
Eval Questions
Built-in (17 questions — E1-E5, H1-H10, OOS1-2)
Upload JSON eval file
Filter: All (17) | Easy (5) | Hard (10) | Out-of-Scope (2)
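The built-in ID ranges and the filter counts can be cross-checked with a short sketch. It assumes, without confirmation from the page, that the prefixes `E`, `H`, and `OOS` correspond to the Easy, Hard, and Out-of-Scope filters; `expand_ids` and `PREFIX_LABELS` are hypothetical helper names.

```python
import re
from collections import Counter
from typing import List

# Assumed mapping from ID prefix to filter label (inferred from the counts).
PREFIX_LABELS = {"E": "Easy", "H": "Hard", "OOS": "Out-of-Scope"}


def expand_ids(spec: str) -> List[str]:
    """Expand a spec such as 'E1-E5, OOS1-2' into individual question IDs."""
    ids: List[str] = []
    for part in spec.split(","):
        # Prefix letters, a start number, and an optional '-<prefix?><end>' tail.
        m = re.fullmatch(r"\s*([A-Za-z]+)(\d+)(?:-[A-Za-z]*(\d+))?\s*", part)
        if m is None:
            raise ValueError(f"unrecognized ID range: {part!r}")
        prefix = m.group(1)
        start = int(m.group(2))
        end = int(m.group(3) or m.group(2))
        ids.extend(f"{prefix}{n}" for n in range(start, end + 1))
    return ids


question_ids = expand_ids("E1-E5, H1-H10, OOS1-2")

# Strip trailing digits to recover the prefix, then count per filter label.
counts = Counter(PREFIX_LABELS[re.sub(r"\d+$", "", qid)] for qid in question_ids)
```

Under that assumption the ranges expand to 17 IDs split 5 / 10 / 2, matching the filter counts shown above.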
▶ Run Benchmark
Per-Question Breakdown
Question | Type | Diff | Model | Correct? | Cited? | Faithful | Latency | Cost
Download JSON Report
Copy Markdown Summary
Past Runs
Click to load any previous benchmark