Evaluating the best AI providers for our daily operations
Three evaluation pillars across two rounds of providers.
Round 1: weighted evaluation across all categories.
| Category | Weight | | | |
|---|---|---|---|---|
| Coding & Logic | 45% | 10 | 9 | 7 |
| Cross-Repo Context | 25% | 8 | 6 | 10 |
| Team Collaboration (Drive/Slides/Doc) | 20% | 7 | 7 | 10 |
| Data Analysis & Scripting | 10% | 7 | 10 | 8 |
| Total Score | 100% | 8.75 | 8.15 | 8.45 |
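
A minimal sketch of how a weighted total can be derived from category weights and scores, assuming the total is a plain weighted sum. The weights and scores are copied from the third score column of the table above; the function name `weighted_total` and the dictionary layout are illustrative assumptions, not part of the original evaluation.

```python
# Weights from the table above, expressed as fractions of 1.
weights = {
    "Coding & Logic": 0.45,
    "Cross-Repo Context": 0.25,
    "Team Collaboration": 0.20,
    "Data Analysis & Scripting": 0.10,
}

# Scores from the third score column of the table above.
scores = {
    "Coding & Logic": 7,
    "Cross-Repo Context": 10,
    "Team Collaboration": 10,
    "Data Analysis & Scripting": 8,
}


def weighted_total(weights, scores):
    """Return the weighted sum of scores (hypothetical helper).

    Raises ValueError if the weights do not sum to 1 or if a
    weighted category has no score.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = [c for c in weights if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return sum(weights[c] * scores[c] for c in weights)


total = weighted_total(weights, scores)
print(round(total, 2))  # 8.45
```

For this column the plain weighted sum reproduces the stated 8.45. Other columns may not match their stated totals exactly under this assumption, which may reflect unrounded sub-scores that the table does not show.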
Round 2: weighted evaluation across all categories.
| Category | Weight | | | | |
|---|---|---|---|---|---|
| Software & Architecture | 40% | 9.0 | 8.5 | 10.0 | 8.5 |
| Data & Strategy | 30% | 9.5 | 9.5 | 8.0 | 8.5 |
| Deliverables & Collateral | 15% | 10.0 | 8.0 | 9.0 | 6.5 |
| Cost & Latency (TPS) | 15% | 6.5 | 7.5 | 9.0 | 10.0 |
| Total Score | 100% | 8.85 | 8.55 | 9.15 | 8.30 |
Naura Data Team · LLM Selection · 2026