Start with a 14-day Pro trial.
Download AiBenchLab and test Pro features for 14 days. After the trial, continue in Limited Mode or upgrade to keep full Pro access.
Lifetime license — not a subscription. Your benchmark data stays on your machine.
Founding 100 pricing
The first 100 customers lock in 50% off Pro or Agency lifetime licenses.
Pro
For developers, evaluators, and small teams testing AI models seriously.
regularly $899 lifetime
optional updates $199/year
- ✓ Local Windows desktop app
- ✓ Zero telemetry
- ✓ Test local + cloud models
- ✓ All 11 benchmark domains
- ✓ Reproducible scoring + reports
No credit card required. After trial, continue in Limited Mode or upgrade.
Agency
For consultants, agencies, and AI service providers delivering client work.
regularly $4,999 lifetime
optional updates $999/year
- ✓ White-label PDF reports
- ✓ Custom logo + Prepared For
- ✓ CLI / API / MCP access
- ✓ Client-ready exports
- ✓ Batch report production
Includes Pro features plus consultant-grade delivery tools.
Enterprise
For organizations that need compliance, auditability, and managed deployment.
annual contract · custom terms
- ✓ Complete audit trail
- ✓ Tamper-evident integrity
- ✓ Chain of custody reports
- ✓ SIEM / compliance exports
- ✓ Offline activation + SLA
Includes Agency features plus enterprise compliance and support.
Full feature comparison.
| Feature | Limited (after trial) | Pro | Agency | Enterprise |
|---|---|---|---|---|
| Core Benchmarking Engine | ||||
| Eval engine · composite Score · domain breakdown | ✓ | ✓ | ✓ | ✓ |
| Per-test scoring · pass/fail · anomaly detection | ✓ | ✓ | ✓ | ✓ |
| Empirical context-window validation | ✓ | ✓ | ✓ | ✓ |
| Fixed-seed reproducibility · hardware fingerprint | ✓ | ✓ | ✓ | ✓ |
| Latency metrics · hardware-efficiency analysis | ✓ | ✓ | ✓ | ✓ |
| Wizard · Recommendation · Models | ||||
| 8-step wizard · live monitor · in-wizard cost estimate | ✓ | ✓ | ✓ | ✓ |
| Recommendation engine (VRAM fit · intent · budget) | ✓ | ✓ | ✓ | ✓ |
| All local models (llama.cpp · Ollama · LM Studio · CUDA · Vulkan) | ✓ | ✓ | ✓ | ✓ |
| All cloud models — bring your own key | ✓ | ✓ | ✓ | ✓ |
| Full model catalog & discovery | ✓ | ✓ | ✓ | ✓ |
| Wizard client-name field | — | — | ✓ | ✓ |
| Test Suites | ||||
| Reasoning · Code · Chat (3 core) | ✓ | ✓ | ✓ | ✓ |
| Deployment Risk · Adversarial Safety · Tool Calling | — | ✓ | ✓ | ✓ |
| Multimodal · Multi-Turn · Agentic | — | ✓ | ✓ | ✓ |
| Per-run enable/disable tests | ✓ | ✓ | ✓ | ✓ |
| Custom suite creation from built-in tests + save + version lock | — | ✓ | ✓ | ✓ |
| Custom test creation (consultant-grade) | — | — | ✓ | ✓ |
| Comparison & History | ||||
| Models compared per session | 3 | 10 | Not limited* | Not limited* |
| Side-by-side · domain-by-domain · mix local+cloud | ✓ | ✓ | ✓ | ✓ |
| Saved session history (local, your data) | Not limited* | Not limited* | Not limited* | Not limited* |
| Export & Reporting | ||||
| PDF report branding | Watermarked | Branded | White-label | White-label |
| Full professional report (cover · charts · latency · cost · appendices) | — | ✓ | ✓ | ✓ |
| CSV export (spreadsheet analysis) | — | ✓ | ✓ | ✓ |
| MBX export (verification artifact) | ✓ | ✓ | ✓ | ✓ |
| Standalone cost-report PDF (AiBenchLab-branded) | — | ✓ | ✓ | ✓ |
| JSON full-audit export (the report-generation feed) | — | — | ✓ | ✓ |
| Batch export (mass report production) | — | — | ✓ | ✓ |
| Custom logo · "Prepared For" client name · business profile | — | — | ✓ | ✓ |
| Commercial client-delivery rights | — | — | ✓ | ✓ |
| Signed MBX Attestation (cryptographic proof) | — | — | — | ✓ |
| Automation — Run Queue | ||||
| Manual run queue + controls (stop / delete / requeue) | ✓ | ✓ | ✓ | ✓ |
| Headless Automation — CLI · API · MCP (separate download) | ||||
| CLI — headless runs · scripting · JSON out | — | — | ✓ | ✓ |
| REST API — execute · list · export · status | — | — | ✓ | ✓ |
| MCP integration — remote control from AI tools | — | — | ✓ | ✓ |
| Hardware Safety · Security · Setup | ||||
| Thermal protection · cooldown · VRAM monitoring | ✓ | ✓ | ✓ | ✓ |
| Tamper detection · encrypted keys · Ed25519 · local-only | ✓ | ✓ | ✓ | ✓ |
| Component manager · on-device judge · first-run setup | ✓ | ✓ | ✓ | ✓ |
| Built-in plugin management + integrity validation | ✓ | ✓ | ✓ | ✓ |
| Enterprise Compliance & Audit | ||||
| Complete audit trail (every action logged) | — | — | — | ✓ |
| Tamper-evident integrity (SHA-256 hash chain) | — | — | — | ✓ |
| Chain of custody reports | — | — | — | ✓ |
| Audit CLI commands | — | — | — | ✓ |
| SIEM-compatible exports (JSON/CSV) | — | — | — | ✓ |
| Integrity verification command | — | — | — | ✓ |
| Offline activation (air-gapped deployment) | — | — | — | ✓ |
| Deployment & Licensing | ||||
| Seats | 1 | 1 | 3 | 10+ (custom) |
| Custom test plugins | — | — | — | ✓ |
| Procurement (PO/invoice · MSA · custom terms) | — | — | — | ✓ |
| Support SLA · account manager · onboarding | — | — | — | ✓ |
| Commercial | ||||
| License model | Limited Mode | Lifetime | Lifetime | Annual |
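
The "SHA-256 hash chain" named in the Enterprise rows is a general technique: each log entry stores a hash of its own content combined with the previous entry's hash, so editing any earlier entry breaks every later link. The sketch below illustrates the idea only. It is not AiBenchLab's implementation, and the function names, record layout, and sample actions are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry whose hash depends on everything logged before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry makes this return False."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append(log, {"action": "run_started", "model": "example-model"})
    append(log, {"action": "report_exported"})
    print(verify(log))                       # chain is intact
    log[0]["payload"]["model"] = "edited"    # simulate tampering with an old entry
    print(verify(log))                       # verification now fails
```

A chain like this shows that a log was altered, but not who altered it. Signing the entries (for example with Ed25519, which the table lists separately) is the usual complement.
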
Frequently Asked Questions
After the 14-day Pro trial, AiBenchLab continues in Limited Mode with restricted features and watermarked outputs. You keep access to 3 core test domains (Reasoning, Code, Chat). Upgrade to Pro anytime to restore full access to all 11 domains, professional reports, and advanced features.
No. Pro and Agency are lifetime licenses — pay once, own forever. Optional annual updates ($199/yr for Pro, $999/yr for Agency) keep benchmarks, reports, and compatibility current as AI models evolve. Enterprise is sold as an annual contract with updates included.
No. AiBenchLab is a local Windows desktop app. All benchmark data stays on your machine. There is no cloud sync, no telemetry, no account required. For cloud AI providers (OpenAI, Anthropic, etc.), prompts are sent to their APIs as part of normal usage — but results are stored locally.
You pay once and own that version of AiBenchLab forever. This is not SaaS — there is no server on our end running your benchmarks, no account that gets deactivated. Your data never leaves your machine unless you explicitly use a cloud AI provider. AiBenchLab runs on your hardware, stores data in your local database, and works without an internet connection.
Updates are optional. Pro renewal is $199/yr; Agency renewal is $999/yr. This keeps you current with version upgrades, new test suites and domains, new provider integrations, and security patches. If you don't renew, nothing breaks — your current version continues to work exactly as it did. If you renew later, you pick up where you left off.
Pro is licensed for internal use and business model evaluation — including company teams testing their own models. Agency unlocks white-label reports, commercial client-delivery rights, headless automation (CLI/API/MCP), and batch export. If you're delivering paid reports to clients, you need Agency.
Yes. You pay only the difference between your current tier and the new one.
30-day satisfaction promise. Talk to us within the first 30 days, and if we can't solve your problem, you get a full refund. Fair enough?
The first 100 customers get 50% off lifetime licenses: Pro for $449 (regularly $899) and Agency for $2,499 (regularly $4,999). Founding members also get direct access to the developer and priority on feature requests.
The Founding 100 counter on the pricing page shows how many spots remain. Once 100 licenses are sold, the program closes and prices return to standard rates.
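For readers who want to check the numbers, here is a minimal sketch of the pricing arithmetic quoted above. The regular prices come from this page. The helper names are hypothetical, and the upgrade calculation assumes "the difference" means the new tier's price minus what you already paid, which this page does not spell out.

```python
# Illustrative arithmetic only; these helpers are not part of AiBenchLab.
REGULAR_PRICE = {"pro": 899, "agency": 4999}  # regular lifetime prices from this page


def founding_price(tier: str) -> int:
    """Half the regular price, rounded down to a whole dollar ($449 and $2,499)."""
    return REGULAR_PRICE[tier] // 2


def upgrade_cost(already_paid: int, new_tier_price: int) -> int:
    """Assumption: an upgrade costs the new tier's price minus what was already paid."""
    return max(new_tier_price - already_paid, 0)


if __name__ == "__main__":
    pro = founding_price("pro")
    agency = founding_price("agency")
    print(f"Founding Pro: ${pro}, Founding Agency: ${agency:,}")
    print(f"Pro to Agency at founding prices: ${upgrade_cost(pro, agency):,}")
    print(f"Pro to Agency at regular prices: ${upgrade_cost(899, 4999):,}")
```

Whether an upgrade is priced against founding or regular rates is a question for the vendor. The sketch shows both cases.
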
Pro includes all 11 test domains: Reasoning, Code, Chat, Deployment Risk, Adversarial Safety, Tool Calling, Multimodal, Multi-Turn Adversarial, Agentic, Agentic Email, and Context Retention. Limited Mode (post-trial) includes the 3 core domains: Reasoning, Code, and Chat. Agency adds custom test creation for consultant-grade evaluations.
For local models (Ollama, LM Studio): No. Everything runs locally. Your prompts, responses, and results never leave your machine. For cloud providers (OpenAI, Anthropic, Gemini, Grok, Groq): Test prompts are sent to their API as part of normal API usage. Results are stored locally.
Windows 10/11 (64-bit) at launch. macOS and Linux builds are planned.
For testing cloud models (OpenAI, Anthropic, etc.), no — any machine works. For testing local models, you need whatever hardware those models require (typically a GPU with sufficient VRAM). GPU Fit detection in the Model Catalog helps you find models that fit your hardware.
Because there is no other tool that does what AiBenchLab does. Public benchmarks test on cloud servers under ideal conditions. AiBenchLab is the only professional-grade benchmarking application that runs 998 scoring dimensions across 254 tests and 11 domains on YOUR hardware, with deployment risk scoring, forensic reporting, and deterministic evaluation. You're not comparing this to a $10/month SaaS — you're comparing it to the cost of deploying the wrong model in production.
All providers are available on every tier, including Limited Mode. Local: built-in llama.cpp server, Ollama, LM Studio, LocalAI. Cloud: OpenAI, Anthropic, Google Gemini, xAI (Grok), Groq. Plus any custom OpenAI-compatible endpoint. That's 10 providers on day one.
"Not limited" is not the same as "unlimited." It means the app imposes no artificial cap: saved history is bounded only by your disk, and live comparison only by your hardware. Where we do cap, you see the number (Limited 3 / Pro 10). We never advertise "unlimited" because an honest measurement tool doesn't make claims it can't measure.
Ready to benchmark?
Start your 14-day Pro trial today. No credit card required.
After trial, continue in Limited Mode or upgrade to keep full Pro access.