Format Specification · Open-Source Verifier

MBX — Tamper-Evident Benchmark Results

One file. Every test result. One SHA-256 seal. An open-source verifier on GitHub that anyone can audit, build, and run — no AiBenchLab install required.

AiBenchLab-MBX · format version 2.0 · .mbx.json

What MBX is

MBX — short for ModelBench eXport — is AiBenchLab's portable benchmark export format. An MBX file contains a complete benchmark session: every test result, the hardware profile of the machine that ran it, a configuration snapshot, and an executive summary — bundled into a single JSON file with a SHA-256 content integrity hash.

The format solves one problem: a consultant or procurement team runs an AI model benchmark on machine A and needs to deliver the results to a decision-maker on machine B, with reasonable assurance that the data was not altered in transit.

MBX files use the .mbx.json extension and are valid JSON. They can be opened in any text editor, parsed by any JSON library, and verified by any implementation of SHA-256 — including the standalone open-source tool we publish on GitHub.

What's inside an MBX file

Every field a downstream tool, auditor, or future-you might need.

Session data

Every test result — prompts, outputs, scores, latency metrics (TTFT, TPOT, TPS, E2E), anomalies, and per-domain breakdowns.
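Under their common industry definitions (stated here as general definitions, not as how AiBenchLab computes them), the four latency abbreviations relate like this; the function name and timestamp layout are illustrative:

```python
# Common definitions of the latency abbreviations, assuming a request
# timestamp and per-token arrival timestamps in seconds.
# Illustrative sketch only -- not the MBX schema or AiBenchLab's code.
def latency_metrics(request_ts, token_ts):
    ttft = token_ts[0] - request_ts                  # Time To First Token
    e2e = token_ts[-1] - request_ts                  # End-to-End latency
    n = len(token_ts)
    tpot = (e2e - ttft) / (n - 1) if n > 1 else 0.0  # Time Per Output Token
    tps = n / e2e                                    # Tokens Per Second
    return {"ttft": ttft, "tpot": tpot, "tps": tps, "e2e": e2e}

m = latency_metrics(0.0, [0.25, 0.35, 0.45, 0.55])
```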

Hardware profile

Full CPU/GPU/memory/OS snapshot of the machine at export time. The ground truth for any reproducibility claim.

Hardware fingerprint

Privacy-safe hash of the hardware class. No serial numbers, no MAC addresses — just a deterministic class identifier.
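The fingerprint algorithm itself is not reproduced on this page, but the idea can be sketched: hash only coarse, non-identifying attributes, so the result is deterministic for a hardware class yet reveals no serials or MACs. Everything below (the function name, the attribute choice, the RAM bucketing) is a hypothetical illustration, not the actual MBX algorithm:

```python
import hashlib

# Hypothetical sketch of a privacy-safe hardware-class fingerprint:
# hash coarse attributes (model names, RAM rounded up to a power of
# two) -- never serial numbers or MAC addresses.
def hardware_fingerprint(cpu_model, gpu_model, ram_gb):
    ram_bucket = f"{2 ** (ram_gb - 1).bit_length()}GB"  # e.g. 36 -> "64GB"
    canonical = "|".join([cpu_model, gpu_model, ram_bucket])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

fp = hardware_fingerprint("Apple M3 Pro", "Apple M3 Pro GPU", 36)
```

Same class, same hash; no way to recover a specific machine from the digest.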

Config snapshot

Test suite ID, model IDs, sampling temperature, token limits — the exact parameters the run used.

Summary metadata

Executive summary fields — agentic readiness, responsiveness class, consistency rating, hardware efficiency, and "the one thing" takeaway.

Content hash

SHA-256 over a deterministic canonical representation of the entire package. The tamper seal.
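Putting the pieces together: the reference verifier on this page checks three top-level fields (format, format_version, content_hash). The remaining section names in this sketch are illustrative assumptions, not the published schema:

```json
{
  "format": "AiBenchLab-MBX",
  "format_version": "2.0",
  "content_hash": "sha256-hex-digest-of-canonical-form",
  "session": "per-test results, latency metrics, anomalies (illustrative name)",
  "hardware": "CPU/GPU/memory/OS snapshot (illustrative name)",
  "hardware_fingerprint": "privacy-safe class hash (illustrative name)",
  "config": "suite ID, model IDs, temperature, token limits (illustrative name)",
  "summary": "executive summary fields (illustrative name)"
}
```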

The aibenchlab-verify CLI — open source, zero trust in us required

The verifier is a standalone command-line tool, written from the spec as an independent implementation of the verification procedure. It doesn't share code with AiBenchLab — which is the whole point. If it verifies your MBX file, that proof doesn't depend on trusting us.

  • Dependencies: 3 (clap, serde_json, sha2). Nothing else.
  • Network / telemetry: none. Runs fully offline; no calls home.
  • Tests: 14 / 14 passing, independent of AiBenchLab's 470 library tests.

Usage

three modes
# Verify a single file
aibenchlab-verify session.mbx.json

# Full diagnostics
aibenchlab-verify --verbose session.mbx.json

# Verify every .mbx.json file in a directory
aibenchlab-verify --batch ./exports/

Example output

terminal
$ aibenchlab-verify session-2026-04-15.mbx.json

Format:           AiBenchLab-MBX  v2.0
App version:      0.7.1
Content hash:     a3f2b8c1...  (SHA-256)
Hardware class:   verified — no PII detected
Canonicalization: OK — keys preserved, floats normalized

VERIFIED — content hash matches.

Or roll your own — the algorithm is 20 lines

The hash methodology is documented in full — float normalization, canonical key ordering, UTF-8 encoding — so any language can reproduce the hash from the same data. Here's a reference verifier in pure-stdlib Python. Same approach works in JavaScript, Go, Rust, or anywhere you have a JSON parser that preserves key insertion order and a SHA-256 implementation.

verify_mbx.py
import json, hashlib

def normalize_floats(obj):
    """Recursively replace every JSON number with a fixed six-decimal
    string, so float formatting is identical across platforms."""
    if isinstance(obj, dict):
        return {k: normalize_floats(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [normalize_floats(v) for v in obj]
    if isinstance(obj, bool):
        return obj  # bool is a subclass of int in Python; leave it alone
    if isinstance(obj, (int, float)):
        return f"{float(obj):.6f}"
    return obj

def verify_mbx(path):
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)  # json.load preserves key insertion order
    assert data["format"] == "AiBenchLab-MBX"
    assert data["format_version"] == "2.0"
    stored = data["content_hash"]
    data["content_hash"] = ""  # the hash is computed with this field blank
    normalized = normalize_floats(data)
    canonical = json.dumps(normalized, separators=(",", ":"), ensure_ascii=False)
    computed = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    print("PASS" if computed == stored else "FAIL")
    return computed == stored

verify_mbx("benchmark.mbx.json")
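To sanity-check an implementation without a real export in hand, you can stamp a minimal package with a hash produced by the same canonicalization and confirm it round-trips. The payload fields below are illustrative, not the MBX schema:

```python
import json, hashlib

def canonical_hash(data):
    # Same recipe as the reference verifier: blank the hash field,
    # normalize numbers to six decimals, dump compactly, SHA-256.
    def norm(obj):
        if isinstance(obj, dict):
            return {k: norm(v) for k, v in obj.items()}
        if isinstance(obj, list):
            return [norm(v) for v in obj]
        if isinstance(obj, bool):
            return obj
        if isinstance(obj, (int, float)):
            return f"{float(obj):.6f}"
        return obj
    blanked = dict(data, content_hash="")  # keeps key insertion order
    canonical = json.dumps(norm(blanked), separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Build a minimal package (payload fields are illustrative), stamp it...
pkg = {"format": "AiBenchLab-MBX", "format_version": "2.0",
       "content_hash": "", "results": [{"score": 0.875}]}
pkg["content_hash"] = canonical_hash(pkg)

# ...then confirm the stored hash matches a fresh recomputation.
ok = canonical_hash(pkg) == pkg["content_hash"]
```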

What MBX does — and doesn't — guarantee

Does Provide
  • Tamper evidence — any edit after export invalidates the hash.
  • Self-contained verification — file + SHA-256 is all you need.
  • Cross-platform determinism — same data, same hash, any OS.
Does NOT Provide
  • Proof of origin — no cryptographic signature binding the file to a specific installation.
  • Proof of execution — only proves the file hasn't changed, not that the benchmark actually ran.
  • Protection against a malicious exporter fabricating results up front.
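The first guarantee, tamper evidence, is easy to demonstrate: a one-character edit to any field changes the recomputed digest, so the stored hash no longer matches. This condensed sketch sidesteps float normalization by using pre-normalized string values; the field names are illustrative:

```python
import json, hashlib

# Recompute the digest with the hash field blanked, as the spec requires.
def digest(data):
    blanked = dict(data, content_hash="")
    canonical = json.dumps(blanked, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

original = {"format": "AiBenchLab-MBX", "content_hash": "", "score": "0.875000"}
original["content_hash"] = digest(original)

tampered = dict(original, score="0.975000")  # a single-character edit
detected = digest(tampered) != original["content_hash"]
```

Note what this does not show: nothing stops someone from editing the field and re-stamping a fresh hash, which is exactly why MBX claims tamper evidence rather than proof of origin.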

We're explicit about these limits because overstating what a content hash can prove would undermine the point. MBX is a tamper-evidence seal, not chain-of-custody cryptographic proof.

Availability

MBX export is produced by AiBenchLab on the Pro, Consultant, and Enterprise tiers. The aibenchlab-verify tool and the MBX v2 specification are open source and free — because the whole point of tamper evidence is that anyone can audit it.