v1.0 · Public Documentation

When the answer has to be right.

Certainize turns multi-engine AI plus human expert verification into something you can build on: a live Certainty Score, independent model certification, and (new) human-verified ground-truth data for training and verifying specialized models. It is built for any team fine-tuning or evaluating a vertical model, frontier labs included. One API. Powered by ReallySolved™.

bash
# Quick start — fetch a brand's Resolution Score
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.certainize.ai/v1/brands/samsung

200 OK application/json
{
  "brand": "samsung",
  "resolution_score": 73,
  "open_truths": 3,
  "last_updated": "2026-05-24T14:22:00Z",
  "resolver_count": 22,
  "status": "active"
}

Everything you can do with Certainize

One capability — multi-engine AI answers, settled by human experts — powers four products. Enterprises verify and certify; support teams catch errors before customers do; AI labs buy the verified data to train on. Pick your path:

Available now

Verify & certify

Live Certainty Scores and independent model certification, expert-verified rather than news-scraped. Used by hedge funds and by legal, medical, market research, and brand safety teams.

See the Certainty Score →

Coming Soon

Support Supervisor

AI-powered customer support oversight. Monitors your support tickets for accuracy and escalates hallucinations to human Expert Solvers for correction.

Join the waitlist →
Coming Soon

Configure

Admin tooling for enterprise subscribers. Manage Expert Solver pools, set alert thresholds, configure webhook delivery, and audit usage.

Coming with institutional tier

Beta · Early access

Model Eval Data

A live model-integrity benchmark built on factual ground truth, not preference votes, for any team fine-tuning or evaluating a vertical model, frontier labs included. It provides a continuous feed of where leading models disagree, paired with the human-verified answer (ready-made evaluation, preference, and hallucination data), plus licensed vertical ground-truth corpora for domain fine-tuning.

Explore Model Eval Data →

The Certainty Score

One number, 0–100, that tells a person — or a system — how far to trust an AI answer. It's built from five weighted sub-scores, so you can act on AI output with a known confidence level instead of a guess.

  • Engine consensus
  • Source traceability
  • Expert validation
  • Temporal freshness
  • Conflict flags
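
The documentation names the five sub-scores but does not publish their weights or the exact formula. As a rough illustration only, the sketch below treats the Certainty Score as a weighted average of five 0–100 sub-scores. The equal weights, the field names, and the `certainty_score` function are all assumptions made for this example, not the platform's actual method.

```python
# Hypothetical weights: the docs say the sub-scores are weighted but do not
# publish the weights, so equal weighting is assumed here.
ASSUMED_WEIGHTS = {
    "engine_consensus": 1,
    "source_traceability": 1,
    "expert_validation": 1,
    "temporal_freshness": 1,
    "conflict_flags": 1,
}


def certainty_score(sub_scores, weights=ASSUMED_WEIGHTS):
    """Combine 0-100 sub-scores into one 0-100 score as a weighted average."""
    missing = set(weights) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in weights:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * sub_scores[name] for name in weights)
    return round(weighted_sum / total_weight)
```

With different weights passed in, the same function shows how a low sub-score (for example, stale sources) would pull the overall number down in proportion to its weight.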

Who we serve

FIN

Financial institutions

Hedge funds, PE, market research: settings where a single material AI error can cost $50K–$2.1M.

MED · LEG

Legal & medical

Domains where "probably right" isn't good enough and provenance is non-negotiable.

ENT

Enterprise brands

Live brand Resolution Scores plus support oversight — defensible and audit-ready.

AI LABS

AI labs

Independent verification to stake a reputation on — and human-verified evaluation data to train on.

Why verification pays for itself

Metric | Value
Global AI-error losses, 2024 | $67.4B
Enterprise error rate, unverified | 15–25%
Error rate with multi-engine + human verification | <3%
Annual AI-error mitigation cost / employee | $14,200

Certification. Independent third-party certification that a model meets Certainize accuracy thresholds, a trust signal no self-reported benchmark can replicate. Request a briefing →

Authentication

All API requests require an Authorization header with a Bearer token. API keys are issued when you subscribe via reallysolved.com/enterprise.

Authorization: Bearer ck_live_XXXXXXXXXXXX
Live vs. test keys. Keys prefixed ck_live_ make real API calls and count against your quota. Keys prefixed ck_test_ return fixture responses and do not count against your quota. Use test keys in your CI environment.
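
One way to keep live keys out of CI is to check the documented key prefix before building the header. The sketch below assumes nothing beyond the `ck_live_` and `ck_test_` prefixes described above; the helper names and the `CERTAINIZE_API_KEY` environment variable are hypothetical.

```python
import os


def key_mode(api_key):
    """Classify a key by its documented prefix: 'live' or 'test'."""
    if api_key.startswith("ck_live_"):
        return "live"
    if api_key.startswith("ck_test_"):
        return "test"
    raise ValueError("unrecognized API key prefix")


def auth_headers(api_key, allow_live=False):
    """Build the Authorization header, refusing live keys unless allowed."""
    if key_mode(api_key) == "live" and not allow_live:
        raise RuntimeError("live key used where only test keys are allowed")
    return {"Authorization": f"Bearer {api_key}"}


# CERTAINIZE_API_KEY is an assumed variable name, not one the API defines.
# headers = auth_headers(os.environ["CERTAINIZE_API_KEY"])
```

In a CI job, leaving `allow_live` at its default makes a misconfigured live key fail fast instead of consuming quota.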

curl

bash
curl -H "Authorization: Bearer ck_live_XXXXXXXXXXXX" \
  https://api.certainize.ai/v1/brands/openai

Python

python
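import requests

headers = {"Authorization": "Bearer ck_live_XXXXXXXXXXXX"}
r = requests.get(
    "https://api.certainize.ai/v1/brands/openai",
    headers=headers,
    timeout=10,  # avoid hanging indefinitely on network problems
)
r.raise_for_status()  # raise on 4xx/5xx (e.g. 401, 429) instead of parsing an error body
data = r.json()
print(data["resolution_score"])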

Node.js

javascript
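const res = await fetch("https://api.certainize.ai/v1/brands/openai", {
  headers: { "Authorization": "Bearer ck_live_XXXXXXXXXXXX" }
});
// Fail loudly on 4xx/5xx responses rather than reading an error body as data
if (!res.ok) throw new Error(`Request failed: ${res.status}`);
const data = await res.json();
console.log(data.resolution_score);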

Endpoints reference

Base URL: https://api.certainize.ai

Method | Path | Description
GET | /v1/brands/{brand} | Resolution Score for a brand
GET | /v1/brands/{brand}/truths | Active Truths against a brand
GET | /v1/score-feed | Live stream of all score changes
GET | /v1/resolvers/{handle} | Expert Solver public data
POST | /v1/embed/badge | Generate a badge programmatically
GET | /v1/diff-feed | Model-disagreement + human-verified resolution feed (Beta)
GET | /v1/corpora/{vertical} | Licensed vertical ground-truth corpus (Beta)

GET /v1/brands/{brand}

Returns the current Resolution Score and summary metadata for a brand. The brand path parameter is the brand's lowercase slug (e.g. samsung, openai, binance).

GET https://api.certainize.ai/v1/brands/{brand}
Parameter | In | Type | Description
brand (required) | path | string | Brand slug (lowercase). Use the brand's common name, e.g. samsung.

{
  "brand": "samsung",
  "display_name": "Samsung",
  "resolution_score": 73,
  "score_band": "moderate",
  "open_truths": 3,
  "resolved_truths": 41,
  "resolver_count": 22,
  "last_updated": "2026-05-24T14:22:00Z",
  "status": "active",
  "embed_badge_url": "https://reallysolved.com/api/embed/brand/samsung.svg"
}

Score bands

Band | Score range | Meaning
high | 80–100 | Strong Expert Solver consensus, few open Truths
moderate | 50–79 | Mixed signals; review open Truths for detail
low | 20–49 | Significant unresolved claims
critical | 0–19 | Active dispute, high open Truth count
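
If you need to derive the band client-side (for example, when you only store the numeric score), the ranges in the table above translate directly into a small lookup. The `score_band` function name is ours; the thresholds come from the table.

```python
def score_band(score):
    """Map a 0-100 Resolution Score to its documented band name."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "high"
    if score >= 50:
        return "moderate"
    if score >= 20:
        return "low"
    return "critical"
```

For the sample response above, `score_band(73)` returns `"moderate"`, matching its `score_band` field.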

GET /v1/brands/{brand}/truths

Returns the list of active Truths filed against a brand. Each Truth represents a specific factual claim that Expert Solvers are voting on.

GET https://api.certainize.ai/v1/brands/{brand}/truths
Parameter | In | Type | Description
brand (required) | path | string | Brand slug
status | query | string | Filter by status: open, resolved, or all. Default: open
limit | query | integer | Max results per page. Default: 20, max: 100
cursor | query | string | Pagination cursor from previous response

{
  "brand": "openai",
  "truths": [
    {
      "id": "truth_8kxp3n",
      "claim": "OpenAI's GPT-4 safety evals were rushed before launch",
      "status": "open",
      "resolver_votes": 14,
      "consensus_pct": 71,
      "filed_at": "2026-05-10T09:14:00Z"
    }
  ],
  "next_cursor": "eyJpZCI6Ijg...",
  "total": 7
}
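
The `cursor` parameter and the `next_cursor` field suggest a standard cursor-pagination loop. The sketch below uses only the standard library. It assumes that the last page omits `next_cursor` or sets it to null, which the documentation does not state, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.certainize.ai"


def fetch_truths_page(brand, api_key, status="open", limit=20, cursor=None):
    """Fetch one page of Truths for a brand and return the decoded JSON."""
    params = {"status": status, "limit": limit}
    if cursor:
        params["cursor"] = cursor
    path = f"/v1/brands/{urllib.parse.quote(brand, safe='')}/truths"
    url = f"{BASE_URL}{path}?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def iter_truths(brand, api_key, fetch_page=fetch_truths_page, **filters):
    """Yield every Truth, following next_cursor from page to page."""
    cursor = None
    while True:
        page = fetch_page(brand, api_key, cursor=cursor, **filters)
        yield from page.get("truths", [])
        cursor = page.get("next_cursor")
        if not cursor:  # assumption: the final page has no next_cursor
            break
```

Passing `fetch_page` as a parameter keeps the loop testable with canned pages, for example `list(iter_truths("openai", key, status="all"))` against a `ck_test_` key.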

GET /v1/score-feed

Returns a chronological list of recent score change events across all brands. Use this endpoint to build monitoring dashboards or trigger alerts on score movements.

GET https://api.certainize.ai/v1/score-feed
Parameter | In | Type | Description
since | query | ISO 8601 string | Return events after this timestamp. Default: last 24 hours.
brands | query | comma-separated string | Filter to specific brand slugs, e.g. samsung,openai
min_delta | query | integer | Minimum absolute score change to include. Default: 1
limit | query | integer | Max results. Default: 50, max: 500

{
  "events": [
    {
      "brand": "binance",
      "score_before": 58,
      "score_after": 51,
      "delta": -7,
      "trigger": "truth_resolved",
      "truth_id": "truth_4mzq9w",
      "timestamp": "2026-05-24T13:05:00Z"
    }
  ],
  "total": 1
}
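
For the monitoring and alerting use case mentioned above, a client typically builds the query string from the documented parameters and then filters the returned events. This is a minimal sketch. The helper names are hypothetical, and the drop threshold is whatever your alerting policy requires.

```python
from urllib.parse import urlencode

SCORE_FEED_URL = "https://api.certainize.ai/v1/score-feed"


def score_feed_url(since=None, brands=None, min_delta=None, limit=None):
    """Build a score-feed URL from the documented query parameters."""
    params = {}
    if since:
        params["since"] = since  # ISO 8601 timestamp string
    if brands:
        params["brands"] = ",".join(brands)  # comma-separated brand slugs
    if min_delta is not None:
        params["min_delta"] = min_delta
    if limit is not None:
        params["limit"] = limit
    query = urlencode(params)
    return f"{SCORE_FEED_URL}?{query}" if query else SCORE_FEED_URL


def score_drops(events, threshold):
    """Return events whose score fell by at least `threshold` points."""
    return [event for event in events if event["delta"] <= -threshold]
```

Note that `min_delta` filters on the absolute change server-side, while `score_drops` keeps only negative moves, which is usually what an alert cares about.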

GET /v1/resolvers/{handle}

Returns public profile data for an Expert Solver — a human expert who votes on Truths. The handle is the Expert Solver's public username on reallysolved.com.

GET https://api.certainize.ai/v1/resolvers/{handle}
Parameter | In | Type | Description
handle (required) | path | string | Expert Solver's public handle, e.g. mkbhd

{
  "handle": "mkbhd",
  "tier": "Senior Solver",
  "riq": 8420,
  "specializations": ["consumer-tech", "ev"],
  "truths_resolved": 312,
  "accuracy_rate": 0.94,
  "badge_url": "https://reallysolved.com/api/embed/resolver/mkbhd.svg",
  "profile_url": "https://reallysolved.com/r/mkbhd"
}

POST /v1/embed/badge

Generates a badge configuration for programmatic embedding of Resolution Scores into your own UI.

POST https://api.certainize.ai/v1/embed/badge
Field | Type | Description
brand (required) | string | Brand slug
style | string | flat, compact, or full. Default: flat
theme | string | dark or light. Default: dark

{
  "brand": "samsung",
  "embed_url": "https://reallysolved.com/api/embed/brand/samsung.svg?style=flat&theme=dark",
  "resolution_score": 73,
  "expires_at": "2026-05-24T15:22:00Z"
}
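
A request for this endpoint might be assembled as shown below. The documentation lists the fields but does not say how they are encoded, so sending them as a JSON body with a `Content-Type: application/json` header is an assumption. The `badge_request` helper is hypothetical.

```python
import json
import urllib.request

BADGE_STYLES = {"flat", "compact", "full"}
BADGE_THEMES = {"dark", "light"}


def badge_request(brand, api_key, style="flat", theme="dark"):
    """Build (but do not send) a POST request for /v1/embed/badge."""
    if style not in BADGE_STYLES:
        raise ValueError(f"style must be one of {sorted(BADGE_STYLES)}")
    if theme not in BADGE_THEMES:
        raise ValueError(f"theme must be one of {sorted(BADGE_THEMES)}")
    # Assumption: the fields are sent as a JSON object in the request body.
    body = json.dumps({"brand": brand, "style": style, "theme": theme})
    return urllib.request.Request(
        "https://api.certainize.ai/v1/embed/badge",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=10)` and decode the JSON response. Since the sample response includes an `expires_at` field, you may want to request a fresh configuration after that time.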

Badge embed endpoints (hosted on reallysolved.com)

These endpoints are hosted on reallysolved.com. Do not proxy or reimplement them. Reference them directly in your HTML or embed code.

Method | URL | Description
GET | https://reallysolved.com/api/embed/resolver/{handle} | Expert Solver profile badge (SVG)
GET | https://reallysolved.com/api/embed/brand/{brandname} | Brand Resolution Score badge (SVG)
GET | https://reallysolved.com/api/embed/score/{topic-id} | Topic score badge (SVG)
GET | https://reallysolved.com/api/embed/*.svg | Generic SVG badge retrieval
POST | https://reallysolved.com/api/embed/track | Track badge impression event

HTML embed example

html
<img src="https://reallysolved.com/api/embed/resolver/mkbhd.svg"
     alt="Expert Solver Badge: mkbhd · Senior Solver">
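
If you render badges from a template or backend, the same tag can be generated with proper escaping. This sketch follows the `.svg` URL shape used in the HTML example above. The `resolver_badge_img` helper is hypothetical.

```python
from html import escape
from urllib.parse import quote


def resolver_badge_img(handle, tier):
    """Return an <img> tag for an Expert Solver badge, safely escaped."""
    # URL shape copied from the HTML example; the handle is percent-encoded.
    src = f"https://reallysolved.com/api/embed/resolver/{quote(handle, safe='')}.svg"
    alt = f"Expert Solver Badge: {handle} · {tier}"
    return f'<img src="{escape(src)}" alt="{escape(alt)}">'
```

The tag still points directly at reallysolved.com, so it respects the rule above about not proxying the badge endpoints.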

Model Eval Data — the live Model-Integrity benchmark

A continuously-updated, human-verified benchmark built on factual ground truth — not preference votes or "vibes." Every time leading models disagree, our human Expert Solvers verify the correct answer, leaving a running record of where today's models get it wrong, and what's actually right.

Public preference-leaderboards rank models on which answer feels better. This ranks them on what's verifiably true, case by case, refreshed as the models change — a live integrity signal for teams that ship in domains where a confident hallucination is a liability, not a vibe.

Beta: we are shaping this with early research partners. We intend to license anonymized, aggregated data to qualified AI-research partners under agreement; this is not a public dataset. Talk to us to help define it. Powered by ReallySolved™.

If your team trains, fine-tunes, or evaluates a specialized model, whether a vertical SLM or a frontier model, this is the ground truth you need to train and verify it. It is the same kind of data you currently pay human-labeling vendors to produce, except that ours is generated organically by real usage, pre-filtered to the hard cases (genuine model disagreement), and verified by domain experts. We call it "Corrections & Results." In your stack it shows up as:

  • Evaluation sets — fresh, contamination-resistant, and naturally hard (every item is a real disagreement between leading models).
  • Preference / RLHF pairs — verified-correct vs. divergent-incorrect model outputs.
  • Hard-negative & error-mode mining — where a specific model fails, sliced by topic.
  • Hallucination / factuality labels — asserted-but-resolved-false instances.

Because we run the same prompt across several leading models, the data is cross-model comparative — a signal no single lab can generate from its own traffic alone.

GET /v1/diff-feed Beta

A chronological stream of disagreement events: a prompt, each model's stance, and the human-verified resolution. Content is anonymized; any named-person facts appear only as short attributed quotes. We intend to make it available to approved research partners under agreement.

GET https://api.certainize.ai/v1/diff-feed
Parameter | In | Type | Description
since | query | ISO 8601 string | Return events resolved after this timestamp. Default: last 24 hours.
vertical | query | string | Filter to a domain, e.g. health, law, finance
min_models_disagreeing | query | integer | Only include items where at least N models diverged. Default: 2
limit | query | integer | Max results. Default: 50, max: 500

{
  "events": [
    {
      "diff_id": "diff_9f2c7a",
      "prompt": "Is creatine monohydrate safe for adolescent athletes?",
      "vertical": "health",
      "model_answers": [
        { "model": "model-a", "stance": "not recommended" },
        { "model": "model-b", "stance": "safe at standard doses" },
        { "model": "model-c", "stance": "insufficient evidence" }
      ],
      "disagreement": true,
      "resolution": {
        "verdict": "Safe at standard doses for most adolescents; advise physician consult.",
        "verified_by": "expert_consensus",
        "resolver_count": 4,
        "confidence": 0.86,
        "citations": 2
      },
      "resolved_at": "2026-05-30T17:11:00Z"
    }
  ],
  "total": 1
}
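
To show how feed events could become evaluation-set items, the sketch below keeps each prompt with its human-verified verdict as the reference answer and records every model's stance alongside it. The confidence and resolver-count thresholds are hypothetical filters chosen for illustration, and the output field names are ours.

```python
def to_eval_items(events, min_confidence=0.8, min_resolvers=3):
    """Turn diff-feed events into evaluation items with verified references."""
    items = []
    for event in events:
        resolution = event.get("resolution") or {}
        if not event.get("disagreement"):
            continue  # keep only genuine model disagreements
        if resolution.get("confidence", 0) < min_confidence:
            continue  # hypothetical quality threshold
        if resolution.get("resolver_count", 0) < min_resolvers:
            continue  # hypothetical minimum number of Expert Solvers
        items.append({
            "id": event["diff_id"],
            "vertical": event.get("vertical"),
            "prompt": event["prompt"],
            "reference": resolution["verdict"],
            "model_stances": {
                answer["model"]: answer["stance"]
                for answer in event.get("model_answers", [])
            },
        })
    return items
```

The response carries no field marking which model stance matches the verdict, so building preference pairs would need an extra grading step that this sketch leaves out.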

Vertical ground-truth corpora Beta

License a domain-specific corpus of human-verified resolutions: the fuel for fine-tuning or evaluating a specialized model in a field where accuracy and provenance matter (health, law, finance). Specialized models live or die on trustworthiness; this corpus is how you train one to be reliable and then prove it, with low hallucination rates and verifiable answers. Every record carries its verification trail and citations.

Two ways to work with us:

  • License the corpus: a de-identified, continuously-updated, provenance-tracked dataset for your vertical.
  • Co-build the model: we supply and keep refreshing the ground truth; you bring the training. Talk to us about exclusivity.

Corpora are scoped per vertical and provided under a data-license agreement. Request early access →

Webhooks

Subscribe to real-time events by registering a webhook URL in your account settings at reallysolved.com/enterprise. Events are sent as HTTP POST requests with a JSON body.

Each webhook delivery includes an X-Certainize-Signature header (HMAC-SHA256 of the request body using your webhook secret) so you can verify authenticity.

silence_clock.fired: fires when a brand's grace period expires
{
  "event": "silence_clock.fired",
  "timestamp": "2026-05-24T16:00:00Z",
  "data": {
    "brand": "binance",
    "grace_period_started": "2026-05-17T16:00:00Z",
    "open_truths": 5,
    "resolution_score": 44,
    "action_required": true
  }
}

score.changed: fires when a Resolution Score changes by more than the configured threshold
{
  "event": "score.changed",
  "timestamp": "2026-05-24T13:05:00Z",
  "data": {
    "brand": "binance",
    "score_before": 58,
    "score_after": 51,
    "delta": -7,
    "trigger": "truth_resolved",
    "truth_id": "truth_4mzq9w"
  }
}
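
The X-Certainize-Signature check described in this section can be sketched as follows. The documentation specifies HMAC-SHA256 over the request body but not the encoding of the header value, so this example assumes a lowercase hex digest with no prefix. The function name is hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret, body, signature_header):
    """Return True if the header matches HMAC-SHA256(secret, raw body).

    `body` must be the raw request bytes, before any JSON parsing.
    Assumption: the header carries a plain hex digest.
    """
    if not signature_header:
        return False
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.encode("utf-8")
    )
```

Verify against the exact bytes received, and only then parse the JSON and branch on the `event` field (`silence_clock.fired` or `score.changed`).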

Rate limits

Rate limits are enforced per API key. Exceeding the limit returns HTTP 429 Too Many Requests with a Retry-After header.

Tier | Requests / minute | Requests / month
Starter ($2,500/mo) | 60 | 1,000,000
Growth ($8,500/mo) | 300 | 10,000,000
Institutional ($25,000/mo) | Unlimited | Unlimited

For the full rate card and custom volume pricing, see reallysolved.com/enterprise.
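
A client can honor the 429 and Retry-After behavior described above with a small retry loop. The sketch below accepts either form of Retry-After that HTTP allows (delay in seconds or an HTTP date). The `send` callable, its `(status, headers, body)` return shape, and the default attempt count are assumptions made for illustration.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0):
    """Parse a Retry-After header given as delay-seconds or an HTTP date."""
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        sleep(retry_after_seconds(headers.get("Retry-After")))
```

Injecting `sleep` makes the loop easy to test without waiting. In production the default `time.sleep` is used.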

Content scope

API responses cover claims about brands, products, services, and AI systems. Expert Solver votes and Truth filings cover commercially and technically verifiable claims.

The following are not included per platform policy: political opinions, election claims, government policy debate, classified or insider information.

See reallysolved.com/policies/terms for the complete content policy.

Changelog

v1.0.0
2026-05-24
Initial public documentation release. Verification API endpoints documented: /v1/brands/{brand}, /v1/brands/{brand}/truths, /v1/score-feed, /v1/resolvers/{handle}, /v1/embed/badge. Webhooks, rate limits, and content scope documented.

Status

Status page coming soon.

To subscribe to status updates and incident notifications, email status@certainize.ai.