Wrap your MCP server — verifying-proxy quickstart
You already operate an MCP server that answers questions from your own sources. The Restormel verifying proxy sits in front of it, calls your tool, and checks every claim in
the answer against the sources your server cited — returning a VerifiedEnvelope in which each claim is marked supported, unverified, or abstain. You bring the server; Restormel does the verification.
This guide gets you from an existing server to a printed envelope without writing any code.
Run pnpm exec tsx scripts/reviews/verifying-proxy-reference.ts and you will see two verified envelopes print. Everything below explains how to point the same runner at your server.
What's available today
The proxy core is live on main: the MCP client leg, the
verification engine, and a reproducible reference runner you can point at any Mode-1 server right
now. The hosted multi-tenant route (a registered /mcp endpoint with
OAuth) is Wave 2 and is noted as "coming" below. Until it ships, the
integration path is the reference runner described here.
Prerequisites
- An MCP server you control that exposes at least one tool returning a Mode-1 answer (see the contract below). If your server is a GraphRAG-style retriever that returns a synthesised answer plus the passages it drew from, it already qualifies.
- A checkout of the Restormel repo with workspace dependencies installed (pnpm install). The reference runner is a repo script, not a published binary, while the hosted route is in Wave 2.
- Node.js ≥ 18 and pnpm available in your shell.
- Optional: a validator API key for a real entailment judge (for example OPENAI_API_KEY). The bundled stub validator needs no key and is fine for a first run.
Step 1 — make your server speak the Mode-1 contract
The verifying proxy is a Mode-1 proxy. Your MCP tool must return a JSON object
with this shape, serialised as a single text content block:

    {
      "answer": "A synthesised answer to the query.",
      "claims": [
        "Claim one, as a discrete, checkable sentence.",
        "Claim two."
      ],
      "sources": [
        {
          "id": "source-1",
          "text": "The full verbatim passage that grounded this answer.",
          "uri": "https://example.com/doc#section"
        }
      ]
    }

| Field | Required | Notes |
|---|---|---|
| answer | Yes* | The synthesised answer. Used as a single implicit claim when claims is absent. |
| claims | Recommended | Explicit decomposed claims. Each is verified independently. Either answer or a non-empty claims must be present. |
| sources[].id | Yes | Stable identifier for this source within the response. |
| sources[].text | Yes | The verbatim passage the claim is grounded against. The proxy binds claim spans into this text and hashes it. |
| sources[].uri | Optional | A URL for the original document. Carried into the envelope for provenance display. |

The JSON must arrive in the text field of a content[0] block with "type": "text". A tool that returns structured or binary content instead of a text block is out of scope for v1 and routes every claim to review; it is never silently coerced.

A working reference fixture lives at packages/mcp/src/proxy/fixtures/mode1-upstream.ts. It exposes exactly this shape over a small bundled corpus under the tool name graph_answer (the exported MODE1_TOOL_NAME). Use it as a template for your own tool's output.
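
As a rough illustration of the contract above, the sketch below serialises a Mode-1 payload into a single text content block and then reads the claims back out of it. All type and function names here (Mode1Answer, toToolResult, claimsToVerify) are hypothetical and are not taken from the Restormel codebase. The sketch throws on a malformed result for brevity, whereas the proxy is described as routing such cases to review.

```typescript
// Hypothetical names throughout; the real parser lives in packages/mcp/src/proxy/client.ts.
interface Mode1Source {
  id: string;
  text: string;
  uri?: string;
}

interface Mode1Answer {
  answer?: string;
  claims?: string[];
  sources: Mode1Source[];
}

interface ContentBlock {
  type: string;
  text?: string;
}

// Serialise a Mode-1 answer as one text content block.
function toToolResult(payload: Mode1Answer): { content: ContentBlock[] } {
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}

// Read content[0], check the documented fields, and return the claims to verify.
function claimsToVerify(result: { content: ContentBlock[] }): string[] {
  const block = result.content[0];
  if (!block || block.type !== "text" || typeof block.text !== "string") {
    throw new Error("content[0] is not a text block");
  }
  const parsed = JSON.parse(block.text) as Partial<Mode1Answer>;
  for (const source of parsed.sources ?? []) {
    if (typeof source.id !== "string" || typeof source.text !== "string") {
      throw new Error("each source needs a string id and text");
    }
  }
  // Explicit claims win; otherwise the answer is a single implicit claim.
  if (Array.isArray(parsed.claims) && parsed.claims.length > 0) {
    return parsed.claims;
  }
  if (typeof parsed.answer === "string" && parsed.answer.length > 0) {
    return [parsed.answer];
  }
  throw new Error("either answer or a non-empty claims must be present");
}
```

A tool handler could return toToolResult(payload) directly, and a local test could pass that result to claimsToVerify to confirm the payload meets the field requirements in the table.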
Step 2 — run the bundled fixture (no server, no keys)
The reference runner at scripts/reviews/verifying-proxy-reference.ts is the reproducible entry point. With no arguments it links the proxy client to the bundled fixture
over the MCP SDK's in-memory transport and runs two queries through the full verification pipeline:

    pnpm exec tsx scripts/reviews/verifying-proxy-reference.ts

No credentials are required: the stub validator is deterministic. Use this to confirm your checkout works before pointing at a real server. Expected output is two envelopes: one SUPPORTED claim and one planted, non-entailed claim per query.
Step 3 — point the proxy at your own server
Pass --upstream to call your server instead of the fixture. Two
transports are supported.
Over stdio (subprocess)
If your server speaks MCP over stdio, the runner spawns it as a subprocess:

    pnpm exec tsx scripts/reviews/verifying-proxy-reference.ts \
      --upstream stdio:node ./path/to/your-server.js \
      --validator openai:gpt-4o-mini

The form is --upstream stdio:<command> [args…]: the command follows the stdio: prefix and every following non-flag token is passed as an argument. This maps to connectUpstreamStdio in packages/mcp/src/proxy/client.ts. (The bundled fixture works here too: --upstream stdio:tsx packages/mcp/src/proxy/fixtures/mode1-upstream.ts.)
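
To make the argument form concrete, here is a minimal sketch of how an --upstream value might be split into a transport, a command, and its arguments. The function name parseUpstream and the Upstream type are hypothetical, and the choice to stop collecting arguments at the first token that starts with -- is an assumption about what "non-flag token" means, not the runner's confirmed behaviour.

```typescript
// Hypothetical parser; the real runner's argument handling may differ.
type Upstream =
  | { kind: "fixture" }
  | { kind: "stdio"; command: string; args: string[] }
  | { kind: "http"; url: URL };

function parseUpstream(argv: string[]): Upstream {
  const index = argv.indexOf("--upstream");
  if (index === -1) {
    // No flag: fall back to the bundled fixture.
    return { kind: "fixture" };
  }
  const value = argv[index + 1];
  if (value === undefined) {
    throw new Error("--upstream needs a value");
  }
  if (value.startsWith("stdio:")) {
    const command = value.slice("stdio:".length);
    if (command === "") {
      throw new Error("stdio: needs a command");
    }
    const args: string[] = [];
    for (const token of argv.slice(index + 2)) {
      // Assumption: the first flag ends the argument list.
      if (token.startsWith("--")) break;
      args.push(token);
    }
    return { kind: "stdio", command, args };
  }
  // Anything else is treated as a URL for the HTTP transport.
  return { kind: "http", url: new URL(value) };
}
```

With the command line shown earlier in this section, this sketch yields the command node with the single argument ./path/to/your-server.js, leaving --validator for separate handling.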
Over StreamableHTTP (remote)
If your server is reachable over HTTP, start it and pass its URL:

    pnpm exec tsx scripts/reviews/verifying-proxy-reference.ts \
      --upstream https://your-mcp-host.example.com/mcp \
      --validator openai:gpt-4o-mini

The URL is validated against an SSRF block-list before any connection is opened: private IPs, loopback (outside dev), link-local, cloud metadata (169.254.169.254), and .internal hostnames are rejected, and production requires https/wss. To exercise the HTTP path locally, the repo ships a reference server:

    # terminal 1: start the local Mode-1 HTTP server (listens on :3741)
    pnpm exec tsx packages/mcp/src/proxy/fixtures/mode1-http-server.ts

    # terminal 2: point the runner at it
    pnpm exec tsx scripts/reviews/verifying-proxy-reference.ts \
      --upstream http://localhost:3741/mcp \
      --validator openai:gpt-4o-mini

Choose a validator independent of your answer author
--validator <family>:<model> selects the cross-model
entailment judge. Its family must differ from the family of the model that wrote
your upstream answers — a model judging its own output is not a faithfulness check. If the proxy
cannot guarantee independence, it fails closed: every claim abstains and routes to
review. The runner reads the key from the environment:
| Family | Example | Key env var |
|---|---|---|
| openai | openai:gpt-4o-mini | OPENAI_API_KEY |
| anthropic | anthropic:claude-haiku-4-5-20251001 | ANTHROPIC_API_KEY |
| together | together:meta-llama/Llama-3.3-70B-Instruct-Turbo | TOGETHER_API_KEY |
| google | google:gemini-2.0-flash | GOOGLE_API_KEY or GEMINI_API_KEY |

Omitting --validator uses the deterministic stub — useful only for
smoke-testing the pipeline, not for a real faithfulness verdict. Add --k <n> to draw n self-consistency samples per
entailment check.
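
The independence rule and the key lookup can be sketched as follows. This is an illustration only: parseValidator, isIndependent, and resolveKey are hypothetical names, and how the proxy actually learns the family of your answer author is not described here, so the sketch simply takes it as an optional argument and fails closed when it is missing.

```typescript
// Hypothetical helpers; env var names are taken from the table above.
const KEY_ENV_VARS: Record<string, string[]> = {
  openai: ["OPENAI_API_KEY"],
  anthropic: ["ANTHROPIC_API_KEY"],
  together: ["TOGETHER_API_KEY"],
  google: ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
};

interface ValidatorSpec {
  family: string;
  model: string;
}

// Split "<family>:<model>" on the first colon only, since model ids may contain "/" or ":".
function parseValidator(spec: string): ValidatorSpec {
  const sep = spec.indexOf(":");
  if (sep <= 0 || sep === spec.length - 1) {
    throw new Error(`expected <family>:<model>, got "${spec}"`);
  }
  return { family: spec.slice(0, sep).toLowerCase(), model: spec.slice(sep + 1) };
}

// Fail closed: an unknown author family cannot be proven independent.
function isIndependent(validator: ValidatorSpec, authorFamily?: string): boolean {
  if (!authorFamily) return false;
  return validator.family !== authorFamily.toLowerCase();
}

// Return the first non-empty key for the family, or undefined if none is set.
function resolveKey(
  family: string,
  env: Record<string, string | undefined>,
): string | undefined {
  for (const name of KEY_ENV_VARS[family] ?? []) {
    const value = env[name];
    if (value) return value;
  }
  return undefined;
}
```

In a harness of your own, a false result from isIndependent would correspond to the fail-closed path described above, where every claim abstains and routes to review.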
Step 4 — read the verified envelope
Each query produces one VerifiedEnvelope. The runner prints a block per claim:

    ── Query: Who built the first Eddystone lighthouse and when?
       legs_ms: callTool=12 quote_retrieval=820 judge_entailment=1210 layer1_bind=2
       validator=openai:gpt-4o-mini restormel_cost={calls:4, chars:3820}
       [SUPPORTED ] entailed bound(exact) hash=a3f8c1d2e5b7…
         claim: The first lighthouse on the Eddystone Rocks was completed in 1698 by Henry Winstanley.
       [ABSTAIN ] not_entailed no_evidence
         claim: Henry Winstanley's lighthouse still stands on the Eddystone Rocks today.

The VerifiedEnvelope schema
The envelope is { claims: EnvelopeClaim[]; meta: EnvelopeMeta }.
The canonical types are in packages/connect-core/src/proxy/types.ts. Each claim is an EnvelopeClaim:
| Field | Type | Meaning |
|---|---|---|
| claim | string | The text of the claim being verified. |
| status | "supported" \| "unverified" \| "abstain" | The fail-safe outcome; see the status table. |
| binding.status | "bound" \| "unbound" \| "no_evidence" | Layer-1 result: bound when a verbatim span was located in a cited source; otherwise unbound (quote_not_found) or no_evidence (extractor_returned_no_quote). |
| binding.span.quote | string | When bound: the verbatim quote located in the source text. |
| binding.span.start / .end | number | [start, end) character offsets of the span into the original source text. |
| binding.span.match | "exact" \| "normalized" \| "fuzzy" | How strictly the quote matched. Anything looser than exact is labelled, never hidden. |
| entailment.verdict | "entailed" \| "not_entailed" \| "abstain" | Layer-2 cross-model entailment of the claim against its bound span. |
| entailment.confidence | number \| null | Validator confidence in [0,1], or null on the abstain path. |
| entailment.note | string? | Optional reason on a fail-safe abstain (e.g. coverage_gap: no verdict returned). |
| source_ref.id | string | The id of the cited source the span was bound against (or (none) when nothing was cited). |
| source_ref.uri | string? | The source URI your server supplied, if any. |
| source_ref.source_hash | string | SHA-256 of the source text at verification time. Reference-by-hash: no source bytes are stored. |

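
Because the envelope references sources by hash and by character offsets, a bound claim can be re-checked against the source text you hold. The sketch below shows one way to do that. It assumes the hash is hex-encoded SHA-256 over the UTF-8 source text and that start and end are JavaScript string offsets; neither detail is confirmed here, and BoundClaim and auditBoundClaim are hypothetical names rather than the canonical types.

```typescript
import { createHash } from "node:crypto";

// Hypothetical subset of a bound EnvelopeClaim, limited to the fields used below.
interface BoundClaim {
  claim: string;
  binding: {
    status: "bound";
    span: { quote: string; start: number; end: number; match: "exact" | "normalized" | "fuzzy" };
  };
  source_ref: { id: string; uri?: string; source_hash: string };
}

// Assumption: the proxy hashes the UTF-8 source text and hex-encodes the digest.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Compare the envelope's hash and span against the source text you hold.
function auditBoundClaim(
  entry: BoundClaim,
  sourceText: string,
): { hashMatches: boolean; quoteMatches: boolean } {
  const { start, end, quote } = entry.binding.span;
  return {
    // False when the source has changed since verification.
    hashMatches: sha256Hex(sourceText) === entry.source_ref.source_hash,
    // [start, end) should slice out exactly the recorded quote.
    quoteMatches: sourceText.slice(start, end) === quote,
  };
}
```

If hashMatches is false, the source has changed since the envelope was produced and the span offsets should not be trusted.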
The envelope's meta (an EnvelopeMeta) carries run-level attribution:
| Field | Type | Meaning |
|---|---|---|
| validator_model | string \| null | Resolved validator model id, e.g. openai:gpt-4o-mini; null on the stub / fail-closed path. |
| judged_at | string | ISO 8601 timestamp the envelope was produced. |
| legs_ms | Record<string, number> | Per-leg latency (quote_retrieval, judge_entailment, layer1_bind); callTool is folded in by the runner. See latency and cost. |
| restormel_cost | { calls: number; chars: number } | The proxy's own validator spend, not your upstream's model spend. |

The status table (fail-safe)
The only path to supported is a bound span whose entailment verdict is entailed at or above the low-confidence floor:
| Binding | Entailment | Status | Meaning |
|---|---|---|---|
| bound | entailed (confidence ≥ floor) | supported | Claim is grounded in a verbatim source span and entailed by it. |
| bound | not_entailed | unverified | A span was found but the claim is not entailed by it. |
| Anything else | Any | abstain | No span, validator error, timeout, low-confidence, or missing verdict. |

abstain is the fail-safe outcome. An error, a timeout, or a missing
verdict is never mapped to supported. A
low-confidence entailed verdict (below the EBV floor) also routes to abstain, not supported. Claims that
abstain or are unverified go to review — they are not silently passed through.
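
The status table reduces to a small decision function. The sketch below is an illustration under stated assumptions, not the engine's code: deriveStatus is a hypothetical name, and the confidence floor is passed in as a parameter because its actual value is not given here.

```typescript
type BindingStatus = "bound" | "unbound" | "no_evidence";
type Verdict = "entailed" | "not_entailed" | "abstain";
type ClaimStatus = "supported" | "unverified" | "abstain";

// Hypothetical helper mirroring the status table; `floor` is a placeholder value.
function deriveStatus(
  binding: BindingStatus,
  verdict: Verdict | undefined,
  confidence: number | null,
  floor: number,
): ClaimStatus {
  // Without a bound span nothing can be supported or refuted.
  if (binding !== "bound") return "abstain";
  if (verdict === "not_entailed") return "unverified";
  // The only path to "supported": bound, entailed, and at or above the floor.
  if (verdict === "entailed" && confidence !== null && confidence >= floor) {
    return "supported";
  }
  // Errors, timeouts, missing verdicts, and low confidence all land here.
  return "abstain";
}
```

Every branch that is not explicitly handled falls through to abstain, which is what makes the mapping fail-safe: adding a new error case cannot accidentally produce supported.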
Latency and cost
Verification adds two validator round-trips over a bare upstream call — quote retrieval and entailment — and these dominate the added latency regardless of how fast your upstream is. The runner reports four legs:
| Leg | What it measures |
|---|---|
| callTool | Proxy client → your upstream MCP server (your server's round-trip). |
| quote_retrieval | Validator call to retrieve verbatim candidate quotes from the cited source text. Zero when your server already supplies quotes. |
| judge_entailment | Validator call to judge entailment of each claim against its bound span. |
| layer1_bind | Layer-1 deterministic bind/hash (string operations, effectively free). |

Measured targets are placeholders to be earned, not guarantees (REC-PLAN-007):
roughly p50 ≈ 1.5 s / p95 ≈ 4 s added overhead with a small fast validator. Run the reference
runner against your own server and validator to get numbers for your setup. The restormel_cost field is the proxy's own validator spend (zero in
stub mode; typically a few thousand characters per query with a real validator at temperature 0).
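
To turn per-run legs into numbers you can compare against those targets, a small helper along these lines may be enough. It is a sketch: addedOverheadMs and percentile are hypothetical names, the legs are assumed to run sequentially so that they can be summed, and the percentile uses the simple nearest-rank method.

```typescript
// Split one run's legs into your upstream's time and the proxy's added time.
function addedOverheadMs(legs: Record<string, number>): {
  upstream: number;
  added: number;
  total: number;
} {
  const upstream = legs["callTool"] ?? 0;
  let total = 0;
  for (const ms of Object.values(legs)) {
    total += ms;
  }
  return { upstream, added: total - upstream, total };
}

// Nearest-rank percentile over a list of samples (p in 0..100).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("no samples");
  }
  const sorted = [...samples].sort((x, y) => x - y);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}
```

For the sample run shown under Step 4 (callTool=12, quote_retrieval=820, judge_entailment=1210, layer1_bind=2), this gives 2032 ms of added overhead out of 2044 ms in total. Collecting the added value across many runs and passing the list to percentile gives p50 and p95 figures for your own setup.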
Ways to reduce the overhead: cache verdicts keyed on (claim, span, source_hash, validator); raise the abstention threshold to skip low-stakes claims; use a fast small validator for quote retrieval. These are optimisations, so measure first.
Hosted multi-tenant proxy (coming, Wave 2)
The hosted /mcp route — where you register your upstream endpoint
and Restormel proxies it over OAuth 2.1 / PKCE with per-tenant isolation — is Wave 2 (Phase C) and not yet available. It covers per-workspace upstream
registration, the egress allow-list / SSRF guard for user-supplied URLs, a request-scoped BYO-key
validator with independence enforcement, and tenant isolation. Until it ships, the integration path
is the reference runner above: point --upstream at your server and
consume the printed envelopes, or adapt the runner for your own harness. The verification engine and
the MCP client leg are on main and stable.
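
The SSRF guard mentioned above, and the block-list described for the StreamableHTTP transport, can be illustrated with a minimal sketch. This is not the guard in packages/mcp/src/proxy/client.ts: checkUpstreamUrl and its reason strings are hypothetical, the sketch inspects only the literal hostname without resolving DNS, and its IPv6 coverage is deliberately partial.

```typescript
type Env = "dev" | "production";
type UrlCheck = { ok: true } | { ok: false; reason: string };

const ALLOWED_SCHEMES = ["http:", "https:", "ws:", "wss:"];

// Classify literal IPv4 hosts that fall in blocked ranges.
function blockedIPv4Reason(host: string): string | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.\d{1,3}\.\d{1,3}$/.exec(host);
  if (!m) return null;
  const a = Number(m[1]);
  const b = Number(m[2]);
  // 169.254.0.0/16 covers link-local and the 169.254.169.254 metadata address.
  if (a === 169 && b === 254) return "link_local_or_metadata";
  if (a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168)) {
    return "private_ip";
  }
  return null;
}

function checkUpstreamUrl(raw: string, env: Env): UrlCheck {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "invalid_url" };
  }
  if (!ALLOWED_SCHEMES.includes(url.protocol)) {
    return { ok: false, reason: "unsupported_scheme" };
  }
  const host = url.hostname.toLowerCase();
  // Loopback is allowed only in dev.
  const isLoopback =
    host === "localhost" || host === "[::1]" || /^127\.\d+\.\d+\.\d+$/.test(host);
  if (isLoopback) {
    return env === "dev" ? { ok: true } : { ok: false, reason: "loopback" };
  }
  const secure = url.protocol === "https:" || url.protocol === "wss:";
  if (env === "production" && !secure) {
    return { ok: false, reason: "insecure_scheme" };
  }
  if (host.endsWith(".internal")) {
    return { ok: false, reason: "internal_hostname" };
  }
  const v4 = blockedIPv4Reason(host);
  if (v4) return { ok: false, reason: v4 };
  // Partial IPv6 coverage: unique-local (fc00::/7) and link-local (fe80::/10).
  if (/^\[(fc|fd|fe[89ab])/.test(host)) {
    return { ok: false, reason: "private_ip" };
  }
  return { ok: true };
}
```

A production-grade guard would also need to check the addresses a hostname resolves to, since a public name can point at a private IP. That step is omitted from this sketch.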
Engineering reference
| Artefact | Path |
|---|---|
| MCP client leg (egress, SSRF guard, Mode-1 parse) | packages/mcp/src/proxy/client.ts |
| Mode-1 upstream fixture (graph_answer) | packages/mcp/src/proxy/fixtures/mode1-upstream.ts |
| Local Mode-1 HTTP server (StreamableHTTP) | packages/mcp/src/proxy/fixtures/mode1-http-server.ts |
| Verification façade (verifyEnvelope) | packages/connect-core/src/proxy/verify-envelope.ts |
| Envelope types (VerifiedEnvelope, EnvelopeClaim) | packages/connect-core/src/proxy/types.ts |
| Reference runner | scripts/reviews/verifying-proxy-reference.ts |
| Engineering deep-dive (same content, repo copy) | docs/guides/verifying-proxy-quickstart.md |

Next steps
- Run the fixture: pnpm exec tsx scripts/reviews/verifying-proxy-reference.ts, then point --upstream at your own server.
- Verified context: what supported means, the EBV layers, the fail-safe gates, and how to audit a claim yourself.
- MCP verified-context quickstart: the other direction, a Restormel Connect graph exposed as the connect.retrieve_verified MCP tool in your AI client.
- Context-regression CI: gate pull requests on graph quality with keys connect eval.
- API reference: the Connect v1 endpoints behind the verification engine.