Audit tool: checks, scoring & teaser – White Tree Digital Docs

The audit's internals are a deliberately small, pure pipeline: fetch every raw signal once into a SiteData object server-side, run a registry of independent checks as pure functions over it, fold the results into one 0–10 score, then squeeze that into a three-finding teaser that is the only shape allowed across the wire.

This page covers the data model and the decoupled pieces built on it: the Check contract, the CHECKS registry, the impact-weighted score, and the lossy teaser gate. For the request flow, rate limiting, PSI key, and HubSpot hand-off, see Audit tool: architecture & flow.

The whole thing lives in website/src/components/audit/. The original rebuild plan that shaped it is prompts/claude-code-prompt-audit-signals.md — its core insight: stop authoring at the site level. You never enumerate "all possible site outcomes." You author each check once, with copy for its own states; the page is just whichever checks fired. That kills the combinatorial problem and makes adding a signal cheap.

The Check model

A Check is a pure function over SiteData. SiteData is fetched once, server-side (in the Cloudflare Worker), and holds every raw signal — the PSI response, the parsed HTML, the response headers, robots/sitemap reachability, and the rendered-request host list. No check fetches anything; each just derives a status from data that is already in hand.

From check.ts:

export type Status = 'pass' | 'warn' | 'fail' | 'unknown';

export type CheckGroup = 'speed' | 'tracking' | 'security' | 'visibility' | 'stack';

export interface Check {
  id: string;
  group: CheckGroup;
  /** 1-10. Drives BOTH score weight and teaser ranking — one knob, no separate severity. */
  impact: number;
  /** Line items this expands into in the email report (drives the locked count + rollup). */
  subIssues: number;
  reportOnly?: boolean;
  /** Pure; derives status from already-fetched SiteData. Must not fetch or throw. */
  run(site: SiteData): CheckResult;
  /** 'pass' copy is optional — used only in the email report, never on the page. */
  copy: Partial<Record<Status, CheckCopy>>;
}

A CheckResult is just {status, value?}, where value is the measured number or string ("3.1 s", "2.4 MB", a count) used for copy interpolation via the {value} placeholder.

Four properties of this model carry their weight:

impact is one knob. A single 1–10 number drives both the score weight and the teaser ranking. There is no separate severity field — a high-impact check that fails both drags the score down and bids hardest for one of the three teaser slots.
subIssues is the locked count rendered behind the wall as a "(lock) N issues" badge and the input to the email report's rollup math. It is the number of line items this one signal expands into in the full report.
reportOnly checks (today, just platform) still run and feed the email report but can never occupy a teaser slot on the page. platform is context ("Built on WordPress") — useful for tailoring recommendations, never a finding.
copy is partial by status. pass copy is optional and is email-report-only — it is never shown on the page. The page is a pain surface by design; reassurance lives in the email.
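
As an illustration of the contract, here is a minimal sketch of a single check. Only the Check shape comes from check.ts; the thresholds, the copy strings, the impact and subIssues numbers, the one-field SiteData stand-in (totalBytes), and the two-field CheckCopy are all assumptions made for the example.

```typescript
// Local copies of the types from check.ts so the sketch stands alone.
type Status = 'pass' | 'warn' | 'fail' | 'unknown';
type CheckGroup = 'speed' | 'tracking' | 'security' | 'visibility' | 'stack';
interface CheckResult { status: Status; value?: string }
// ASSUMPTION: the real CheckCopy may carry more fields than these two.
interface CheckCopy { headline: string; line: string }
// ASSUMPTION: a one-field stand-in for the real SiteData.
interface SiteData { totalBytes: number | null }

interface Check {
  id: string;
  group: CheckGroup;
  impact: number;
  subIssues: number;
  reportOnly?: boolean;
  run(site: SiteData): CheckResult;
  copy: Partial<Record<Status, CheckCopy>>;
}

// Hypothetical check: thresholds, numbers, and copy are illustrative only.
const pageWeightSketch: Check = {
  id: 'page-weight',
  group: 'speed',
  impact: 6,
  subIssues: 3,
  run(site) {
    // Missing input means "couldn't tell"; it is never a throw.
    if (site.totalBytes === null) return {status: 'unknown'};
    const mb = site.totalBytes / 1_000_000;
    const value = `${mb.toFixed(1)} MB`;
    if (mb > 4) return {status: 'fail', value};
    if (mb > 2) return {status: 'warn', value};
    return {status: 'pass', value};
  },
  copy: {
    fail: {headline: 'Your page is very heavy', line: 'It ships {value} before a visitor can do anything.'},
    warn: {headline: 'Your page is heavier than it should be', line: 'It ships {value} on first load.'},
    // No pass copy: it is optional and would only be used by the email report.
  },
};
```

Note that run never touches the network and has an explicit branch for missing data, so the orchestrator's try/catch is never exercised for this check.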

A check must be pure — it must not fetch or throw

run(site) derives status from already-fetched SiteData. It must not make a network call, and it must not throw. The orchestrator (audit.ts) wraps each run in try/catch, so a throwing check does not blow up the audit, but it silently degrades to unknown, which is dropped from the score and never surfaced. A throw is a lost check, not a visible error. Keep run total: return {status: 'unknown'} for the "couldn't tell" case rather than letting a property access on missing data throw.

The orchestrator is tiny — runChecks in audit.ts is pure mapping with a safety net:

export function runChecks(site: SiteData): RanCheck[] {
  return CHECKS.map((check) => {
    try {
      return {check, result: check.run(site)};
    } catch {
      return {check, result: {status: 'unknown' as const}};
    }
  });
}
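
To see the safety net in action, the sketch below runs a total check beside a partial one against a SiteData whose HTML fetch failed. It is a self-contained approximation: runChecksOver takes the check list as a parameter (the real runChecks reads CHECKS), and the two checks plus the trimmed SiteData and MiniCheck types are hypothetical.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'unknown';
interface CheckResult { status: Status; value?: string }
// ASSUMPTION: stand-ins for the real SiteData and Check, trimmed to what the demo needs.
interface SiteData { html: string | null }
interface MiniCheck { id: string; run(site: SiteData): CheckResult }
interface RanCheck { check: MiniCheck; result: CheckResult }

// Same logic as runChecks, with the registry passed in so the demo stands alone.
function runChecksOver(checks: MiniCheck[], site: SiteData): RanCheck[] {
  return checks.map((check) => {
    try {
      return {check, result: check.run(site)};
    } catch {
      return {check, result: {status: 'unknown' as const}};
    }
  });
}

// Total: handles the missing-input case explicitly.
const totalCheck: MiniCheck = {
  id: 'has-title',
  run: (site) =>
    site.html === null
      ? {status: 'unknown'}
      : {status: site.html.includes('<title>') ? 'pass' : 'fail'},
};

// Partial: dereferences html without a null guard, so it throws when the GET failed.
const partialCheck: MiniCheck = {
  id: 'has-viewport',
  run: (site) => ({
    status: (site.html as string).includes('name="viewport"') ? 'pass' : 'fail',
  }),
};

const ranDemo = runChecksOver([totalCheck, partialCheck], {html: null});
const demoStatuses = ranDemo.map((r) => `${r.check.id}:${r.result.status}`);
// Both entries end in "unknown": one by design, one because the throw was swallowed.
```

Both checks land on unknown, but only the first does so deliberately. Nothing in the output distinguishes the swallowed exception from an honest "couldn't tell", which is why the contract asks for total run functions instead of relying on the catch.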

Status, banding, and "unknown" as graceful degradation

unknown is the load-bearing fourth status. Every source in SiteData degrades to null/false on failure (PSI timed out, the HTML GET was blocked, a probe broke), and a check whose inputs are missing returns unknown. That status is excluded entirely from the score and is never surfaced on the page — "We couldn't check X" reads as the tool being broken, so it simply isn't shown.

PSI-derived speed checks band a Lighthouse 0–1 score at Lighthouse's own cutoffs, via bandScore / psiBand in checks/helpers.ts:

export function bandScore(score: number | null): Status {
  if (score === null) return 'unknown';
  if (score >= 0.9) return 'pass';
  if (score >= 0.5) return 'warn';
  return 'fail';
}

psiBand adds three special cases around the banding: PSI not run or audit key absent → unknown; scoreDisplayMode === 'notApplicable' (nothing to fix) → pass; informative/manual audits with no pass-fail meaning → unknown. This also covers Lighthouse 12, where many old "opportunity" audits were renamed to *-insight keys carrying real 0–1 scores (hence render-blocking-insight, image-delivery-insight).
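
The rules above can be sketched as a small function. This is an approximation, not the helper from checks/helpers.ts: the signature (an audits map plus a key), the LhAudit shape, and the name psiBandSketch are assumptions, and the real psiBand may well take SiteData directly.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'unknown';

// ASSUMPTION: a trimmed view of one Lighthouse audit result.
interface LhAudit {
  score: number | null;
  scoreDisplayMode: string; // e.g. 'numeric', 'binary', 'informative', 'manual', 'notApplicable'
}
// ASSUMPTION: audits keyed by audit id, or null when PSI did not run.
type Audits = Record<string, LhAudit> | null;

function bandScore(score: number | null): Status {
  if (score === null) return 'unknown';
  if (score >= 0.9) return 'pass';
  if (score >= 0.5) return 'warn';
  return 'fail';
}

// Hypothetical signature, written to mirror the three special cases in the text.
function psiBandSketch(audits: Audits, key: string): Status {
  const audit = audits?.[key];
  if (!audit) return 'unknown'; // PSI not run, or audit key absent
  switch (audit.scoreDisplayMode) {
    case 'notApplicable':
      return 'pass'; // nothing to fix
    case 'informative':
    case 'manual':
      return 'unknown'; // no pass/fail meaning
    default:
      return bandScore(audit.score); // real 0 to 1 score, banded at Lighthouse's cutoffs
  }
}
```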

The CHECKS registry — single source of truth

registry.ts is the one place that declares which checks exist. It simply spreads the per-group arrays:

export const CHECKS: Check[] = [
  ...speedChecks,
  ...trackingChecks,
  ...securityChecks,
  ...visibilityChecks,
  ...stackChecks,
];

Order is effectively irrelevant: the score is an order-independent weighted sum, and selection sorts by status and then impact, so the array's order can matter only as a tie-break between checks of equal status and impact. To add a signal, you add a Check object to its group file (checks/<group>.ts) and include it in that file's exported array. Nothing else changes: the registry picks it up by spread, runChecks runs it, and the score and teaser include it automatically. There is no central list to keep in sync, no renderer branch to add.
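
A compressed sketch of that workflow, with three files collapsed into one snippet. The MiniCheck type is a trimmed stand-in, the impact numbers are made up, and the hsts check is a hypothetical new signal, not one that exists in the registry.

```typescript
// ASSUMPTION: trimmed Check with just enough fields to show registration.
interface MiniCheck { id: string; group: string; impact: number }

// checks/security.ts (sketch): each check is a const, then listed in the group array.
const httpsCheck: MiniCheck = {id: 'https', group: 'security', impact: 10};
const securityHeadersCheck: MiniCheck = {id: 'security-headers', group: 'security', impact: 5};
// Hypothetical new signal: defining it and listing it below is the whole change.
const hstsCheck: MiniCheck = {id: 'hsts', group: 'security', impact: 4};
const securityChecksSketch: MiniCheck[] = [httpsCheck, securityHeadersCheck, hstsCheck];

// checks/stack.ts (sketch)
const stackChecksSketch: MiniCheck[] = [{id: 'cdn', group: 'stack', impact: 3}];

// registry.ts (sketch): untouched when a check is added to a group file.
const CHECKS_SKETCH: MiniCheck[] = [...securityChecksSketch, ...stackChecksSketch];
```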

The site currently runs roughly two dozen checks across the five groups. Because the registry holds far more than three checks, the teaser has plenty of candidates to fill its three slots.

The five check groups

Each group file exports an array of Checks. The grouping is both an organizing principle and the group label that shows on each teaser finding.

Group	File	What it reads	Example checks
speed	`checks/speed.ts`	The existing PSI (Lighthouse 12) response — no extra fetch	`lcp`, `inp`, `cls`, `ttfb`, `page-weight`, `render-blocking`, `image-optimization`
tracking	`checks/tracking.ts`	Static HTML ∪ rendered PSI request hosts	`analytics`, `ad-pixels`, `tag-manager`, `consent`
security	`checks/security.ts`	HTTP→HTTPS probe, response headers, mixed-content scan	`https`, `mixed-content`, `security-headers`
visibility	`checks/visibility.ts`	Parsed HTML signals + robots/sitemap reachability	`title-tag`, `meta-description`, `open-graph`, `structured-data`, `canonical`, `sitemap-robots`, `viewport`, `heading-structure`, `image-alt`
stack	`checks/stack.ts`	HTML/header fingerprints	`platform` (reportOnly), `cdn`, `outdated-tech`

A few group-specific behaviors worth knowing:

Speed reuses the PSI response the audit already fetched — it never makes a second call. inp prefers real field INP from CrUX (loadingExperience.metrics.INTERACTION_TO_NEXT_PAINT) and falls back to lab Total Blocking Time so it is actually assessed on low-traffic sites instead of being perpetually unknown.
Tracking unions two detection sources — the static HTML haystack and the hostnames of every request the rendered page made (PSI's network-requests audit). The rendered list is what makes detection verifiable: it catches tags injected by GTM or a CMS (HubSpot's cookie banner via js.hs-banner.com, for instance) that never appear in source.
Security: https treats a successful PSI run as proof HTTPS loads (PSI audits the https:// URL). It returns unknown only when neither HTML nor PSI could load, and it refuses to over-fail a working HTTPS site when just the redirect probe breaks.
Stack: outdated-tech is deliberately conservative. It only flags a positively-detected old major version (e.g. jQuery ≤ 2.x), and its copy must say "running an older version," never "vulnerable / exposed."
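
The field-first, lab-fallback behavior described for inp might look like the sketch below. The PsiLike type is a trimmed view of the PSI response, and the way the field data is banded (mapping the CrUX category to a status) is an assumption; the real check may band on the percentile instead. The function name inpSketch is hypothetical.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'unknown';
interface CheckResult { status: Status; value?: string }

// ASSUMPTION: only the parts of the PSI response this sketch reads.
interface PsiLike {
  loadingExperience?: {
    metrics?: {
      INTERACTION_TO_NEXT_PAINT?: {percentile: number; category: string};
    };
  };
  lighthouseResult?: {
    audits?: Record<string, {score: number | null; displayValue?: string}>;
  };
}

function bandScore(score: number | null): Status {
  if (score === null) return 'unknown';
  if (score >= 0.9) return 'pass';
  if (score >= 0.5) return 'warn';
  return 'fail';
}

// ASSUMPTION: field data is banded by its CrUX category.
const FIELD_CATEGORY: Partial<Record<string, Status>> = {
  FAST: 'pass',
  AVERAGE: 'warn',
  SLOW: 'fail',
};

function inpSketch(psi: PsiLike | null): CheckResult {
  if (psi === null) return {status: 'unknown'}; // PSI did not run
  // Prefer real-user INP when CrUX has data for this page.
  const field = psi.loadingExperience?.metrics?.INTERACTION_TO_NEXT_PAINT;
  if (field) {
    return {status: FIELD_CATEGORY[field.category] ?? 'unknown', value: `${field.percentile} ms`};
  }
  // Otherwise fall back to lab Total Blocking Time so low-traffic sites are still assessed.
  const tbt = psi.lighthouseResult?.audits?.['total-blocking-time'];
  if (!tbt) return {status: 'unknown'};
  return {status: bandScore(tbt.score), value: tbt.displayValue};
}
```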

GTM hides what a static fetch can't see — never assert absence

A static-only fetch (PSI didn't run) cannot see what GTM injects at runtime. So when GTM is present in the static HTML but there's no rendered request list, the ad-pixels and consent checks return unknown, not a false warn — they must not claim a pixel or consent banner is absent when GTM could be loading one. When PSI did render the page, its request host list is authoritative and absence is real (then warn is honest). The unit tests pin both directions of this.
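
A minimal sketch of that rule for the consent check, combining the two detection sources. The marker lists, the TrackingInputs type, and the function name are hypothetical, and the exact-match host test is a simplification of whatever hostsMatch does. The sketch also assumes that a static-only fetch with no GTM present may still warn, which the text implies but does not state.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'unknown';

// ASSUMPTION: trimmed stand-in for the two detection sources in SiteData.
interface TrackingInputs {
  html: string | null;            // static HTML haystack, null if the GET failed
  renderedHosts: string[] | null; // hosts from PSI's rendered request list, null if PSI did not run
}

// Hypothetical marker lists; the real checks recognise more vendors.
const CONSENT_HOSTS = ['js.hs-banner.com'];
const CONSENT_HTML_MARKERS = ['cookie-consent'];
const GTM_MARKER = 'googletagmanager.com/gtm.js';

function consentStatusSketch(site: TrackingInputs): Status {
  if (site.html === null && site.renderedHosts === null) return 'unknown';

  // Union of the two sources: a hit in either one counts.
  const inHtml = CONSENT_HTML_MARKERS.some((m) => site.html?.includes(m) ?? false);
  const inHosts = CONSENT_HOSTS.some((h) => site.renderedHosts?.includes(h) ?? false);
  if (inHtml || inHosts) return 'pass';

  // Rendered request list present: it is authoritative, so absence is real.
  if (site.renderedHosts !== null) return 'warn';

  // Static-only fetch: GTM could inject a banner this fetch cannot see.
  const hasGtm = site.html?.includes(GTM_MARKER) ?? false;
  return hasGtm ? 'unknown' : 'warn';
}
```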

The score: impact-weighted 0–10

computeScore in score.ts folds the ran checks into a single number. Each non-unknown check contributes impact × points[status], where points are pass = 1, warn = 0.5, fail = 0. The result is the impact-weighted fraction scaled to 10, rounded to one decimal:

const points = { pass: 1, warn: 0.5, fail: 0 };  // POINTS in check.ts
// score = 10 × Σ(impact·points[status]) / Σ(impact)  over ran checks
return Math.round(((10 * weighted) / total) * 10) / 10; // 1 decimal

Two design choices:

unknown is excluded from numerator and denominator. A check that couldn't run doesn't drag the score down and doesn't dilute it either — it simply isn't part of the math.
The score carries the severity signal on its own. A site full of high-impact failures lands near 2 without the page having to list twenty cards. This is why the page can show a fixed three findings and still communicate "disaster" vs "minor" purely through the number.

If no checks ran (everything was unknown, e.g. the site was unreachable for PSI and the HTML GET), the score is 0.
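
Putting the formula and both design choices together, a self-contained sketch of the fold with a worked example follows. The Scored type and the function name are stand-ins; the real computeScore works on RanCheck[] and reads POINTS from check.ts.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'unknown';
// ASSUMPTION: only the two fields the score needs, flattened from RanCheck.
interface Scored { impact: number; status: Status }

const POINTS = {pass: 1, warn: 0.5, fail: 0} as const;

function computeScoreSketch(ran: Scored[]): number {
  let weighted = 0;
  let total = 0;
  for (const {impact, status} of ran) {
    if (status === 'unknown') continue; // left out of numerator and denominator
    weighted += impact * POINTS[status];
    total += impact;
  }
  if (total === 0) return 0; // nothing ran
  return Math.round(((10 * weighted) / total) * 10) / 10; // 1 decimal
}

const exampleScore = computeScoreSketch([
  {impact: 9, status: 'fail'},     // 9 × 0   = 0
  {impact: 6, status: 'warn'},     // 6 × 0.5 = 3
  {impact: 4, status: 'pass'},     // 4 × 1   = 4
  {impact: 10, status: 'unknown'}, // ignored entirely
]);
// weighted = 7, total = 19, 10 × 7 / 19 ≈ 3.68, so exampleScore is 3.7
```

Had the unknown check counted as a fail, the denominator would be 29 and the score would drop to 2.4, which is the dilution the exclusion rule avoids.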

Teaser selection: status-then-impact, fixed top 3, rollup

select.ts's selectTeaser decides what reaches the page. The page always shows exactly three findings; scale is communicated by the score plus a rollup count, never by adding cards. A clean site shows a high score with a small rollup; a disaster shows a low score, the three worst findings, and a big "+N more." The layout is the same in both cases, and neither result is undersold.

The algorithm:

Drop reportOnly checks (they can't surface).
Take the fails and warns, and sort them fail-before-warn, then highest impact first:

const statusRank = (s) => (s === 'fail' ? 0 : s === 'warn' ? 1 : 2);
issues.sort(
  (a, b) =>
    statusRank(a.result.status) - statusRank(b.result.status) ||
    b.check.impact - a.check.impact,
);

totalIssues = Σ subIssues over all fails+warns. shown = Σ subIssues over the issue cards in the teaser (backfilled pass cards don't count). rollup = totalIssues − shown, the "+N more in the full report" count, hidden when 0.

Selection.fallback is 'none' | 'next-level' | 'ahead':

3+ real issues → take the top 3, fallback: 'none'.
One or two issues (good-site fallback) → never manufacture problems. Backfill the remaining slots from the weakest (lowest-impact) passes, reframed as "next-level," fallback: 'next-level'. The pass cards don't count toward shown, so the rollup math stays honest.
Zero issues at all → still show three cards, but switch the framing to fallback: 'ahead' ("ahead of most sites").
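
The selection rules can be sketched end to end as follows. This is an approximation of selectTeaser, not its source: Ran is a flattened stand-in for RanCheck, the Selection field names other than fallback are guesses, and the sketch assumes that the 'ahead' case fills its three cards from the weakest passes in the same way as 'next-level'.

```typescript
type Status = 'pass' | 'warn' | 'fail' | 'unknown';
type Fallback = 'none' | 'next-level' | 'ahead';
// ASSUMPTION: flattened stand-in for RanCheck with only the fields selection reads.
interface Ran { id: string; impact: number; subIssues: number; status: Status; reportOnly?: boolean }
// ASSUMPTION: field names other than `fallback` are illustrative.
interface Selection { cards: Ran[]; fallback: Fallback; totalIssues: number; rollup: number }

const statusRank = (s: Status): number => (s === 'fail' ? 0 : s === 'warn' ? 1 : 2);
const sumSub = (xs: Ran[]): number => xs.reduce((n, x) => n + x.subIssues, 0);

function selectTeaserSketch(ran: Ran[]): Selection {
  // reportOnly checks can never surface on the page.
  const eligible = ran.filter((r) => !r.reportOnly);

  // Fails before warns, then highest impact first.
  const issues = eligible
    .filter((r) => r.status === 'fail' || r.status === 'warn')
    .sort((a, b) => statusRank(a.status) - statusRank(b.status) || b.impact - a.impact);
  const top = issues.slice(0, 3);

  // Backfill any empty slots from the weakest (lowest-impact) passes.
  const passes = eligible
    .filter((r) => r.status === 'pass')
    .sort((a, b) => a.impact - b.impact);
  const cards = [...top, ...passes.slice(0, 3 - top.length)];

  const fallback: Fallback =
    issues.length >= 3 ? 'none' : issues.length === 0 ? 'ahead' : 'next-level';

  // Only issue cards count toward "shown", so pass cards never shrink the rollup.
  const totalIssues = sumSub(issues);
  return {cards, fallback, totalIssues, rollup: totalIssues - sumSub(top)};
}

const demoSelection = selectTeaserSketch([
  {id: 'lcp', impact: 9, subIssues: 4, status: 'warn'},
  {id: 'https', impact: 10, subIssues: 2, status: 'pass'},
  {id: 'consent', impact: 7, subIssues: 3, status: 'fail'},
  {id: 'title-tag', impact: 5, subIssues: 1, status: 'pass'},
  {id: 'cdn', impact: 3, subIssues: 1, status: 'pass'},
  {id: 'platform', impact: 1, subIssues: 0, status: 'pass', reportOnly: true},
  {id: 'inp', impact: 8, subIssues: 2, status: 'unknown'},
]);
// cards: consent (fail), lcp (warn), cdn (weakest pass); fallback: 'next-level';
// totalIssues: 7; rollup: 0
```

In the demo, consent outranks lcp despite its lower impact because fails sort ahead of warns, and platform is skipped as a backfill candidate even though it is the lowest-impact pass.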

Why backfill from passes instead of showing fewer cards

The layout is fixed at three so a strong site and a broken site read identically in structure — the only difference is the score and rollup. Showing one card on a good site would look like a different, lesser result. Backfilling from the weakest passes keeps three cards while honestly reframing them as upsell opportunities, never as invented problems.

The teaser gate — lossy by design (a security property)

buildTeaser in teaser.ts is the wire boundary. It runs computeScore + selectTeaser, then maps each selected RanCheck into a TeaserFinding. The gate lives in the type. A TeaserFinding deliberately carries no id, status, value, fix, or pass copy — only the four safe keys:

export interface TeaserFinding {
  group: string;
  headline: string;
  line: string;
  subIssues: number; // locked count rendered as "<lock> N issues"
}

The full RanCheck[] (with statuses, measured values, check IDs, and the email-report detail) stays inside the Worker and never crosses the wire. A technically literate visitor reading the raw network response sees the symptom and a locked count, never the gated detail. That detail is the product: it's delivered in the emailed report after the form submit. (The HubSpot submission likewise carries only teaser-level data; the rich Phase-2 report detail never reaches the client.)

The mapping also interpolates copy: {value} is replaced with the measured value, or stripped cleanly when there is no value. For a pass card surfaced via the good-site fallback, buildTeaser falls back to generic "quick win" copy if the check has no pass copy of its own — so a teaser pass slot never renders an empty line.
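
One way the {value} handling could work is sketched below. The helper name is hypothetical, the real logic lives inside buildTeaser, and the interpretation of "stripped cleanly" (remove the placeholder, then tidy the leftover spacing) is an assumption.

```typescript
// Hypothetical helper; not the code from teaser.ts.
function interpolateSketch(template: string, value?: string): string {
  // split/join avoids the special meaning of "$" in String.replace replacements.
  if (value !== undefined) return template.split('{value}').join(value);
  // ASSUMPTION: "stripped cleanly" means dropping the placeholder and fixing spacing.
  return template
    .split('{value}')
    .join('')
    .replace(/\s{2,}/g, ' ')          // collapse the gap the placeholder left behind
    .replace(/\s+([.,;:!?])/g, '$1')  // no space before trailing punctuation
    .trim();
}

const withValue = interpolateSketch('Your largest element takes {value} to appear.', '3.1 s');
const withoutValue = interpolateSketch('Your page ships {value} on first load.');
```

With a value, the first call yields "Your largest element takes 3.1 s to appear."; without one, the second yields "Your page ships on first load." with no doubled space and no leftover placeholder.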

The assembled AuditTeaser is what the client receives:

export interface AuditTeaser {
  score: number;        // 0–10, 1 decimal
  host: string;
  fallback: Fallback;
  findings: TeaserFinding[]; // exactly 3 in production
  rollup: number;       // 0 → UI hides the "+N more" line
  totalIssues: number;
}

This gate is a security boundary — don't widen TeaserFinding

The lossiness is the point, and it is enforced by a unit test treated as a security regression. checks.test.ts asserts that every finding has exactly the keys ['group', 'headline', 'line', 'subIssues'], that the serialized teaser matches none of "status" | "value" | "id" | "fix", that no pass-copy headline leaks into a fail-dominant teaser, and that no literal {value} placeholder survives. Adding a field to TeaserFinding — even a seemingly harmless id or status "for the UI" — leaks gated detail to the client and breaks these tests. Author new UI affordances from the four safe keys, not by widening the wire shape.
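
For a sense of what such a regression check involves, here is a framework-free sketch that reports violations of the wire shape. It is not the code in checks.test.ts: the helper name, the SAFE_KEYS constant, and the regular expression are assumptions, and the pass-copy leak assertion is left out because it needs the check copy to compare against.

```typescript
// The four safe keys, in sorted order so they can be compared against sorted Object.keys.
const SAFE_KEYS = ['group', 'headline', 'line', 'subIssues'];

// Hypothetical helper: returns a list of violations, empty when the findings are lossy as intended.
function gateViolations(findings: object[]): string[] {
  const problems: string[] = [];

  // Every finding must have exactly the four safe keys, no more and no fewer.
  findings.forEach((finding, i) => {
    const keys = Object.keys(finding).sort();
    if (JSON.stringify(keys) !== JSON.stringify(SAFE_KEYS)) {
      problems.push(`finding ${i} has keys [${keys.join(', ')}]`);
    }
  });

  // The serialized payload must not carry any gated key or a raw placeholder.
  const wire = JSON.stringify(findings);
  if (/"(status|value|id|fix)"\s*:/.test(wire)) problems.push('gated key on the wire');
  if (wire.includes('{value}')) problems.push('unreplaced {value} placeholder');

  return problems;
}

const okFinding = {group: 'speed', headline: 'Slow first paint', line: 'It takes 3.1 s.', subIssues: 4};
const leakyFinding = {...okFinding, id: 'lcp'};
// gateViolations([okFinding]) is empty; gateViolations([leakyFinding]) reports the
// extra key and the gated "id" on the wire.
```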

What the unit tests guard

The pure, deterministic design exists so the whole pipeline is unit-testable against hand-authored SiteData fixtures — no network, in a plain Node env. checks.test.ts pins the invariants this page describes:

The gate is lossy — TeaserFinding has only the four safe keys; no status/value/id/fix and no pass-copy leak into a fail-dominant teaser; {value} is always interpolated away.
GTM-injection honesty — a static-only fetch must not assert pixels/consent are absent when GTM is present (returns unknown).
Rendered request hosts are authoritative — the HubSpot-banner-via-js.hs-banner.com case detects consent that never appears in static HTML.
A throwing check degrades to unknown — the runChecks contract.
Selection rules — fixed three cards, reportOnly never enters the teaser, and the good-site fallbacks (next-level / ahead).

Where this lives

Concern	File
Check / CheckResult / Status / impact / POINTS	`website/src/components/audit/check.ts`
The `CHECKS` registry (single source of truth)	`website/src/components/audit/registry.ts`
Pure orchestration (`runChecks`, throw→unknown)	`website/src/components/audit/audit.ts`
Impact-weighted 0–10 score	`website/src/components/audit/score.ts`
Teaser selection (status-then-impact, top 3, rollup, fallbacks)	`website/src/components/audit/select.ts`
Wire gate (`TeaserFinding`, `AuditTeaser`, `buildTeaser`)	`website/src/components/audit/teaser.ts`
`SiteData` shape + the single parallel fetch	`website/src/components/audit/siteData.ts`
Banding / PSI helpers (`bandScore`, `psiBand`, `htmlHaystack`, `hostsMatch`)	`website/src/components/audit/checks/helpers.ts`
Check groups	`website/src/components/audit/checks/{security,speed,stack,tracking,visibility}.ts`
Invariant tests	`website/src/components/audit/checks.test.ts`
Original rebuild plan	`prompts/claude-code-prompt-audit-signals.md`

For how SiteData is fetched, the /api/audit endpoint, rate limiting, the PSI key secret, the disposable-email guard, and the HubSpot submit, see Audit tool: architecture & flow. For the HubSpot form properties the teaser maps onto, see HubSpot & lead capture.