Docs

Quickstart

Install JudgmentKit for your MCP client, then connect to the hosted Streamable HTTP endpoint.

curl -fsSL https://judgmentkit.ai/install | bash
curl -fsSL https://judgmentkit.ai/install | bash -s -- --client claude
curl -fsSL https://judgmentkit.ai/install | bash -s -- --client cursor

Codex is the default client when --client is omitted. When scripting, pass the client explicitly: --client codex, --client claude, or --client cursor.

First 10 Minutes

Use the replayable first-use fixture to see the AI-native design system as a contract loop, not a renderer. The fixture gives the agent one brief, one implementation contract input, one failing candidate, one repaired candidate, and the expected two-attempt transcript.

examples/ai-native-design-system/first-use.json
examples/ai-native-design-system/canonical-examples.json

Loop: create the implementation contract, review the failing candidate, read next_agent_action and the grouped repair_instructions, repair the candidate, then resubmit it and expect the review to accept.
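
The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: stub_review and stub_repair are hypothetical stand-ins for the real JudgmentKit review and the agent's repair step, and the "status" values, the "required_states" key, and the candidate shape are invented for the sketch. Only next_agent_action and repair_instructions are names taken from this page.

```python
# Hypothetical stand-ins: the real review is performed by JudgmentKit over MCP.
# The "status" values and the "required_states" key are assumptions.

def stub_review(candidate: dict) -> dict:
    """Fail candidates that lack an 'empty' state; accept otherwise."""
    missing = [s for s in ("empty",) if s not in candidate["states"]]
    if missing:
        return {
            "status": "repair",
            "next_agent_action": "repair_and_resubmit",
            "repair_instructions": {"required_states": missing},
        }
    return {"status": "accept", "next_agent_action": None, "repair_instructions": {}}


def stub_repair(candidate: dict, instructions: dict) -> dict:
    """Return a new candidate with the missing states added."""
    added = instructions.get("required_states", [])
    return {**candidate, "states": candidate["states"] + added}


def run_contract_loop(candidate, review=stub_review, repair=stub_repair, max_attempts=2):
    """Review, repair on failure, and resubmit; return one entry per attempt."""
    transcript = []
    for attempt in range(1, max_attempts + 1):
        result = review(candidate)
        transcript.append({"attempt": attempt, "status": result["status"]})
        if result["status"] == "accept":
            break
        if attempt < max_attempts:
            # Repair from the grouped instructions, then loop to resubmit.
            candidate = repair(candidate, result["repair_instructions"])
    return transcript


failing_candidate = {"states": ["loading", "ready"]}
transcript = run_contract_loop(failing_candidate)
print(transcript)
```

With these stubs the first attempt fails and the second is accepted, which mirrors the two-attempt transcript the fixture describes.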

Canonical cases: setup/onboarding, operational dashboard, and high-stakes review/refund workflow. Each case includes the activity model, implementation contract input, failing candidate, repaired candidate, and proof expectation.

Renderer boundary: the runtime renderer/component package can remain deferred, but implementation_contract.design_system_source, implementation_contract.local_component_authority, implementation_contract.visual_token_adapter, and implementation_contract.default_ai_native_design_system are active implementation contract authorities. A complete design_system_adapter can select external_design_system; missing authorities fail instead of falling back to JudgmentKit defaults.

Planning Mode Examples

Use these examples to review whether an agent is using JudgmentKit well. A good planning response should make the activity, decision, outcome, and disclosure boundary clearer before it proposes UI structure.

Ready brief

Plan a UI for a support lead reviewing refund requests during daily triage. They decide whether each case is approved, sent to policy review, or returned for missing evidence. The outcome is a clear handoff with the next action and reason.

Good response: proceed to concept planning because the activity, participant, decision, and outcome are clear. Keep the plan centered on evidence review, decision options, and handoff.

Accept: approval, policy review, return for evidence, and handoff reasons are easy to compare and complete.

Reject: charts, widgets, or visual polish appear before the refund review work is named.

Vague brief

Plan a dashboard for the system.

Good response: pause instead of inventing a dashboard. Ask only targeted questions about the activity, primary decision or next action, and outcome.

Accept: the agent asks what work the dashboard supports, what decision it should make easier, and what the user should leave knowing or having done.

Reject: a full dashboard plan with metrics, cards, charts, and navigation invented from no source context.

Implementation-heavy brief

Plan an admin UI from our JSON schema, database tables, tool call traces, prompt template, and API endpoints.

Good response: treat schemas, tables, traces, prompts, and endpoints as diagnostic details unless the task is explicitly setup, debugging, auditing, or integration work. Translate toward the user's activity before proposing a primary surface.

Accept: implementation terms move into diagnostics and the agent asks for the domain activity or decision behind the admin surface.

Reject: tables, schemas, prompt templates, tool calls, or API endpoints become the main product UI.

MCP

JudgmentKit supports MCP through the hosted Streamable HTTP endpoint at https://judgmentkit.ai/mcp. The installer registers that endpoint as judgmentkit in Codex, Claude Code, or Cursor. A browser GET to /mcp returns endpoint metadata; MCP clients should connect to the same URL with Streamable HTTP.

MCP tool responses include structuredContent as the stable machine-readable contract and content[0].text as a concise Markdown planning card for Codex-style planning chat. Use the card to explain status, next step, blocking questions, and compact diagnostics; use structured content for implementation decisions and follow-up tool calls.
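
A minimal sketch of how a client might separate the two parts of a tool result, assuming the result has already been parsed into a Python dict. The helper name split_tool_result and the sample "status" key are hypothetical; structuredContent and content[0].text are the fields described above.

```python
from typing import Any


def split_tool_result(result: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (planning_card_markdown, structured_content) from a tool result."""
    content = result.get("content") or []
    card = ""
    # The planning card is the first content item when it is a text item.
    if content and content[0].get("type") == "text":
        card = content[0].get("text", "")
    # Structured content is the machine-readable contract for follow-up calls.
    structured = result.get("structuredContent") or {}
    return card, structured


# Hypothetical sample result; the keys inside structuredContent are assumptions.
sample = {
    "content": [{"type": "text", "text": "**Status:** blocked"}],
    "structuredContent": {"status": "blocked"},
}
card, data = split_tool_result(sample)
```

Show card to the person in the planning chat, and base implementation decisions and follow-up tool calls on data.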

System Map

Use JudgmentKit before generation and across iterations. It is the contract and review layer around the LLM or agent, not the final UI renderer.

MCP boundary: agents call JudgmentKit tools through MCP; MCP is access and transport, not the LLM.

JudgmentKit kernel: deterministic review, candidate review, disclosure rules, targeted questions, and the handoff gate decide whether UI generation is ready.

LLM / provider seam: a model may propose activity or workflow candidates, but JudgmentKit reviews those candidates before trusting them.

Surface type: recommend_surface_types classifies activity purpose as marketing, workbench, operator review, form flow, dashboard monitor, content/report, setup/debug tool, or conversation before frontend implementation guidance.

UI generation: the LLM or agent generates the interface outside JudgmentKit from the reviewed handoff.

Implementation contract: create_ui_implementation_contract supplies implementation_contract.design_system_source, implementation_contract.local_component_authority, implementation_contract.visual_token_adapter, implementation_contract.default_ai_native_design_system, approved primitives, required states, static checks, browser QA expectations, implementation_contract.visual_asset_policy, and implementation_contract.accessibility_policy before final handoff. review_ui_implementation_candidate checks generated UI against that contract and marks failed design-system candidates as repair-only diagnostics, not accepted artifacts.

Frontend adapter: create_frontend_generation_context combines a ready handoff, selected surface type, project frontend context, and verification expectations. create_frontend_implementation_skill_context turns that ready context into portable implementation instructions, semantic token roles, system font stacks, Lucide icon catalog policy, design-system provenance expectations, and local component authority without exposing raw skill files. Design-system compliance is not a substitute for activity fit.

Slide decks: create_slide_deck plans or exports JudgmentKit presentation-theme decks from user-facing slide content. Hosted callers can use dry-run planning; PPTX export requires a local artifact runtime.

Iteration: draft review produces updated context that re-enters source/activity review rather than becoming only a longer prompt.

Blocked path: if activity, workflow, or handoff is not ready, resolve targeted questions or leakage details before generating UI.

Activity Review

Call create_activity_model_review before generating UI from a brief. Use the returned candidate only when the activity, participant, decision, outcome, and disclosure boundary are clear enough.
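
MCP tool calls are JSON-RPC 2.0 requests with the method tools/call. The sketch below only builds the request body for create_activity_model_review; it does not send it. The argument key "brief" is an assumption, since this page does not list the tool's input schema, and tools_call_request is a hypothetical helper. In practice an MCP client or SDK performs the initialize handshake and builds this message for you.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tools_call_request(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tools_call_request(
    "create_activity_model_review",
    # "brief" is an assumed argument name, not confirmed by this page.
    {"brief": "Plan a UI for a support lead reviewing refund requests during daily triage."},
)
body = json.dumps(request)
```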

Workflow Review

Call review_ui_workflow_candidate before accepting an agent-proposed workflow. It checks source grounding, action support, completion or handoff clarity, and leakage containment.

Cognitive Dimensions Review

Call review_cognitive_dimensions_candidate when a workflow or implementation candidate needs review for domain mapping, evidence near action, hidden dependencies, premature commitment, progressive evaluation, change cost, memory-heavy transitions, or disclosure leakage. Findings are diagnostic guidance for agents and reviewers; do not copy Cognitive Dimensions terminology into product UI.

Surface Type

Call recommend_surface_types after activity review and before workflow or frontend implementation guidance. Surface type is activity-purpose guidance, not a visual theme.

Handoff

Call create_ui_generation_handoff only on a ready workflow review. If the gate blocks, resolve the targeted questions or leakage details first.
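
The gate can be mirrored on the agent side so a handoff is never requested from a blocked review. This is a sketch under assumptions: call_tool is a hypothetical callable that wraps your MCP client, and the "status", "targeted_questions", and "workflow_review" keys are invented for illustration rather than taken from the tool schemas.

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]


def request_handoff(call_tool: ToolCaller, workflow_review: dict) -> dict:
    """Request a handoff only when the workflow review is ready."""
    if workflow_review.get("status") != "ready":
        # Blocked: surface what must be resolved instead of calling the tool.
        return {
            "handoff": None,
            "resolve_first": workflow_review.get("targeted_questions", []),
        }
    handoff = call_tool("create_ui_generation_handoff", {"workflow_review": workflow_review})
    return {"handoff": handoff, "resolve_first": []}


# Usage with a fake caller that records which tools were invoked.
calls = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    calls.append(name)
    return {"status": "ready"}


blocked = request_handoff(
    fake_call_tool,
    {"status": "blocked", "targeted_questions": ["What decision does the reviewer make?"]},
)
ready = request_handoff(fake_call_tool, {"status": "ready"})
```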

Implementation Contract

Call create_ui_implementation_contract before final handoff so generated UI has approved primitives, state coverage, implementation_contract.design_system_source, implementation_contract.local_component_authority, implementation_contract.visual_token_adapter, implementation_contract.default_ai_native_design_system, static checks, browser QA expectations, implementation_contract.visual_asset_policy, and implementation_contract.accessibility_policy. Call review_ui_implementation_candidate before accepting generated UI code or evidence. Visual-heavy pages need browser-rendered contrast/readability evidence for text over images, canvas, WebGL, video, gradients, or generated visuals.

Frontend Context

Call create_frontend_generation_context after the handoff gate when an agent needs frontend implementation guidance with selected surface type, project context, and verification expectations. Call create_frontend_implementation_skill_context when an MCP client needs compiled implementation skill guidance instead of repo-local skill access.

Slide Decks

Call create_slide_deck when an allowed brief, workflow review, handoff, or implementation evidence should become a JudgmentKit presentation deck (PowerPoint/PPTX). In dry-run mode the tool returns the selected templates and content keys; it writes PPTX artifacts only from a local @oai/artifact-tool runtime, and only under the guarded output directory.

Guidance Profiles

Call recommend_ui_workflow_profiles when a brief sounds like specialized review work. Pass profile_id: "operator-review-ui" only when the recommendation evidence supports it.
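
One way to keep the profile opt-in is to add profile_id to a tool call's arguments only after the recommendation supports it. The sketch assumes the caller has already reduced the recommend_ui_workflow_profiles result to a set of supported profile ids; that reduction, the helper with_profile, and the "brief" key are hypothetical, because the response shape is not documented on this page.

```python
PROFILE_ID = "operator-review-ui"


def with_profile(arguments: dict, supported_profile_ids: set) -> dict:
    """Return a copy of arguments, adding profile_id only when it is supported."""
    if PROFILE_ID in supported_profile_ids:
        return {**arguments, "profile_id": PROFILE_ID}
    return dict(arguments)


# "brief" is an assumed argument name used only for illustration.
base = {"brief": "Review refund requests during daily triage."}
supported = with_profile(base, {"operator-review-ui"})
unsupported = with_profile(base, set())
```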