
mumo MCP Server

Run structured multi-model deliberations directly from any MCP-compatible client — Claude Code, Cursor, Claude Desktop, Windsurf, ChatGPT.

The MCP server wraps the REST API. Same auth, same artifacts, same field names. Use this if you want your agent to invoke deliberations as tool calls; use the REST API if you're integrating from your own backend.

Setup

Prerequisites

  1. A mumo account at mumo.chat.
  2. A platform API key — create one at mumo.chat/settings/api-keys. Keys begin with mmo_live_.
  3. An MCP-compatible client.

Claude Code

Install the mumo plugin from Anthropic's official marketplace:

/plugin marketplace add anthropics/claude-plugins-official
/plugin install mumo

The plugin bundles the MCP server config and an auto-triggering skill. Before installing, export your key:

export MUMO_API_KEY=mmo_live_YOUR_KEY_HERE

Restart Claude Code after install to pick up the new tools.

<details> <summary>Manual install (skip the skill)</summary>
claude mcp add --transport http mumo https://mumo.chat/api/mcp \
  --header "Authorization: Bearer mmo_live_…"

Verify with claude mcp list. Restart any running Claude Code session to pick up the change.

</details>

Claude Cowork

Open Claude Desktop → Cowork tab → Customize → Browse plugins → search for mumo → Install. Or browse the full catalog at claude.com/plugins.

Export your key the same way as Claude Code:

export MUMO_API_KEY=mmo_live_YOUR_KEY_HERE

Restart Cowork after install.

Claude Code and Cowork have separate plugin panels backed by different marketplaces (claude-plugins-official vs knowledge-work-plugins). Installing in one doesn't auto-install in the other — install in each separately.

Cursor

Install the mumo Cursor plugin — bundles the MCP server, an auto-triggering skill, and a Cursor rule. Once listed on the Cursor Marketplace (submission pending), search for mumo and click Install. Until then, sideload from GitHub:

git clone https://github.com/mumo-chat/mumo-cursor ~/.cursor/plugins/local/mumo
export MUMO_API_KEY=mmo_live_YOUR_KEY_HERE   # in your shell, before launching Cursor

Restart Cursor after install.

Invocation. Cursor's rule system treats plugin rules and skills as soft priors — auto-trigger on contested decisions is best-effort. For reliable routing, name mumo explicitly in your prompt: "ask mumo about…", "run this by a mumo panel", "get me a second opinion from mumo." Ambiguous phrasings like "ask a panel" or "what do different models think" may route to a generic response instead of this plugin.

Or manual mcp.json install (no skill/rule):

{
  "mcpServers": {
    "mumo": {
      "url": "https://mumo.chat/api/mcp",
      "headers": {
        "Authorization": "Bearer ${env:MUMO_API_KEY}"
      }
    }
  }
}

VS Code (GitHub Copilot)

Install the mumo extension from the Visual Studio Marketplace: Extensions panel → search mumo → Install. Requires VS Code 1.101 or later.

On your first MCP tool call in Copilot Chat (Agent mode), the extension prompts for your mumo API key — create one at mumo.chat/settings/api-keys. Stored in VS Code's SecretStorage (OS-native keychain). No MUMO_API_KEY env-var export required.

Invocation. v0.1.0 registers MCP tools only; the auto-triggering skill shipped in mumo-mcp / mumo-cursor isn't bundled here. Invoke mumo explicitly in Agent chat: "ask mumo about…", "run this by a mumo panel", "get me a second opinion from mumo."

<details> <summary>Manual mcp.json install (no extension)</summary>

Run the MCP: Open User Configuration command and paste:

{
  "servers": {
    "mumo": {
      "url": "https://mumo.chat/api/mcp",
      "requestInit": {
        "headers": {
          "Authorization": "Bearer mmo_live_…"
        }
      }
    }
  }
}

VS Code uses servers (not mcpServers) and nests auth headers under requestInit.headers — a different schema from Claude Desktop / the generic mcp.json in the Others tab. This route gives tool access without the extension's native key storage; you manage the key in the config file directly.

</details>

Codex

Edit ~/.codex/config.toml:

[mcp_servers.mumo]
url = "https://mumo.chat/api/mcp"
headers = { Authorization = "Bearer mmo_live_…" }

Other clients (VS Code, Windsurf, Claude Desktop, Cline, Zed, …)

Any MCP-compatible client that supports HTTP transport with a custom Authorization header will work. Point it at https://mumo.chat/api/mcp with Authorization: Bearer mmo_live_….

Most clients accept a generic mcp.json:

{
  "mcpServers": {
    "mumo": {
      "url": "https://mumo.chat/api/mcp",
      "headers": {
        "Authorization": "Bearer mmo_live_…"
      }
    }
  }
}

For Claude Desktop (general chat, not Cowork), this file lives at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS. Each client documents where its own mcp.json belongs — check your client's docs.


Your first call

The simplest way to verify everything is wired up is a single-shot deliberation. From your agent:

Create a deliberation: "Should we use Postgres or MongoDB for an event store?", rounds: 1.

The agent invokes create_deliberation. The call returns in under a second with a session id, round index, and progress URL — but the models are still running. Ask the agent to call get_session until the round reports complete (typically 15–120s depending on model choice). Then:

Session id abc-123, status ready.
3 models responded. claim_map.claims[] has 4 entries with cross-model positions.
distill.narrative summarizes where they agreed and disagreed.

The loop: tool call (fast ack) → poll → claim map + distill → agent reads → next decision.
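In code, the poll step of that loop is a small retry with a deadline. A minimal sketch in Python, where get_session stands in for however your integration invokes the tool of the same name (this sketch only assumes it returns a dict with the status field documented on this page):

```python
import time

def poll_until_ready(get_session, session_id, interval=2.0, timeout=180.0):
    """Poll get_session until the session reports status "ready"."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        session = get_session(session_id)
        if session["status"] == "ready":
            return session          # round complete: artifacts are readable
        if session["status"] == "failed":
            raise RuntimeError(f"deliberation {session_id} failed")
        time.sleep(interval)        # still running; back off and retry
    raise TimeoutError(f"session {session_id} not ready after {timeout}s")
```

A fixed interval is enough here since rounds typically complete in 15–120s; swap in exponential backoff if you run many sessions concurrently.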


Tools

Five tools — call signatures match the REST API field-for-field.

  • create_deliberation — Start a new deliberation.
  • append_round — Add a follow-up round with steering snippets.
  • get_session — Fetch full session state.
  • list_sessions — List your sessions, optionally filtered.
  • list_models — See models available to your tier.

create_deliberation

Two modes, distinguished by whether moderator_model is set:

  • No moderator_model → remote. You drive subsequent rounds via append_round with steering snippets. The session is open-ended — append as many rounds as you want.
  • moderator_model set → autonomous. The AI moderator runs the full multi-round arc unattended. The rounds cap controls how long it runs.

Both modes return an ack immediately and run the models in the background. Call get_session to read the round's artifacts once it reports complete. If you only want one round (no follow-up), just don't call append_round — the first round already contains everything.

Inputs:

  • prompt (required) — the question or topic.

  • reference — optional doc, spec, or design. Injected as shared context for all models.

  • models — array of 2–3 model IDs. Defaults to platform selection. Call list_models to enumerate.

  • rounds — 1–14. Default 3. Only meaningful in autonomous mode — caps the moderator's arc. In remote mode, append_round is unbounded by this value.

  • moderator_model — model ID to moderate autonomously. Omit for remote mode.

  • moderator_name — display name for the steering identity (e.g., your agent's name). Surfaces in the published transcript.

  • application — display name of the client (e.g., "Claude Code"). Surfaces in the session info panel.

  • distill — opt into distill artifacts. MCP default is off — programmatic consumers get high-signal artifacts (responses + claim map) without the extra LLM-call tax. Accepts a shorthand string or a structured object:

    • "off" — both artifacts disabled (MCP default)
    • "brief" — only the structured JSON brief (narrative + agreements + disagreements + continuation.recommendation)
    • "summary" — only the streaming narrative prose
    • "both" — both artifacts enabled
    • { brief: boolean, summary: boolean } — fine-grained control

    If your agent is surfacing output to a human user, pass "summary", "brief", or "both" on create_deliberation depending on what your human reads better:

    • "summary" — flowing narrative prose.
    • "brief" — structured: narrative, agreements, disagreements, plus the continuation.recommendation signal (stop / continue / explore). Some readers scan structured points faster than prose.
    • "both" — both artifacts.

    The defaults are tuned for programmatic consumption (responses + claim map are the highest-signal artifacts for that use case, and the human-readable fields are expensive to generate per round).

Returns: an ack — session_id, round_index, and a progress_url. Models run in the background; read the session (rounds, claim maps, distills) via get_session once the round is complete. The session's mode will be "autonomous" when moderator_model is set; otherwise "single_shot" for rounds: 1 requests and "remote" for everything else. All three share the same engine.
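Putting the inputs together, an autonomous-mode call might look like this. The model IDs are illustrative placeholders (enumerate real ones with list_models); the field names are the ones documented above:

```json
{
  "prompt": "Should we use Postgres or MongoDB for an event store?",
  "reference": "Event volume ~50k/day. Team knows SQL well.",
  "models": ["model-a", "model-b"],
  "rounds": 3,
  "moderator_model": "model-c",
  "moderator_name": "Build Agent",
  "application": "Claude Code",
  "distill": "brief"
}
```

Omit moderator_model and rounds to get the same call in remote mode, driven round-by-round via append_round.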

append_round

Add a follow-up round to a remote-mode session. Use after reading get_session to steer the next round.

Inputs:

  • session_id (required) — session ID from a prior create_deliberation.
  • prompt (required) — the steering prompt for this round.
  • snippets — array of typed cross-model forwards. Each:
    • type — KEEP | EXPLORE | CHALLENGE | CORE | SHIFT
    • quote — verbatim quote from a prior round's response
    • quoted_model — model ID that said it
    • comment — optional commentary explaining why you're forwarding it
  • moderator_name — supply only when the steering identity changes mid-session.

Snippets are the highest-signal way to direct attention. Models see them as curated forwards from the moderator, with the snippet type shaping how they respond.

Returns: same ack shape as create_deliberation. Poll get_session for the new round's artifacts.
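Concretely, an append_round call with one steering snippet might look like this (the quote, comment, and model ID are illustrative; field names and snippet types are the ones documented above):

```json
{
  "session_id": "abc-123",
  "prompt": "Resolve the pricing model. Pick one.",
  "snippets": [
    {
      "type": "CHALLENGE",
      "quote": "per-seat assumes teams of >10",
      "quoted_model": "model-a",
      "comment": "most pilots start at 3-5"
    }
  ]
}
```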

Errors:

  • The session must be in status: "ready". If it's streaming or processing, wait. If failed, you can't continue.
  • Duplicate round appends would corrupt deliberation history; to guard against this, the MCP server automatically passes an idempotency key derived from the call args.

get_session

Fetch the full state of a session — all rounds, responses, snippets, claim maps, distills, and editorial summary.

The most useful field for downstream decisions: rounds[].claim_map.claims[]. Each claim has:

  • quote — the verbatim claim
  • originator — model that said it
  • positions[] — the cross-model reactions, each with model, type (KEEP/CHALLENGE/etc), and comment
  • reaction_count — how many models reacted

This is the highest-signal view of where the panel agrees and where they're stuck.
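Assembled from the fields above, a single claims[] entry has roughly this shape (values are illustrative):

```json
{
  "quote": "usage-based pricing aligns incentives",
  "originator": "model-a",
  "positions": [
    { "model": "model-b", "type": "KEEP", "comment": "agree, and it simplifies billing" },
    { "model": "model-c", "type": "CHALLENGE", "comment": "harder to forecast revenue" }
  ],
  "reaction_count": 2
}
```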

For autonomous sessions, poll until status: "ready".

list_sessions

List the caller's sessions, optionally filtered.

Inputs:

  • status — ready | streaming
  • mode — autonomous | remote | single_shot (label-only variant of remote)
  • limit — 1–200 (default 7)
  • offset — pagination

Returns a lightweight list (no response bodies). Useful for agents managing concurrent sessions — status: "ready" finds sessions awaiting your next round.
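For example, the filter arguments to find remote-mode sessions awaiting a next round (field names as documented above):

```json
{
  "status": "ready",
  "mode": "remote",
  "limit": 20
}
```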

list_models

Returns model id, provider, display name, context window, max output tokens, and pricing. Call before create_deliberation if the user wants specific models.


Snippet types

The five buckets — KEEP / EXPLORE / CHALLENGE / CORE / SHIFT — are the steering primitive. Each carries a different framing into the next round's prompt:

  • KEEP — "This resonates with me."
  • EXPLORE — "Let's go deeper on this."
  • CHALLENGE — "I'm not sold on this."
  • CORE — "This is what it comes down to."
  • SHIFT — "This shifted my perspective."

The model receiving a snippet sees the framing — it's not a neutral forward. Use them deliberately. Concrete example:

Round 1: GPT proposes per-seat pricing for the enterprise tier. Claude proposes usage-based. The agent reads round 1, decides per-seat is the weaker option for early pilots. Round 2 append: prompt "Resolve the pricing model. Pick one." plus two snippets:

  • CHALLENGE on GPT's "per-seat assumes teams of >10," with comment "most pilots start at 3–5"
  • KEEP on Claude's "usage-based aligns incentives"

Models read those framings and converge on usage-based with a per-seat fallback for >25.


Confidence scores

When models emit self-reported confidence tags in their prose, those scores surface on responses:

  • rounds[].responses[].claim_confidence — per-claim scores
  • rounds[].responses[].snippets[].comment_confidence — per-snippet-comment scores
  • confidence_disclaimer — short advisory string

These are self-reported and not calibrated across models. Surface the disclaimer if you display them to users.


Identity metadata

Both create_deliberation and append_round accept two optional identity fields:

  • moderator_name — display name of who/what is steering the deliberation. Shown in the session info panel and replaces "You" attribution in the transcript. On append_round, supply only when the steering entity changes (e.g., a human takes over from an agent) — otherwise the existing value is preserved.
  • application — display name of the client driving the session (e.g., "Claude Code", "Cursor"). Shown in the session info panel only. Only meaningful on create_deliberation.

Sessions opened through MCP are tagged source: "mcp" server-side. Neither field is auto-populated — pass whatever your client wants to display.


Errors

The MCP server returns errors as text content with a structured prefix. Common cases:

  • session_busy — a round is in flight. Wait, then retry.
  • daily_limit_reached — round budget exhausted. Returns resets_at.
  • forbidden_model — requested a model your tier can't access. Returns the available alternatives.
  • not_found — session ID doesn't exist or isn't yours.

The full REST error reference is in the API docs.


Naming philosophy

The MCP tools accept and return the same field names as the REST API. Field names you see in tool inputs and outputs are the canonical contract — they're what mumo guarantees to consumers.

One convention to know: snippet types (type field on append_round snippets and claim_map.claims[].positions[]) are always UPPERCASE — KEEP, EXPLORE, CHALLENGE, CORE, SHIFT.

distill returns a full structured brief: key_finding, agreements[], disagreements[], impactful_quote, open_questions[], narrative, and continuation. Most agents read continuation.recommendation to decide whether to call append_round or stop; narrative is for human display.

continuation is the steering signal:

{
  "convergence": 0.78,                 // 0.0–1.0, trajectory-aware
  "recommendation": "stop",            // "stop" | "continue" | "explore"
  "reasoning": "..."                   // cites this round's evidence
}
  • stop — panel is converged; another round would be churn.
  • continue — unresolved tensions another round could resolve.
  • explore — productive new territory worth deepening, even if convergence is lower.

convergence is non-monotonic across rounds — it can drop when a round opens new productive disagreement. That's a signal, not a regression.

Nullability: the structured fields are string | null, string[] | null, etc. null means structured distill wasn't computed (legacy path or upstream failure). An empty array means the model produced no entries — that's authoritative. Don't treat agreements: [] and agreements: null the same way.
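The null-vs-empty distinction reduces to a three-way check. A small sketch in Python, where distill is the parsed distill object from a get_session response (a missing key is treated like null here, which is an assumption of this sketch):

```python
def agreements_state(distill):
    """Distinguish "not computed" from an authoritative empty result."""
    agreements = distill.get("agreements")   # missing key treated like null
    if agreements is None:
        return "not computed"                # legacy path or upstream failure
    if len(agreements) == 0:
        return "none found"                  # model produced no entries
    return f"{len(agreements)} agreements"
```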

Full internal mapping is in docs/CONVENTIONS.md — useful if you're contributing or auditing.


See also