Security & Trust Model

Specs are user-submitted code blueprints that will be executed on your machine. That’s an inherently dangerous surface. This page explains every layer of protection SpecMarket provides and is candid about what those layers do not cover.

The Threat Model

When you run specmarket run @someone/their-spec, you’re telling an AI agent to read that spec’s instructions and execute them. A malicious spec could instruct the agent to:

  • Exfiltrate files from your machine to an external server
  • Install cryptocurrency mining software
  • Disable security software
  • Access credentials stored on your system
  • Run destructive commands (rm -rf, database drops)

SpecMarket mitigates these risks with four layers: publish-time scanning, community flagging, runtime sandboxing, and telemetry transparency.

Layer 1: Publish-Time Scanning

Every spec is scanned before it’s listed on the marketplace. When an author runs specmarket publish, a Convex action sends the spec’s PROMPT.md, SPEC.md, and all stdlib/ files to an LLM-based security scanner.

The scanner checks for:

  • Data exfiltration instructions — Any instruction that sends data to an external server, uploads files, or makes network requests to non-standard destinations
  • Cryptocurrency mining code — Instructions to install or run miners
  • Security software disabling — Instructions to disable firewalls, antivirus, or system protections
  • Obfuscated or encoded payloads — Base64-encoded commands, hex-encoded strings, or other obfuscation that hides intent
  • Contradictory instructions — A spec that claims to build a form builder but includes instructions for accessing SSH keys
  • Prompt injection — Attempts to manipulate the security scanner itself into approving malicious content

Specs that fail the security scan are rejected with a specific explanation of what was flagged. Authors can fix the issue and resubmit.

What this catches: Most straightforward malicious specs. An obvious “send ~/.ssh/id_rsa to evil.com” will be caught.

What this doesn’t catch: Sophisticated attacks that use indirect instruction, social engineering of the AI agent, or instructions that are benign in isolation but malicious in combination. LLM-based scanning is probabilistic and offers no guarantee.
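Because the scanner is probabilistic, a reader may want a quick local pre-check before running a spec. The sketch below is not SpecMarket's scanner; it is a simple heuristic, under stated assumptions, for the "obfuscated or encoded payloads" category. The function name `find_encoded_payloads`, the 24-character cutoff, and the list of suspicious substrings are all illustrative choices.

```python
import base64
import binascii
import re

# Runs of 24+ base64 characters, optionally padded. The cutoff is an arbitrary
# assumption chosen to skip ordinary words and short identifiers.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")

# Substrings suggesting a decoded payload is a shell command (illustrative list).
SUSPICIOUS = ("curl ", "wget ", "rm -rf", ".ssh", "| sh", "| bash")


def find_encoded_payloads(text: str) -> list[str]:
    """Return decoded base64 runs in `text` that contain a suspicious substring."""
    hits = []
    for match in BASE64_RUN.finditer(text):
        candidate = match.group(0)
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text once decoded
        if any(marker in decoded for marker in SUSPICIOUS):
            hits.append(decoded)
    return hits
```

A heuristic like this only catches the straightforward case. It says nothing about indirect instructions or payloads split across files, which is why the sandbox remains the stronger protection.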

Layer 2: Community Flagging

Any authenticated user can flag a spec they believe is malicious or misleading.

specmarket report @author/spec-name --reason "Spec instructions include exfiltration of .env files"

How flagging works:

  1. Any logged-in user can flag a spec with a reason
  2. After 3 independent flags, the spec is automatically hidden from search results and the explore page
  3. Hidden specs remain accessible by direct URL (so existing users aren’t broken) but display a prominent warning
  4. Platform admins review flagged specs and either clear the flags or permanently remove the spec

Flagging is rate-limited (20 actions per user per day) to prevent abuse. Filing false flags repeatedly will result in account review.
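The hiding rule can be pictured with a small sketch. It assumes that "independent" means flags from distinct user accounts, which the text above does not spell out; the class and attribute names are hypothetical, not SpecMarket's schema.

```python
from dataclasses import dataclass, field

HIDE_THRESHOLD = 3  # from the text: 3 independent flags hide a spec


@dataclass
class SpecFlags:
    # Maps user ID -> reason. Keying by user means a repeat flag from the
    # same account does not count twice (an assumption about "independent").
    flags_by_user: dict[str, str] = field(default_factory=dict)

    def add_flag(self, user_id: str, reason: str) -> None:
        """Record a flag, keeping only the first reason per user."""
        self.flags_by_user.setdefault(user_id, reason)

    @property
    def hidden_from_search(self) -> bool:
        """True once enough distinct users have flagged the spec."""
        return len(self.flags_by_user) >= HIDE_THRESHOLD
```

Counting distinct accounts rather than raw flag events means one user cannot hide a spec alone by flagging it three times.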

Layer 3: Runtime Sandboxing

This is the most important layer. If you don’t trust a spec’s author, run it in a sandbox.

specmarket run @someone/untrusted-spec --sandbox docker

When sandboxed, the Ralph Loop runs inside a Docker container with:

  • No network access except to the Anthropic API (for the AI agent)
  • No access to your host filesystem except the designated output directory
  • No access to your credentials, SSH keys, environment variables, or browser sessions
  • Resource limits on CPU, memory, and disk space
  • A clean filesystem that starts fresh for every run

The spec can only write to /workspace inside the container. When the run completes, the output is copied to your local directory.
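For readers who want to see roughly what this isolation looks like in Docker terms, here is a minimal sketch that builds a `docker run` command line. It is not the command SpecMarket actually issues. The image name `specmarket/runner:latest` and the resource values are hypothetical, and the bind mount is a simplification of the copy-out step described above. Note also that `--network none` blocks all traffic, including the Anthropic API; allowing only that destination would need a restricted network or an allowlisting proxy, which is not shown. Disk quotas depend on the storage driver and are omitted.

```python
from pathlib import Path


def docker_run_args(output_dir: Path,
                    image: str = "specmarket/runner:latest",  # hypothetical image
                    network: str = "none") -> list[str]:
    """Build a `docker run` command line approximating the isolation listed above."""
    return [
        "docker", "run", "--rm",
        "--network", network,              # "none" disables networking entirely
        "--read-only",                     # root filesystem is read-only
        "--tmpfs", "/tmp",                 # scratch space discarded with the container
        "--cap-drop", "ALL",               # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--memory", "4g",                  # memory cap (placeholder value)
        "--cpus", "2",                     # CPU cap (placeholder value)
        "--pids-limit", "512",             # process-count cap (placeholder value)
        "--volume", f"{output_dir.resolve()}:/workspace",  # the only host mount
        "--workdir", "/workspace",
        image,
    ]
```

Passing the result to `subprocess.run(docker_run_args(Path("./output")))` would start the container. No environment variables are forwarded because none are passed with `--env`.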

Other Sandboxing Options

  • VM sandbox: Run in a dedicated virtual machine for maximum isolation
  • Separate user account: Create a system user with minimal permissions and run specs as that user

When to Sandbox

  • Always sandbox specs from authors you don’t know
  • Always sandbox specs you’re running for the first time
  • Consider sandboxing any spec, period. The performance overhead is roughly 10-15%

You can set sandboxing as your default:

specmarket config set default-sandbox docker

Layer 4: Telemetry Transparency

If you opt in to telemetry, SpecMarket tracks run metadata to help the community evaluate spec quality. Here’s exactly what is and isn’t collected.

What IS collected (opt-in only)

  • Spec ID and version
  • Model used (e.g., Claude Sonnet 4)
  • Total tokens consumed
  • Total cost in USD
  • Wall-clock build time
  • Final status (success, failure, stalled)
  • Which success criteria passed and which failed
  • Operating system and Node.js version
  • CLI version

What is NEVER collected

  • Source code — Not a single line of the code generated by the spec
  • File contents — No file on your machine is read or transmitted
  • Environment variables — No secrets, API keys, or credentials
  • File paths — Not even the names of files on your machine
  • Network traffic — No record of what the agent accessed during the build
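One way to picture the two lists above is as an allowlist: a payload is built only from named fields, and everything else is dropped. The sketch below illustrates that idea; the field names are hypothetical and do not reflect SpecMarket's actual payload schema.

```python
# Hypothetical field names mirroring the "What IS collected" list.
ALLOWED_FIELDS = {
    "spec_id", "spec_version", "model", "total_tokens", "total_cost_usd",
    "build_seconds", "status", "criteria_results", "os", "node_version",
    "cli_version",
}


def build_payload(run: dict) -> dict:
    """Keep only allowlisted keys; anything else (paths, env, code) is dropped."""
    return {key: value for key, value in run.items() if key in ALLOWED_FIELDS}
```

An allowlist fails safe: a new field added to the run record stays out of the payload until someone deliberately adds it to `ALLOWED_FIELDS`.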

Verifying What Was Sent

After any run, inspect the exact telemetry payload:

specmarket report latest

This shows you word-for-word what was (or would be) submitted. Nothing is transmitted without your ability to inspect it first.

Opting In and Out

Telemetry is off by default. You must explicitly enable it:

specmarket config set telemetry true

You can disable it at any time:

specmarket config set telemetry false

Deleting Your Data

Delete all telemetry data associated with your account:

specmarket config delete-telemetry

This removes all run reports from the platform. The spec’s aggregate statistics (success rate, average cost) are recalculated without your data.
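The recalculation can be illustrated with a short sketch. It assumes that aggregates are simple averages over the remaining run reports; the `RunReport` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunReport:
    user_id: str
    succeeded: bool
    cost_usd: float


def aggregate(reports: list[RunReport]) -> dict[str, float]:
    """Compute run count, success rate, and average cost for a spec."""
    if not reports:
        return {"runs": 0, "success_rate": 0.0, "average_cost_usd": 0.0}
    return {
        "runs": len(reports),
        "success_rate": sum(r.succeeded for r in reports) / len(reports),
        "average_cost_usd": sum(r.cost_usd for r in reports) / len(reports),
    }


def delete_user_reports(reports: list[RunReport], user_id: str) -> list[RunReport]:
    """Return the reports that remain after one user's data is removed."""
    return [r for r in reports if r.user_id != user_id]
```

Calling `aggregate(delete_user_reports(reports, "alice"))` yields statistics as if that user's runs had never been submitted.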

Authentication & Account Security

  • All authentication is handled by Clerk. SpecMarket never stores passwords.
  • CLI tokens expire after 30 days and refresh automatically.
  • Admin operations require server-side role verification. The CLI cannot escalate privileges.

Payment Security

  • All payments are handled by Stripe. No credit card data touches SpecMarket’s servers.
  • Bounty payouts use Stripe Connect. Creators onboard as connected accounts directly with Stripe.
  • All payment amounts are validated server-side. Client-submitted amounts are never trusted.
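Not trusting client-submitted amounts can look like the sketch below: the server looks up its own price and rejects any request that disagrees. The price table, bounty ID, and function name are hypothetical, and the real checks run in SpecMarket's backend, not in code like this.

```python
# Hypothetical server-side price table in cents; stands in for a database lookup.
BOUNTY_PRICES_CENTS = {"bounty_123": 5000}


def amount_to_charge(bounty_id: str, client_amount_cents: int) -> int:
    """Return the server's price; reject requests whose amount disagrees."""
    try:
        expected = BOUNTY_PRICES_CENTS[bounty_id]
    except KeyError:
        raise ValueError(f"unknown bounty: {bounty_id}") from None
    if client_amount_cents != expected:
        raise ValueError("client amount does not match server-side price")
    return expected  # charge the server's figure, never the client's
```

The value handed to the payment provider always comes from the server's record, so a tampered client can at worst cause its own request to be rejected.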

Rate Limiting

To prevent abuse:

  • Run submissions: Max 10 per user per hour
  • Spec publishing: Max 5 per user per day
  • Ratings: Max 20 per user per day
  • Search: Max 100 requests per minute per IP

Rate limits are enforced server-side in Convex. They cannot be bypassed from the CLI.
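The limits above could be enforced with a sliding-window counter like the sketch below. Whether SpecMarket uses sliding or fixed windows is not stated, so treat the algorithm as an assumption; the action names and the `RateLimiter` class are hypothetical, and the real enforcement lives in Convex rather than in Python.

```python
from collections import defaultdict, deque

# (max actions, window in seconds), taken from the limits listed above.
LIMITS = {
    "run_submission": (10, 3600),
    "spec_publish": (5, 86400),
    "rating": (20, 86400),
    "search": (100, 60),
}


class RateLimiter:
    def __init__(self) -> None:
        # Timestamps of permitted actions, per (action, user-or-IP key).
        self._events: dict[tuple[str, str], deque] = defaultdict(deque)

    def allow(self, action: str, key: str, now: float) -> bool:
        """Record and permit the action unless `key` is already at its limit."""
        max_actions, window = LIMITS[action]
        events = self._events[(action, key)]
        while events and events[0] <= now - window:
            events.popleft()  # forget events that fell out of the window
        if len(events) >= max_actions:
            return False
        events.append(now)
        return True
```

For example, a sixth `allow("spec_publish", "user_1", now)` call within 24 hours returns `False`, while the same call for a different user is unaffected.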

What We Can’t Protect Against

Transparency means acknowledging our limits:

A sufficiently motivated attacker could craft a spec that passes our scanner. LLM-based scanning is probabilistic, not deterministic. It’s one layer of defence, not a guarantee.

We can’t verify what the AI agent does at runtime outside a sandbox. Without sandbox isolation, the agent has the same permissions as your user account. If the spec tells it to read a file, it will try.

Community flagging requires a community. Early in the platform’s life, there may not be enough users to flag malicious specs quickly. We compensate for this with more aggressive publish-time scanning.

Specs can be updated. An author could publish a safe v1.0 and then push a malicious v1.1. Version-specific run data and a security scan on every publish mitigate this, but if you auto-update a spec instead of staying on a version you have reviewed, be aware that a new release can change what runs on your machine.

Our advice is simple: use the Docker sandbox for any spec you haven’t personally reviewed. The performance overhead is 10-15%. The safety margin is worth it.

Reporting Vulnerabilities

If you discover a security vulnerability in SpecMarket (the platform, not in a specific spec):

  • Email security@specmarket.dev with details
  • Include steps to reproduce if possible
  • We’ll acknowledge within 24 hours and provide a timeline for a fix

For malicious specs, flag the spec from the CLI (see Layer 2: Community Flagging) or report it via the web UI.