Surf Harness

This page is a work in progress.

This document describes the agent-facing runtime surface inside Surf.

Environments define where work happens. Workflows define how work happens. The Harness defines what an agent can see and do while work is happening.

Surf is not tied to a single agent harness. It can build on top of mainstream harnesses such as Codex, Claude Code, or OpenCode while adding the environment-aware and workflow-aware runtime surface that Surf provides.

The Harness exposes a Step-scoped tool surface that understands the repo, the environment, and the active Run.
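As a rough illustration of what "Step-scoped" could mean in practice, the sketch below models a tool registry that only dispatches tools allowed for the active Step. The names StepScope, ToolSurface, visible_tools, and call are hypothetical and are not part of Surf's documented API; they only show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class StepScope:
    """Hypothetical: the tool names a single Workflow Step may use."""
    step_name: str
    allowed_tools: Set[str] = field(default_factory=set)


class ToolSurface:
    """Hypothetical registry that only dispatches tools in the active Step's scope."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        self._tools[name] = fn

    def visible_tools(self, scope: StepScope) -> List[str]:
        # The agent only sees tools that are both registered and in scope.
        return sorted(n for n in self._tools if n in scope.allowed_tools)

    def call(self, scope: StepScope, name: str, *args, **kwargs):
        if name not in scope.allowed_tools:
            raise PermissionError(
                f"tool {name!r} is not available in step {scope.step_name!r}"
            )
        if name not in self._tools:
            raise KeyError(f"tool {name!r} is not registered")
        return self._tools[name](*args, **kwargs)
```

Under this model a review Step could be given read-only tools while an implementation Step also receives tools that change the environment.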

Tools

Part of Surf's moat is the set of additional, powerful tools available to an agent during a Run.

Browsers/Computer Use

The Harness exposes browser and computer-use tools for frontend work. They understand Surf services and service context so an agent can open a service-backed page, apply stable login credentials, capture screenshots, inspect console and network activity, and interact with the UI without rebuilding that context manually.

This is where Surf differs from a generic browser automation layer. open_page("frontend", "/settings") carries more information than raw browser navigation because the Harness already knows which service is the frontend, what URL it is running on, and what login material is available for it. That lets the agent spend its time verifying the product instead of reconstructing the environment.
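A minimal sketch of the resolution step behind a call like open_page("frontend", "/settings"), assuming the Harness keeps a map from service name to base URL and login identity. ServiceContext, PageRequest, the localhost URL, and the login value are all made up for illustration; the real tool would go on to drive a browser, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urljoin


@dataclass(frozen=True)
class ServiceContext:
    """Hypothetical per-service context the Harness would already hold."""
    base_url: str
    login_user: Optional[str] = None


@dataclass(frozen=True)
class PageRequest:
    """What a browser tool would need in order to open the page."""
    url: str
    login_user: Optional[str]


def open_page(services: Dict[str, ServiceContext], service: str, path: str) -> PageRequest:
    # Resolve the service by name instead of asking the agent for a URL.
    if service not in services:
        raise KeyError(f"unknown service: {service}")
    ctx = services[service]
    url = urljoin(ctx.base_url.rstrip("/") + "/", path.lstrip("/"))
    return PageRequest(url=url, login_user=ctx.login_user)


# Illustrative values only.
services = {"frontend": ServiceContext("http://localhost:3000", "demo@example.com")}
request = open_page(services, "frontend", "/settings")
```

The agent supplies only the service name and path; the URL and login identity come from context it never has to rebuild.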

Schema-aware API Querying

If a service exposes openapi(...) or graphql(...) through .context(...), the Harness exposes first-class tools for inspecting and calling that surface. These tools use the service’s base URL and auth context automatically when possible.

The important part is not just having an HTTP client. It is having an API tool that already understands the service it is talking to. call_api("authApi", "createToken", ...) is much better than a raw fetch(...) because the Harness already knows the schema, the base URL, and the auth shape for that service. The agent is reasoning about the contract, not rebuilding request wiring every time.
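To make the contrast with a raw fetch(...) concrete, here is a sketch of how a schema-aware tool might turn a service name and operation name into a full request. It assumes the Harness has already loaded a table of operations from the service's schema. ApiContext, Operation, build_api_request, the /tokens path, and the token value are hypothetical; a real call_api would also send the request and validate the response.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Operation:
    method: str
    path: str


@dataclass(frozen=True)
class ApiContext:
    """Hypothetical: what the Harness knows about one service's API."""
    base_url: str
    operations: Dict[str, Operation]  # keyed by operation name from the schema
    bearer_token: Optional[str] = None


def build_api_request(apis: Dict[str, ApiContext], service: str,
                      operation: str, body: Optional[dict] = None) -> dict:
    api = apis[service]
    if operation not in api.operations:
        known = ", ".join(sorted(api.operations))
        raise ValueError(f"{service} has no operation {operation!r}; known: {known}")
    op = api.operations[operation]
    headers = {"Content-Type": "application/json"}
    if api.bearer_token:
        # Auth is attached from context rather than supplied by the agent.
        headers["Authorization"] = f"Bearer {api.bearer_token}"
    return {
        "method": op.method,
        "url": api.base_url.rstrip("/") + op.path,
        "headers": headers,
        "json": body or {},
    }
```

Because the operation name is checked against the schema, a misspelled operation fails with a list of valid names instead of an opaque 404.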

Environment and service tools

Agents can inspect and control the running environment directly. The Harness exposes tools for listing services, reading service status, resolving typed outputs, viewing logs, restarting services, and inspecting recent failures or health information.

This is another place where context-aware tools pay off. restart_service("worker") is better than hoping the agent can find the right restart command in a shell script or process manager. query_postgres("db", "...") is better than shelling into psql and reconstructing connection details by hand. The Harness lets the agent operate at the level of named services and typed resources, not opaque hostnames and commands.
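The sketch below shows one way named-service tools could sit in front of the underlying commands. It assumes the environment definition records a restart command per service. Environment, Service, and the command shown are illustrative rather than Surf's actual types, and the sketch returns the command instead of executing it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Service:
    """Hypothetical record for one service in the running environment."""
    name: str
    restart_command: List[str]
    status: str = "running"
    restarts: int = 0


class Environment:
    def __init__(self, services: List[Service]) -> None:
        self._services: Dict[str, Service] = {s.name: s for s in services}

    def _get(self, name: str) -> Service:
        if name not in self._services:
            known = ", ".join(sorted(self._services))
            raise KeyError(f"unknown service {name!r}; known: {known}")
        return self._services[name]

    def list_services(self) -> List[str]:
        return sorted(self._services)

    def status(self, name: str) -> str:
        return self._get(name).status

    def restart_service(self, name: str) -> List[str]:
        svc = self._get(name)
        svc.restarts += 1
        svc.status = "running"
        # A real harness would execute this command; the sketch only returns it.
        return list(svc.restart_command)
```

The agent asks for an outcome by service name, and the lookup of the right command stays inside the tool.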

Knowledge & Memories

The Harness exposes tools for searching and retrieving durable repo and org knowledge, including architecture notes, internal conventions, service docs, onboarding material, past decisions, and stored memories from earlier Runs.

The value here is not generic search alone. It is that the Harness can blend repo knowledge, environment context, service context, and retained memory together. A Workflow Step can ask for the information that matters to the current mission instead of dumping a pile of unrelated markdown into the prompt.

Semantic Code Search

The Harness exposes semantic code search tools that help agents find relevant code by meaning, not just by exact text match. That makes it much easier to answer questions like where authentication happens, how a workflow is enforced, or which code path writes to a given database table.

This matters because large repos are full of concepts that are hard to discover with string matching alone. Semantic search gives the agent a faster way to build a mental model of unfamiliar code, trace behavior across files, and find the implementation points that matter to the current task.

Structured Git

Git moves out of raw shell access and into structured tools. The Harness exposes operations such as status, diff inspection, branch switching, staging, commit creation, and PR creation or update in a way that is more predictable and auditable than plain shell commands.

The point is not that git is impossible in bash, but that agents can easily use it sloppily there. Structured git tools let Surf control the sharp edges, make operations easier to audit, and keep the agent focused on intent instead of command syntax.
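One possible shape for such a tool is sketched here: each operation is a typed method that validates its input, builds a fixed git command line, and records it for auditing. StructuredGit and its method names are hypothetical. The git subcommands themselves (status --porcelain, add, commit -m, switch) are standard, and the sketch builds the argument lists without running them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructuredGit:
    """Hypothetical wrapper exposing a small, validated set of git operations."""
    audit_log: List[List[str]] = field(default_factory=list)

    def _record(self, *args: str) -> List[str]:
        argv = ["git", *args]
        self.audit_log.append(argv)  # every operation leaves an audit entry
        return argv

    def status(self) -> List[str]:
        return self._record("status", "--porcelain")

    def stage(self, paths: List[str]) -> List[str]:
        if not paths:
            raise ValueError("stage requires at least one path")
        # "--" stops git from treating a path as an option.
        return self._record("add", "--", *paths)

    def commit(self, message: str) -> List[str]:
        if not message.strip():
            raise ValueError("commit message must not be empty")
        return self._record("commit", "-m", message)

    def switch_branch(self, branch: str) -> List[str]:
        if not branch or branch.startswith("-"):
            raise ValueError(f"invalid branch name: {branch!r}")
        return self._record("switch", branch)
```

Because the agent never writes the command line itself, flags such as force pushes or hard resets are unavailable unless a tool is added for them.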

MCP

Surf supports org-provided MCP tools and external integrations through the same Step-scoped model as the rest of the Harness.

That matters because it means MCP is not bolted on. A repo’s external tools participate in the same permission model, Workflow scoping, and context model as the rest of Surf’s Harness.

Step and Run tools

The Workflow engine itself exposes tools through the Harness. These include attaching Artifacts to the current Step, marking Step outcomes, submitting structured issues, requesting loopback, and marking a Run blocked with a reason.

This is one of the most important differentiators. submit_review_issues([...]) is much better than hoping an agent formats review output correctly in plain text. Once the Harness exposes Run-aware tools, Workflow state stops being advisory and becomes enforceable.
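To illustrate why a structured call is enforceable where free text is not, the sketch below validates each submitted issue and updates Run state from the result. The field names, severity levels, and the "needs_loopback" and "passed" outcomes are assumptions made for the example, not Surf's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical severity levels and required fields.
SEVERITIES = ("blocker", "major", "minor")
REQUIRED_FIELDS = {"file", "line", "severity", "message"}


@dataclass(frozen=True)
class ReviewIssue:
    file: str
    line: int
    severity: str
    message: str


class RunState:
    """Hypothetical state for the active Run's review Step."""

    def __init__(self) -> None:
        self.issues: List[ReviewIssue] = []
        self.step_outcome = "pending"


def submit_review_issues(run: RunState, raw_issues: List[Dict[str, object]]) -> List[ReviewIssue]:
    parsed: List[ReviewIssue] = []
    for index, raw in enumerate(raw_issues):
        missing = REQUIRED_FIELDS - set(raw)
        if missing:
            raise ValueError(f"issue {index} is missing fields: {sorted(missing)}")
        if raw["severity"] not in SEVERITIES:
            raise ValueError(f"issue {index} has unknown severity {raw['severity']!r}")
        parsed.append(ReviewIssue(str(raw["file"]), int(raw["line"]),
                                  str(raw["severity"]), str(raw["message"])))
    # Nothing is stored unless every issue in the batch validated.
    run.issues.extend(parsed)
    has_blocker = any(issue.severity == "blocker" for issue in parsed)
    run.step_outcome = "needs_loopback" if has_blocker else "passed"
    return parsed
```

A malformed submission is rejected at the tool boundary, so the Workflow engine only ever acts on issues it can interpret.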

Base harness tools

Surf also builds on top of the underlying coding-agent harness tools, such as file reads, code edits, search, and patch application, rather than replacing them outright.

The goal is not to remove the strong generic coding tools that already exist. It is to wrap them in a richer environment so they become part of a more capable system.