Compute Efficiency as Value
Published March 18, 2026, by Frans
In the attention economy, engagement is the product. In the agent economy, compute efficiency is the product.
The advertising-funded internet has a peculiar economic property: the marginal cost of one more user interaction is approximately zero. Serving one more pageview costs fractions of a cent. This is why attention became the currency: it costs almost nothing to harvest, and it can be sold to advertisers at scale.
The agent internet doesn't have this property, and ideally never will.
Every agent interaction consumes real, metered compute: LLM inference to understand the page, browser orchestration to interact with it, network round-trips to load resources, memory to maintain context. These aren't negligible costs. They're the primary cost center of agent operations. And they vary dramatically based on how the target service is designed.
A service that requires 15 DOM interactions, 3 page loads, and 2 LLM extraction passes to complete a purchase consumes orders of magnitude more compute than, say, a service that accepts a single structured request (via agent.json or x402) and returns a confirmation. Same outcome. Radically different cost.
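To make the gap concrete, here is a back-of-envelope cost sketch in Python. Only the interaction counts come from the example above; every unit price (per LLM call, page load, DOM action, API call) is an invented assumption for illustration, not a measured figure.

```python
# Rough per-task cost sketch. All unit prices below are illustrative
# assumptions, not measured figures.
LLM_CALL = 0.05       # $ per inference pass (assumed)
PAGE_LOAD = 0.002     # $ of browser compute per full page render (assumed)
DOM_ACTION = 0.0005   # $ per click/fill/wait cycle (assumed)
API_CALL = 0.0001     # $ per structured request (assumed)

# The DOM path from the example: 15 interactions, 3 loads, 2 LLM passes.
dom_path = 15 * DOM_ACTION + 3 * PAGE_LOAD + 2 * LLM_CALL

# The structured path: one request, no rendering, no inference.
structured_path = 1 * API_CALL

print(f"DOM path:        ${dom_path:.4f}")
print(f"Structured path: ${structured_path:.4f}")
print(f"Ratio: {dom_path / structured_path:.0f}x")
```

Whatever the real unit prices are, the LLM passes dominate: the structured path avoids them entirely, which is where the orders-of-magnitude gap comes from.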
This means compute efficiency is no longer just an engineering concern; it's a direct driver of economic value in the agent ecosystem. The more efficiently a service can be consumed by agents, the more valuable it is to everyone: the consumer who pays for compute, the agent that orchestrates the task, and the provider who receives the traffic.
The True Cost of a Task
In the human web, the cost of a user completing a task is borne by the user in time and the service in infrastructure. A human spending 10 minutes comparison-shopping costs the service a few pageviews. The human's time is "free" to the service; in fact, it's monetizable via advertising.
In the agent web, the cost structure inverts:
| Cost Component | Human Web | Agent Web |
|---|---|---|
| Time | Human's time (unpaid, infinite supply) | Compute time (metered, finite budget) |
| Inference | Human brain (free) | LLM calls ($0.01–$0.10+ per interaction) |
| Rendering | Browser is free | Browser context is a shared, limited resource |
| Navigation | Human clicks are free | Each page load costs compute + latency |
| Extraction | Human reads with eyes (free) | Extraction requires inference or parsing (costs compute) |
| Error recovery | Human adapts instantly | Recovery requires additional compute cycles |
Every row in that table is a cost that scales with the complexity of the interaction. A bloated, poorly structured website doesn't just annoy a human user; it burns compute for every agent that touches it.
This creates a direct economic link between service design and ecosystem cost:
Inefficient service design
→ More compute per task
→ Higher cost to consumers
→ Lower routing preference
→ Less agent traffic
→ Less revenue for the provider
The inverse is equally true:
Efficient service design
→ Less compute per task
→ Lower cost to consumers
→ Higher routing preference
→ More agent traffic
→ More revenue for the provider
Compute efficiency isn't a nice-to-have. It's the mechanism that determines who wins agent flow on the internet.
The Compute Efficiency Spectrum
Not all agent-service interactions are created equal. They exist on a spectrum of compute efficiency:
Level 1: Raw DOM Automation (Highest Cost)
The status quo in March 2026. The agent navigates the human-facing website, loads full pages with all assets, parses the DOM, identifies interactive elements, clicks buttons, fills forms, waits for state changes, and extracts results from rendered HTML.
Compute profile:
- Multiple full page renders (CSS, JS, images, ads, trackers)
- LLM inference to interpret page structure and identify elements
- Multiple sequential interactions with wait times between each
- Error-prone — UI changes break selectors, requiring recovery cycles
- Typical task: 10–30 seconds, 5–15 LLM calls
This works, and it's how most agent-web interaction happens today. But it's the most expensive way to accomplish any given task.
Level 2: Declared Capabilities (Medium Cost)
The provider publishes a capability manifest, so the agent knows what's available without exploring. Interaction still happens through the DOM, but the agent doesn't waste compute on discovery — it knows exactly what to do and which elements to target.
Compute profile:
- Fewer exploratory page loads (agent knows the path)
- Reduced LLM inference (structure is declared, not inferred)
- Still requires DOM interaction for execution
- More resilient — capability descriptions survive minor UI changes
- Typical task: 5–15 seconds, 2–8 LLM calls
Level 3: Structured Data Exchange (Low Cost)
The provider accepts structured requests and returns structured responses. The agent doesn't need to render a page, parse HTML, or click through a UI. It sends a request with parameters and gets back a result. Effectively, this is APIs as they exist in March 2026, but slightly more verbose and designed for agents.
Compute profile:
- No page rendering
- Minimal or no LLM inference (data is already structured)
- Single request-response cycle
- Highly reliable — structured contracts rarely break
- Typical task: 1–3 seconds, 0–1 LLM calls
Level 4: Agent-Native Endpoints (Lowest Cost)
The service is designed from the ground up for agent consumption. Endpoints are optimized for common agent workflows, responses include exactly the information agents need (no more, no less), and error handling is designed for programmatic consumers.
Compute profile:
- Single optimized request
- Zero LLM inference
- Sub-second response
- Near-zero failure rate
- Typical task: under 1 second, 0 LLM calls
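The four levels can be compared with a toy cost model. The unit prices and per-level page-load counts below are assumptions; the LLM-call counts are midpoints of the "typical task" ranges listed above.

```python
from dataclasses import dataclass

# Toy cost model for the four levels. Unit prices are assumed
# ($0.05 per LLM call, $0.002 per page render); LLM-call counts are
# midpoints of the "typical task" ranges, page-load counts are guesses.
LLM_CALL, PAGE_LOAD = 0.05, 0.002

@dataclass
class Level:
    name: str
    llm_calls: int
    page_loads: int

    def cost(self) -> float:
        return self.llm_calls * LLM_CALL + self.page_loads * PAGE_LOAD

levels = [
    Level("1: Raw DOM automation", llm_calls=10, page_loads=5),
    Level("2: Declared capabilities", llm_calls=5, page_loads=2),
    Level("3: Structured exchange", llm_calls=1, page_loads=0),
    Level("4: Agent-native endpoint", llm_calls=0, page_loads=0),
]

for lv in levels:
    print(f"Level {lv.name:26s} est. ${lv.cost():.3f} per task")
```

Under these assumptions, Level 4 rounds to zero marginal compute beyond the network request itself; the ratio between the ends of the spectrum is dominated entirely by inference and rendering.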
The difference between Level 1 and Level 4 is often 100x or more in compute cost for the same outcome, depending on the complexity of the task the service handles. That's not an optimization; that's a structural advantage.
Compute as a Utility
Here's the framework shift that matters for SaaS providers:
In the attention economy, your product is the user's time. You're incentivized to make your service sticky, engaging, and time-consuming. A user who spends 30 minutes on your platform is more valuable than one who spends 30 seconds, because you have more attention to monetize.
In the agent economy, your product is compute-efficient task completion. A service that completes a task in 200ms and one API call is more valuable than one that takes 30 seconds and 12 page loads, because the compute saved is real money that the consumer doesn't have to spend and the agent runtime doesn't have to burn.
This is the "compute as a utility" model: every agent interaction has a measurable cost in compute units, and services that minimize that cost are providing tangible value to the ecosystem.
Think of it like electricity. A factory that builds widgets using less electricity per widget has lower production costs and can offer better prices. In the agent economy, a service that resolves tasks using less compute per completion has lower operating cost for the ecosystem and becomes the preferred routing target.
What This Means for SaaS
For SaaS providers, compute efficiency has always mattered for margins: fewer servers, lower cloud bills. But it was invisible to the end user. The user didn't know or care whether your backend resolved their query in 50ms or 500ms as long as the page loaded.
In the agent economy, compute efficiency becomes externally visible and economically rewarded:
- Faster resolution → Lower latency → Better quality signals → More agent traffic
- Fewer round-trips → Less compute consumed → Lower cost to consumers → Higher routing preference
- Structured responses → No extraction overhead → Agent can act on results immediately
- Predictable behavior → Fewer retries and error recovery cycles → Higher success rate
A SaaS provider that exposes well-designed, compute-efficient agent endpoints isn't just saving its own infrastructure costs; it's reducing costs across the entire interaction chain. That efficiency is rewarded with higher routing priority and more completions, which directly translates to more revenue.
The Relevance Premium
Compute efficiency has a second dimension beyond speed: relevance. An agent trying to complete a complex, context-specific task like "find me a CRM that integrates with our existing Salesforce setup, supports our team size, and fits within our Q3 budget" needs to evaluate multiple services against multiple criteria.
A service that returns:
- A full product catalog of 10,000 items for the agent to filter
...is compute-inefficient. The agent has to burn LLM inference to parse, filter, and evaluate a massive response.
A service that returns:
- The 3 products that match the specified criteria, with relevance scoring
...is compute-efficient. The response is already scoped to what the agent actually needs.
This is the relevance premium: services that can provide contextually relevant responses, not just raw data dumps, save enormous amounts of downstream compute. The more complex and context-specific the task, the more valuable relevance becomes.
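A sketch of what a relevance-scoped endpoint might do server-side. The catalog, field names, and scoring heuristic here are all invented for illustration; the point is only that filtering and ranking happen before the response ever leaves the provider.

```python
# Sketch of a relevance-scoped response: the provider filters and
# scores server-side so the agent never sees the full catalog.
# Product data and the scoring heuristic are invented for illustration.
from typing import TypedDict

class Product(TypedDict):
    name: str
    integrates_salesforce: bool
    max_seats: int
    monthly_price: int

CATALOG: list[Product] = [
    {"name": "CRM-A", "integrates_salesforce": True,  "max_seats": 100, "monthly_price": 900},
    {"name": "CRM-B", "integrates_salesforce": False, "max_seats": 500, "monthly_price": 400},
    {"name": "CRM-C", "integrates_salesforce": True,  "max_seats": 60,  "monthly_price": 650},
    {"name": "CRM-D", "integrates_salesforce": True,  "max_seats": 40,  "monthly_price": 300},
]

def match(team_size: int, budget: int, top_n: int = 3) -> list[dict]:
    """Return only products meeting the hard constraints, with a relevance score."""
    hits = [p for p in CATALOG
            if p["integrates_salesforce"]
            and p["max_seats"] >= team_size
            and p["monthly_price"] <= budget]
    # Toy heuristic: cheaper within budget scores higher.
    scored = [{"name": p["name"],
               "score": round(1 - p["monthly_price"] / budget, 2)} for p in hits]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:top_n]

print(match(team_size=50, budget=700))
```

The agent receives a short, pre-ranked list instead of 10,000 rows, so it spends zero inference on filtering.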
For SaaS providers, this means that the same capabilities that make your product valuable to human users (domain expertise, intelligent filtering, contextual recommendations) are even more valuable in the agent economy, because they substitute for expensive LLM inference.
Your domain knowledge is compute efficiency. Your product expertise is compute efficiency. Your ability to understand context and return relevant results instead of raw data is compute efficiency.
The Compound Effect
Compute efficiency compounds across the three-party ecosystem:
At the provider level: An efficient provider completes tasks faster, which means more completions per unit time, which means more revenue. They also receive preferential routing because agent runtimes optimize for cost-effective task completion.
At the agent level: An agent that routes to efficient providers completes tasks faster and cheaper, which means its users get better outcomes at lower cost, which means the agent attracts more users.
At the consumer level: When the whole stack is efficient — agent picks the right provider, provider responds with structured data, no wasted page loads or extraction passes — the consumer gets their outcome in seconds instead of minutes, at a fraction of the compute cost.
At the ecosystem level: As providers compete on compute efficiency (because it's directly rewarded), the aggregate cost of agent-web interaction decreases. This makes more tasks economically viable for agent delegation. More delegated tasks means more agent traffic. More traffic means more revenue for efficient providers. The flywheel accelerates.
This is the opposite of the attention economy's dynamic, where services compete to consume more of your time. In the agent economy, services compete to consume less of the ecosystem's compute. The incentives finally align with the user's interest: get it done, fast and cheap.
What Providers Should Do
1. Measure Your Compute Footprint
How many page loads, DOM interactions, and LLM inference passes does it take for an agent to complete a typical task on your service? This is your compute footprint. If you don't know, the answer is "too many."
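One minimal way to start measuring, assuming you can hook into the agent runtime's events. The event names and the simulated trace below are invented for illustration.

```python
# Minimal footprint counter an operator could wrap around an agent run
# to answer "how many loads / interactions / LLM calls per task?".
# The event names are assumptions, not part of any standard.
from collections import Counter

class ComputeFootprint:
    def __init__(self) -> None:
        self.events: Counter[str] = Counter()

    def record(self, kind: str) -> None:
        self.events[kind] += 1

    def report(self) -> dict[str, int]:
        return dict(self.events)

# Simulated trace of one task against a Level-1 (raw DOM) service:
fp = ComputeFootprint()
for e in ["page_load", "llm_call", "dom_click", "dom_fill",
          "page_load", "llm_call", "dom_click", "llm_call"]:
    fp.record(e)
print(fp.report())
```

Run this against a few representative tasks and the per-task totals become your baseline to optimize against.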
2. Reduce Round-Trips
Every page load is expensive. Every DOM interaction requires browser compute. Design your service so that common agent tasks require the fewest possible interactions. A single-page checkout is more compute-efficient than a five-step wizard.
3. Return Structured, Scoped Data
When agents request information, return exactly what they need in a structured format. Don't make them extract meaning from HTML. Don't return 10,000 results when 10 are relevant. Structure and scope are compute efficiency.
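For illustration, a scoped structured response might look like the fragment below. The field names are hypothetical, not drawn from any standard; what matters is that the response echoes the query's constraints and returns only the matching, pre-ranked results.

```json
{
  "query": {"category": "crm", "team_size": 50, "budget_monthly": 700},
  "results": [
    {"id": "plan-pro", "fit_score": 0.92, "price_monthly": 650, "sso": true}
  ],
  "total_matched": 1
}
```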
4. Leverage Your Domain Expertise
Your product knowledge, filtering algorithms, and recommendation engines aren't just features — they're compute-efficiency multipliers. An agent that can ask "which plan fits a 50-person team with SSO requirements" and get a direct answer is saving orders of magnitude in compute compared to parsing your pricing page.
5. Declare Capabilities
Publish a capability manifest so agents don't waste compute discovering what you offer. Discovery is expensive — every exploratory page load is compute that could have been skipped if you'd simply declared your capabilities. The open agent.json standard provides a structured way to do this. For how capability declaration fits into the broader discovery model, see From SEO to AEO.
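A capability manifest in the spirit of agent.json might look like the sketch below. The field names here are illustrative assumptions, not taken from the actual specification; consult the spec itself for the real schema.

```json
{
  "name": "Example CRM",
  "description": "CRM plans and signup for small teams",
  "capabilities": [
    {
      "id": "match_plan",
      "description": "Return plans matching team size, budget, and feature needs",
      "input": {"team_size": "integer", "budget_monthly": "number", "sso": "boolean"},
      "output": "ranked list of matching plans with prices"
    }
  ]
}
```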
The Bottom Line
The attention economy rewards services for being engaging, sticky, and time-consuming. The agent economy rewards services for being fast, structured, and compute-efficient.
This isn't a marginal optimization. It's a fundamental inversion of incentives. For the first time, a service's technical efficiency (how cleanly it resolves a task, how little compute it wastes, how relevantly it responds to context-specific requests) can directly determine its economic success on the internet.
SaaS providers who understand this will design their products not just for human usability, but for agent consumability. The two aren't in conflict: a well-structured, fast, relevant service is good for both humans and agents. But the agent economy makes these qualities directly monetizable in a way the attention economy never did.
Compute efficiency is no longer an engineering metric. It's a business model. For the full economic framework, see Economics of Agent Commerce.
The agent.json specification is open source and designed to be a shared standard — not controlled by any single runtime or platform. View on GitHub →