Perplexity Search as Code: Agents Write Search — 85% Fewer Tokens

Abhishek Gautam · 11 min read

Quick summary

Perplexity replaced fixed search API calls with sandboxed Python pipelines that agents write themselves, cutting token use from 288.7K to 42.9K on a CVE hunt benchmark.

Perplexity launched Search as Code (SaC) in early June 2026 — an architecture where AI agents write Python search pipelines in a sandbox instead of chaining fixed API tool calls. On a 200 high-severity CVE case study, Perplexity reports 100% accuracy at 42,900 tokens vs 288,700 tokens for its old pipeline — an ~85% reduction — while non-Perplexity systems scored below 25%.

It is live in the Agent API and default in Perplexity Computer.

How Search as Code Works

Traditional agent search:

  1. Model calls a search tool
  2. Raw results flood the context window
  3. Model filters in token space — expensive and error-prone

SaC flips the loop:

  1. Model generates Python using Perplexity's search SDK primitives (retrieve, filter, dedupe, rerank)
  2. Code runs in a secure sandbox against Perplexity's search backend
  3. Only final structured results return to the model context
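The flipped loop can be sketched in a few lines of Python. This is a minimal illustration, not Perplexity's SDK: the `Hit` type, the stubbed index, and the function names `retrieve`, `filter_hits`, `dedupe`, `rerank`, and `run_pipeline` are all hypothetical stand-ins named after the primitives listed above. The point it shows is that intermediate results stay inside the code, and only a compact structured list would be handed back to the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    url: str
    title: str
    score: float


# Stubbed index standing in for a real search backend (illustrative data only).
_FAKE_INDEX = [
    Hit("https://vendor-a.example/advisory/1", "Vendor A advisory for CVE-0000-0001", 0.91),
    Hit("https://vendor-a.example/advisory/1", "Vendor A advisory for CVE-0000-0001", 0.91),
    Hit("https://blog.example/roundup", "Weekly security roundup", 0.40),
    Hit("https://vendor-b.example/advisory/2", "Vendor B advisory for CVE-0000-0002", 0.77),
]


def retrieve(query: str) -> list[Hit]:
    """Hypothetical retrieval primitive: substring match against the stub index."""
    needle = query.lower()
    return [hit for hit in _FAKE_INDEX if needle in hit.title.lower()]


def filter_hits(hits: list[Hit], min_score: float) -> list[Hit]:
    """Drop hits below a relevance threshold."""
    return [hit for hit in hits if hit.score >= min_score]


def dedupe(hits: list[Hit]) -> list[Hit]:
    """Keep the first hit per URL, preserving order."""
    seen: set[str] = set()
    unique = []
    for hit in hits:
        if hit.url not in seen:
            seen.add(hit.url)
            unique.append(hit)
    return unique


def rerank(hits: list[Hit], top_k: int) -> list[Hit]:
    """Sort by score, highest first, and keep the top_k."""
    return sorted(hits, key=lambda hit: hit.score, reverse=True)[:top_k]


def run_pipeline(query: str, min_score: float = 0.5, top_k: int = 5) -> list[dict]:
    """Run every step in code; return only the compact final records."""
    hits = rerank(dedupe(filter_hits(retrieve(query), min_score)), top_k)
    return [{"url": hit.url, "title": hit.title} for hit in hits]


results = run_pipeline("advisory")
```

Here the duplicate advisory and the unrelated roundup never reach the model; `results` holds two records instead of four raw hits.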

Perplexity's research article "Rethinking Search as Code Generation" (June 2026) argues function calling and MCP force serial round trips that bloat prompts with intermediate junk.

Benchmark Claims (Company-Reported — Verify in Prod)

Metric | SaC (Perplexity) | Prior pipeline / rivals
CVE vendor advisory task accuracy | 100% | <25% (non-Perplexity systems cited)
Tokens on same task | 42.9K | 288.7K (~85% drop)
DeepSearchQA score | 0.871 | Anthropic managed agents 0.815 (per Perplexity)
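For readers who want to check the headline figure, the reduction follows directly from the two token counts Perplexity reports. The variable names below are just for illustration.

```python
# Token counts as reported by Perplexity for the CVE case study.
old_tokens = 288_700
sac_tokens = 42_900

# Fractional reduction relative to the old pipeline.
reduction = (old_tokens - sac_tokens) / old_tokens

# How many times larger the old pipeline's token use was.
ratio = old_tokens / sac_tokens

print(f"reduction: {reduction:.1%}")
print(f"old pipeline used {ratio:.2f}x the tokens")
```

So "~85%" is a fair rounding of the reported numbers, equivalent to the old pipeline using a little under seven times as many tokens.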

WANDR, a new in-house benchmark, ships in coming weeks per company blog posts.

Our Analysis: FinOps Lesson for Every Agent Builder

This lands the same week GitHub Copilot switched to token billing and Sam Altman said enterprise AI budgets exploded.

1. Filter in code, not in prompts

If your agent reads 500 search snippets into Claude/GPT context, you pay for 500 snippets on every retry. SaC's lesson: push dedupe and rank steps into deterministic Python, the same philosophy as running SQL filters before the LLM in a RAG pipeline.
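A back-of-envelope calculation shows why retries make raw snippets expensive. Only the 500-snippet count comes from the text above; the 120 tokens per snippet, the 3 attempts, and the top-10 cutoff are assumptions picked for illustration.

```python
def context_tokens(snippets: int, tokens_per_snippet: int, attempts: int) -> int:
    """Total snippet tokens sent to the model across all attempts."""
    return snippets * tokens_per_snippet * attempts


TOKENS_PER_SNIPPET = 120  # assumed average, not a measured value
ATTEMPTS = 3              # assumed: one initial call plus two retries

# Every raw snippet is re-sent on each attempt.
raw_cost = context_tokens(500, TOKENS_PER_SNIPPET, ATTEMPTS)

# Code filters to the top 10 first, so only those reach the model.
filtered_cost = context_tokens(10, TOKENS_PER_SNIPPET, ATTEMPTS)

print(raw_cost, filtered_cost)
```

Under these assumptions the raw approach sends 180,000 snippet tokens against 3,600 for the filtered one. The real gap depends on snippet length and retry rate, so measure both in your own pipeline.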

2. SDK primitives > monolithic tools

Expose composable retrieval functions so the model writes vendor-specific CVE query templates once, then fans out parallel queries across them. This is the pattern in Perplexity's own CVE example.
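The template-then-fan-out pattern might look like the sketch below. It assumes a hypothetical `search` primitive (stubbed here so the example runs offline), made-up vendor domains, and placeholder CVE IDs; none of these come from Perplexity's SDK. Parallelism uses the standard library's `ThreadPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Vendor-specific query templates, written once (domains are placeholders).
TEMPLATES = {
    "vendor-a": "site:vendor-a.example {cve} security advisory",
    "vendor-b": "site:vendor-b.example {cve} bulletin",
}


def search(query: str) -> list[str]:
    """Hypothetical retrieval primitive; stubbed instead of calling a backend."""
    return [f"result for: {query}"]


def build_queries(cve_ids: list[str], templates: dict[str, str]) -> list[tuple[str, str, str]]:
    """Expand every (CVE, vendor) pair into a concrete query string."""
    return [
        (cve, vendor, template.format(cve=cve))
        for cve, (vendor, template) in product(cve_ids, templates.items())
    ]


def fan_out(cve_ids: list[str], templates: dict[str, str], max_workers: int = 8) -> dict:
    """Run all queries in parallel and key the results by (CVE, vendor)."""
    jobs = build_queries(cve_ids, templates)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input queries.
        results = list(pool.map(search, [query for _, _, query in jobs]))
    return {(cve, vendor): hits for (cve, vendor, _), hits in zip(jobs, results)}


advisories = fan_out(["CVE-0000-0001", "CVE-0000-0002"], TEMPLATES)
```

Two CVEs times two vendor templates yields four queries issued in one pass, rather than four serial tool-call round trips through the model.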

3. Skepticism budget

All benchmarks are Perplexity-reported until third parties reproduce them. Treat the 85% figure as directional, though it points the same way as the pain around Uber's token caps.

4. Python-only runtime (for now)

Enterprise teams running TypeScript agents will need wrappers or must wait for SDK ports, so factor that into stack choices.

5. GEO + citation play

Perplexity traffic already hits abhs.in via AI referrals. Posts with structured CVE numbers and FAQ blocks are exactly what SaC-style agents hunt — double down on Key Takeaways + definition-first H2s.

Track live costs: LLM API Pricing.

Key Takeaways

  • June 2026: Perplexity launches Search as Code, where agents write Python search pipelines in a sandbox
  • CVE case study: 100% accuracy, 42.9K vs 288.7K tokens (~85% savings), company-reported
  • Live in: Agent API, and the default in Perplexity Computer
  • Problem solved: MCP/function-calling bloat and serial tool round trips
  • For developers: move filter/rank out of LLM context into code; design composable search SDKs
  • What to watch: WANDR benchmark release, independent replication, SDK beyond Python

Frequently Asked Questions

What is Perplexity Search as Code?

Search as Code is a Perplexity architecture announced in June 2026 where AI agents write Python scripts that define custom search workflows executed in a secure sandbox, instead of calling a fixed search API and stuffing raw results into the model context window.

How much does Search as Code reduce token usage?

On a Perplexity case study tracking 200 high-severity CVEs with vendor-specific advisories, the company reported about 42,900 tokens with Search as Code versus 288,700 tokens for its standard pipeline, roughly an 85 percent reduction, alongside 100 percent accuracy on that task.

Where is Perplexity Search as Code available?

Perplexity rolled out Search as Code in the Perplexity Agent API and made it the default architecture in Perplexity Computer as of early June 2026. The SDK runtime is Python-only initially.

Why does Search as Code matter for developer agent costs?

It demonstrates that filtering, deduplication, and ranking in deterministic code instead of LLM context can dramatically cut token bills — a pattern teams should copy as Copilot and frontier APIs shift to usage-based pricing.

How does Search as Code compare to MCP and function calling?

Perplexity argues traditional function calling and MCP force serial tool calls that pollute context with intermediate results. Search as Code lets one model turn compose thousands of retrieval operations in Python before returning a compact final answer.


Written by Abhishek Gautam

Software Engineer based in Delhi, India. Writes about AI models, semiconductor supply chains, and tech geopolitics — covering the intersection of infrastructure and global events. 836+ posts cited by ChatGPT, Perplexity, and Gemini. Read in 164 countries.