Benchmarks
Two benchmarks measuring the efficiency of using procurement.txt versus traditional web scraping. Both benchmarks task an agent with the same procurement workflow: find M10 hex bolts, get a quote for 500 units, place an order if the price is under $0.45/unit, and retrieve tracking information. Both approaches complete the task — procurement.txt does it faster, cheaper, and with less data.
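The workflow's one decision point (order only if the quote comes in under $0.45/unit) can be sketched as a small helper. This is a hypothetical illustration of the task's decision rule, not code from the benchmark harness; the function name and signature are assumptions.

```python
# Decision rule from the task description: order 500 units only if the
# quoted price works out to under $0.45 per unit. (Hypothetical helper,
# not the benchmark's actual agent code.)
MAX_UNIT_PRICE = 0.45  # USD per unit, from the task description
QUANTITY = 500         # units to quote and order

def should_place_order(quote_total: float, quantity: int = QUANTITY) -> bool:
    """Return True if the quoted total is under the per-unit threshold."""
    return (quote_total / quantity) < MAX_UNIT_PRICE

# A $210 quote for 500 units is $0.42/unit, so the agent orders;
# a $230 quote is $0.46/unit, so it stops and reports instead.
```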
Benchmark 1: Structured Agent
A deterministic, scripted Python agent (no LLM) running against a local mock merchant server. Two variants: a scraping agent that navigates HTML pages, and a procurement.txt agent that fetches /procurement.txt, discovers the OpenAPI spec, and uses JSON APIs. Both complete the full workflow.
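The discovery step can be sketched as a small parser. The `Key: value` line format, the field names, and the sample paths below are assumptions for illustration; the actual /procurement.txt format is defined by the spec, not shown here.

```python
# Sketch of the procurement.txt discovery step, assuming a simple
# "Key: value" line format. (The real file format and field names are
# defined by the procurement.txt spec; these are placeholders.)
def parse_procurement_txt(text: str) -> dict:
    """Parse 'Key: value' lines into a dict, skipping comments and blanks."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        if value:
            fields[key.strip().lower()] = value.strip()
    return fields

sample = """\
# procurement.txt (hypothetical example)
OpenAPI: /api/openapi.json
Catalog: /api/catalog
Escalation: live-chat, support@example.com
"""
# The agent reads the OpenAPI URL, then switches to JSON API calls.
spec_url = parse_procurement_txt(sample)["openapi"]
```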
Task completion time
Without
0.051s
With procurement.txt
0.024s
~2x faster
Data transferred
Without
~140 KB
With procurement.txt
~15 KB
~10x less bandwidth
HTTP requests
Without
7 requests
With procurement.txt
8 requests
Similar request count
Response payload
Without
HTML pages
With procurement.txt
JSON responses
Structured data, no parsing overhead
Run-by-run results
| Run | Scraping agent | procurement.txt agent | Notes |
|---|---|---|---|
| 1 | 0.051s, 7 reqs, ~140 KB | 0.025s, 7 reqs, ~15 KB | procurement.txt agent needed 1 retry to narrow catalog search |
| 2 | 0.051s, 7 reqs, ~140 KB | 0.025s, 8 reqs, ~15 KB | Both completed full workflow |
| 3-6 | ~0.051s, 7 reqs, ~140 KB | ~0.024s, 8 reqs, ~15 KB | Consistent results across all runs |
Six runs with each agent. The procurement.txt agent uses slightly more HTTP requests on most runs (8 vs 7) due to catalog search pagination, but transfers ~10x less data overall.
Key observation
Both agents found human escalation channels, but through different paths. The scraping agent found phone and email from the HTML footer. The procurement.txt agent found live-chat and email from the structured Escalation field — a richer, more machine-parseable result.
Benchmark 2: LLM Agent
A Claude Sonnet agent (claude-sonnet-4-20250514) benchmarked on the same procurement task with real API calls, token consumption, and cost tracking. Two conditions: one system prompt instructing the agent to browse and scrape HTML, and another instructing it to check for /procurement.txt first. Both approaches complete the task, but the efficiency difference is significant.
Average cost per run
Without
$1.23
With procurement.txt
$0.20
~6x cheaper
Average tokens consumed
Without
~341K
With procurement.txt
~58K
~5.8x fewer tokens
Average elapsed time
Without
~713s (~12 min)
With procurement.txt
~213s (~3.5 min)
~3.3x faster
Average tool result data
Without
~120 KB
With procurement.txt
~21 KB
~5.7x less data processed
Run-by-run results
| Run | Without procurement.txt | With procurement.txt |
|---|---|---|
| 1 | 4 turns · 53K tokens · $0.21 · 62s | 11 turns · 64K tokens · $0.22 · 92s |
| 2 | 12 turns · 295K tokens · $1.10 · 460s | 10 turns · 57K tokens · $0.19 · 161s |
| 3 | 16 turns · 499K tokens · $1.82 · 945s | 10 turns · 55K tokens · $0.19 · 371s |
| 4 | 12 turns · 518K tokens · $1.78 · 1383s | 10 turns · 57K tokens · $0.19 · 227s |
Four runs per condition. Without procurement.txt, the worst run consumed 518K tokens ($1.78) and took roughly 23 minutes to complete the same task.
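The headline averages can be recomputed from the run-by-run table as a consistency check. Values are transcribed from the table above; the tuple layout is an arbitrary choice for this sketch.

```python
# Per-run figures from the table above: (tokens in thousands, cost in USD,
# elapsed seconds). Transcribed from this page for a consistency check.
runs_without = [
    (53, 0.21, 62),
    (295, 1.10, 460),
    (499, 1.82, 945),
    (518, 1.78, 1383),
]
runs_with = [
    (64, 0.22, 92),
    (57, 0.19, 161),
    (55, 0.19, 371),
    (57, 0.19, 227),
]

def column_averages(runs):
    """Average each column across runs."""
    n = len(runs)
    return tuple(sum(run[i] for run in runs) / n for i in range(3))

avg_without = column_averages(runs_without)  # about 341K tokens, $1.23, 713s
avg_with = column_averages(runs_with)        # about 58K tokens, $0.20, 213s
```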
Why scraping is less efficient
Without procurement.txt, the LLM agent receives large HTML pages (18–28 KB each) containing navigation, styling, and other content irrelevant to the procurement task. The agent must parse these pages, extract form fields, and reason about page structure — all of which consumes tokens. With procurement.txt, the agent works with compact JSON API responses (~21 KB total vs ~120 KB), spending tokens on the actual task rather than on interpreting page layout.
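A back-of-envelope estimate makes the token overhead concrete. This uses the rough ~4-characters-per-token heuristic for English text and JSON; it is an approximation, not a tokenizer, and actual counts depend on the model.

```python
# Rough bytes-to-tokens estimate using the common ~4 chars/token
# heuristic (an approximation; real counts depend on the tokenizer).
CHARS_PER_TOKEN = 4

def approx_tokens(num_bytes: int) -> int:
    return num_bytes // CHARS_PER_TOKEN

scraping_overhead = approx_tokens(120_000)  # roughly 30K tokens of HTML
structured_overhead = approx_tokens(21_000) # roughly 5K tokens of JSON
```

By this estimate, reading the raw payloads alone accounts for tens of thousands of the tokens the scraping agent burns before any reasoning about the task.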
Tool usage patterns
The procurement.txt path uses fewer, simpler tools — no HTML parsing or form extraction needed.
Summary
Across both benchmarks, agents using procurement.txt consumed significantly less data (~10x less bandwidth in the structured test, ~5.7x less tool result data in the LLM test) and completed tasks faster (~2x in the structured test, ~3.3x in the LLM test).
The LLM benchmark showed the most dramatic efficiency gains: a 6x cost reduction ($0.20 vs $1.23 average per run) and 5.8x fewer tokens consumed. The scraping approach works, but it forces the agent to spend most of its time and budget parsing HTML rather than executing the procurement workflow. Providing structured, machine-readable metadata lets agents focus on the task itself.