The Bottleneck Is Verification
If you publish under your name, the slowest part of the work is making sure everything you say is actually true.
Reading source pages. Cross-checking what they actually say against what you remember them saying. Tracking down whether a statistic exists in the report it gets attributed to. Confirming that the version of a tool you are describing is the version that exists right now, rather than the version your training data remembers from eight months ago.
That work used to take us hours per article, sometimes a full day, before we would let a piece go live. The friction sat in the verification layer underneath the draft. The part the reader sees is the article. The part they trust is the verification work they assume happened underneath it.
This is the part of publishing that AI-assisted drafting has made harder. The writing layer compresses from days to hours. The verification layer expands, because every confident-sounding sentence the model produces now needs an explicit source check before it ships.
We have been running Firecrawl as the extraction engine inside our research and validation workflow for three months. It is wired into an agent skill that runs before every article goes live. This piece walks through what changed in our workflow when we adopted it, what it actually costs us, where we have hit limits, and what we would suggest if you are working on the same problem.
What the Workflow Used to Look Like
Before Firecrawl, our research and verification process was a mix of three things:
Manual reading. One of us would open every cited source in a tab, read the relevant section, and copy the supporting quote into a working document. For an article with twelve sources, that was three to four hours of work.
Custom scrapers. For repeat tasks like checking pricing pages or release notes, we wrote one-off Python scripts. Each one needed proxy handling, retry logic, JavaScript rendering for SPAs, and HTML cleaning. Each one broke the moment the target site updated.
Search-then-paste. When we needed quick fact-checks, we would run a Google search, open the top three results, skim, and paste relevant text into the article. This was the fastest path and also the lowest-fidelity. Hallucinations slipped through this layer regularly.
The work was inconsistent because the tooling was inconsistent. The articles we shipped felt verified because we had done the reading, but we could not prove it: proof required a reliable system that would catch claims drifting from their sources, and we did not have one.
What We Tried Before Settling on Firecrawl
We evaluated a shortlist before arriving at Firecrawl:
Beautiful Soup + Playwright. Fine for one-off scrapes. Painful as soon as the target uses JavaScript heavily or rotates anti-bot challenges. We were maintaining proxy infrastructure within a week.
Apify. Solid platform, especially for crawl-heavy work. Pricing got expensive fast for our use pattern. Setup overhead was higher than we wanted for the verification workflow specifically.
ScrapingBee. Closer to what we wanted in terms of zero-config rendering. The output was raw HTML, which meant we were burning LLM tokens converting it to markdown for the agent to process.
Tavily. Excellent for search-first workflows where you want ranked, summarised results. Less suited to the case where we knew exactly which URL we needed to extract from.
Firecrawl won on a specific combination: native markdown output, zero infrastructure overhead, and an MCP integration that let us call it natively from inside the agent skills we were already building. The token-economics alone made the case. Cleaner output meant smaller prompts, which meant faster validation runs at lower cost.
How We Use It
Firecrawl runs inside a single skill we call deep-research, defined at .claude/skills/deep-research/SKILL.md. The skill is a generic five-phase research protocol that covers any subject we need to research: a framework, a tool, a platform, a market, a competitor.
The shape of the skill, abbreviated:
Phase 1 — Define the research target
- Subject, output format, decision it informs, scope, freshness need

Phase 2 — SEO intent mapping (search first)
- Run search queries via Firecrawl before crawling
- Map what already ranks, what angles exist, where the gaps are

Phase 3 — Deep content crawl
- Crawl the primary sources identified in Phase 2
- Extract clean markdown for each source

Phase 4 — Synthesis
- Cross-reference sources, build the comparison or article structure
- Flag claims that any single source cannot support

Phase 5 — Output
- Write to articleSources[] for articles, or a directory entry, or PRD inputs

The agent invokes Firecrawl automatically inside Phases 2 and 3. Search results from Phase 2 feed the URLs that get crawled in Phase 3. The output of Phase 3 feeds Phase 4 synthesis. The agent calls the skill, the skill calls Firecrawl, and the structured output flows into Notion as the staging layer before publication.
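The Phase 2 step is a single call to Firecrawl's search endpoint. A minimal sketch against the v1 REST API, assuming FIRECRAWL_API_KEY is set in the environment; the default result count is our choice, not Firecrawl's:

```ts
// Phase 2 sketch: run a search query through Firecrawl's v1 search endpoint
// and collect the URLs that Phase 3 will crawl.
async function searchForSources(query: string, limit = 5): Promise<string[]> {
  const res = await fetch("https://api.firecrawl.dev/v1/search", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query, limit }),
  });
  if (!res.ok) throw new Error(`Firecrawl search failed: ${res.status}`);
  const payload = await res.json();
  // Each result carries url/title/description; only the URLs feed Phase 3.
  return payload.data.map((r: { url: string }) => r.url);
}
```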
The reason this works is that Firecrawl returns markdown that the next agent step can process directly. Raw HTML would require a normalisation pass that adds latency and cost. The clean output is what makes the chain runnable end-to-end.
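The Phase 3 extraction step is correspondingly small. A sketch under the same assumptions, requesting markdown as the only format so the payload stays lean for the agent step:

```ts
// Phase 3 sketch: extract one source as clean markdown via the v1 scrape
// endpoint. Response follows the documented { success, data: { markdown } }
// envelope; error handling here is ours.
async function scrapeToMarkdown(url: string): Promise<string> {
  const res = await fetch("https://api.firecrawl.dev/v1/scrape", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url, formats: ["markdown"] }),
  });
  if (!res.ok) throw new Error(`Firecrawl scrape failed: ${res.status}`);
  const payload = await res.json();
  return payload.data.markdown as string;
}
```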
A Real Example: The Hyperautomation Fact-Check
Last month we published a deep-dive on whether hyperautomation is dead. The draft contained ten specific factual claims: Gartner market projections, UiPath financial figures, G2 survey statistics, Celonis pricing data.
We ran every claim through the deep-research skill before publish. Firecrawl pulled the original source for each claim and the agent compared what we had written against what the source actually said.
Six claims verified cleanly. Four flagged for review. One specific 78% figure, which we had attributed to a named industry report, did not appear in that report at all. The number existed in our draft. It did not exist in the cited source. We removed the claim before publishing.
10 claims checked, 6 verified, 4 flagged, 1 fabricated statistic caught
That single check was the difference between publishing under our name with confidence and publishing something we could not defend. The article moved from draft to publish-ready in one validation session, with every remaining claim traceable to a source we had actually verified.
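The per-claim loop behind that session is simple to sketch. Everything here is illustrative: `scrapeToMarkdown` is the helper shown earlier, and `judge` stands in for whatever agent call you already have for "does this source support this claim?":

```ts
// Sketch of the claim-verification loop. The names are ours, not Firecrawl's.
interface Claim {
  text: string;       // the sentence as written in the draft
  sourceUrl: string;  // where the draft says it comes from
}

type Verdict = "verified" | "flagged";

async function checkClaims(
  claims: Claim[],
  judge: (claim: string, source: string) => Promise<Verdict>,
): Promise<Array<Claim & { verdict: Verdict }>> {
  const results: Array<Claim & { verdict: Verdict }> = [];
  for (const claim of claims) {
    // Pull what the cited source actually says, then let the agent compare.
    const source = await scrapeToMarkdown(claim.sourceUrl);
    results.push({ ...claim, verdict: await judge(claim.text, source) });
  }
  // Anything the agent cannot ground gets flagged for human review.
  return results;
}
```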
The Numbers
Firecrawl publishes a set of platform-level metrics. We have not benchmarked these independently, but what we see in production is consistent with them:

- 96% web coverage, including JavaScript-heavy pages, single-page applications, and dynamically loaded content
- Low latency from request to clean markdown output, measured across millions of pages
- A substantial token reduction versus raw HTML, since markdown preserves heading hierarchy without rendering scaffolding
- Higher extraction accuracy in third-party benchmarks (Apify and AI Multiple) than the 67.8% of the nearest comparable tool
- The largest open-source repository in the web scraping space, with zero configuration for proxies, anti-bot, rate limits, and JS rendering
In our actual usage, the latency feels closer to 2-4 seconds for typical pages and 5-8 seconds for heavy SPAs. The markdown output is genuinely clean. Post-processing code is unnecessary.
Where It Reaches Its Limits
Three honest limitations we have hit, all worth knowing before adopting it:
Login-walled and aggressively rate-limited sites are still hard
The 96% web coverage figure includes most of the public web. The remaining 4% is where you find login-only docs, sites with strict bot rate limiting per source domain, and a handful of high-profile sites that explicitly block scraping infrastructure. We have run into this on a few competitor pricing pages and one industry report site that wanted us to log in. The Interact endpoint can handle some of these via browser automation, though coverage varies. If you are doing competitive intelligence on heavily defended targets, plan for a fallback.
Credit consumption is predictable, then occasionally spikes
A typical article validation run costs us 25 to 70 credits. Most months we are at 600 to 1,000 credits across all our research. The exceptions are when we crawl a site with deeper pagination than we expected, or when a JavaScript-heavy single-page app needs full render to extract anything useful. One crawl of a poorly-paginated documentation site burned 200 credits before we caught it. Set a max-pages parameter on crawls when you do not know the depth in advance.
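In our setup that guard is the page cap on the crawl job itself. A minimal sketch against the v1 crawl endpoint, where `limit` caps pages per job; the default of 20 is our choice:

```ts
// Cap every crawl so an unexpectedly deep site cannot burn hundreds of credits.
async function crawlWithCap(url: string, maxPages = 20) {
  const res = await fetch("https://api.firecrawl.dev/v1/crawl", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url,
      limit: maxPages, // hard ceiling on pages (and therefore credits) per job
      scrapeOptions: { formats: ["markdown"] },
    }),
  });
  if (!res.ok) throw new Error(`Firecrawl crawl failed: ${res.status}`);
  return res.json(); // v1 crawl is asynchronous: this returns a job to poll
}
```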
Schema-driven JSON extraction works well for shallow data
The JSON extraction mode lets you pass a schema and get structured data back, which we use for pricing tables, comparison matrices, and feature grids. It works cleanly for flat structures. For deeply nested data, like multi-level menu structures or threaded comments, we have had to fall back to markdown extraction and parse on our side. Treat the schema mode as the tool for flat shapes and markdown extraction as the fallback for nested ones.
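A sketch of the shallow case. One caveat: the format name has varied across Firecrawl API versions ("extract" in earlier v1 docs, "json" in newer ones), so check the current docs; the schema itself is plain JSON Schema:

```ts
// Schema-driven extraction sketch for a flat pricing table.
const pricingSchema = {
  type: "object",
  properties: {
    plans: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          monthlyPrice: { type: "string" },
          credits: { type: "number" },
        },
      },
    },
  },
};

async function extractPricing(url: string) {
  const res = await fetch("https://api.firecrawl.dev/v1/scrape", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url,
      formats: ["extract"],              // "json" in newer API versions
      extract: { schema: pricingSchema },
    }),
  });
  if (!res.ok) throw new Error(`Firecrawl extract failed: ${res.status}`);
  const payload = await res.json();
  return payload.data.extract;           // structured data matching the schema
}
```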
All three limitations are manageable; they have shaped how we use Firecrawl rather than stopped us using it. If you adopt it, expect to learn the shape of the failure modes for your specific targets.
What It Costs Us
We started on the free tier with 500 one-time credits, which lasted us about ten days of active use. We moved to the Hobby plan at $16 per month for 3,000 monthly credits.
Over the last 30 days, we used 955 credits, including:
- Three articles validated end-to-end (67 + 41 + 28 credits)
- One full directory entry research run for the agent frameworks page (180 credits, including Phase 2 search and Phase 3 crawl across 7 framework documentation sites)
- Routine fact-checks across other pieces in production (roughly 290 credits)
- One mistake-crawl on a deeply paginated documentation site (200 credits)
Per-article, our typical validation run sits between 25 and 70 credits. At Hobby pricing, that is roughly $0.13 to $0.37 per article in tooling cost. The previous workflow consumed three to four hours of human time per article, so the economics needed no analysis.
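The arithmetic, for anyone checking: $16 buys 3,000 credits, so per-article cost is just credits times the per-credit rate.

```ts
// Per-article tooling cost at Hobby pricing: $16 / 3,000 credits per month.
const costPerArticle = (credits: number): number => credits * (16 / 3000);

console.log(costPerArticle(25).toFixed(2)); // "0.13"
console.log(costPerArticle(70).toFixed(2)); // "0.37"
```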
Firecrawl pricing at a glance:

Free — 500 credits to evaluate the platform
- 500 one-time credits
- 2 concurrent requests
- Search, scrape, crawl, map

Hobby ($16/month) — where most solo operators start
- 3,000 monthly credits
- 5 concurrent requests
- Basic support
- 1 credit = 1 page

For teams running daily research pipelines:
- 100,000 monthly credits
- 50 concurrent requests
- Standard support
- Auto-recharge available
- Billed annually

For high-volume extraction at scale:
- 500,000 monthly credits
- 100 concurrent requests
- Priority support
- Billed annually
One credit equals one webpage extracted, or one PDF page, or one search result. The pricing is transparent and predictable. Every invoice has matched our usage expectations.
How It Stacks Up
The shortlist we evaluated, with how we would frame each one now:
Tavily is search-first. If your job is "given a question, find ranked answers," Tavily is the more direct fit. We use Firecrawl because our flow is closer to "given specific URLs, give me clean structured content," with search as a sub-step.
Apify is the right answer for heavy crawl jobs across many sites with custom scrapers per source. If you are running a data pipeline at scale with bespoke logic per target, Apify earns its place. For verification workflows running inside an agent skill, the overhead is higher than we wanted.
Bright Data and ScrapingBee are strong on the infrastructure side. We tested Firecrawl more deeply because the markdown output and MCP integration made the rest of the comparison secondary for our use case.
Browserbase is a different category, more about running browser automation at scale than extraction. It is useful for interaction-heavy flows, but not what our verification workflow required.
The right tool here is context-dependent, shaped by the workflow you are running. For our verification-and-research workflow with the AS publication pipeline downstream, Firecrawl is the one that closes the loop with the least overhead.
Where This Fits in the Broader System
Firecrawl is one piece of a larger publication system we have been building. The full chain runs:
Notion (drafting) → deep-research skill (research + validation, powered by Firecrawl) → articleSources[] (structured trust layer) → Sanity (CMS) → Next.js (publishing) → JSON-LD + llms.txt (machine-readable surfaces)

The point of the chain is that every layer earns trust the next layer relies on. Drafts in Notion start as ideas. The deep-research skill turns claims into source-backed statements. The articleSources[] block carries those sources through to the live page. The JSON-LD and llms.txt surfaces let agents and search engines verify the same trust signals programmatically.
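For illustration, this is the kind of entry the trust layer carries. The field names here are hypothetical, since the exact shape of articleSources[] is internal to our pipeline:

```ts
// Hypothetical shape of one articleSources[] entry — our structure, not a
// Sanity or Firecrawl standard. It carries the trust signal from the
// validation run through to the live page and the JSON-LD surface.
interface ArticleSource {
  claim: string;        // the statement in the article this source backs
  url: string;          // the page Firecrawl extracted
  quote: string;        // the supporting passage, verbatim from the markdown
  checkedAt: string;    // ISO date of the validation run
  verdict: "verified" | "flagged" | "removed";
}
```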
Firecrawl sits at the second step in this chain. Without the extraction layer, the trust layer underneath everything else is manual and inconsistent. With it, the layer is durable enough to scale.
This is the strategic reason we keep using Firecrawl rather than rotating tools every few months. It is the dependable piece that lets the cleverer pieces run.
What We Would Suggest
Three months in, we are still using Firecrawl. We expect to keep using it. The economics work, the failure modes are knowable, and it slots into the agent workflow with minimal friction. If you are building a similar workflow, here is what we would suggest:
1. Start with the free 500 credits. Run your existing research workflow through Firecrawl manually for a week. You will see whether the markdown output and the API ergonomics fit your use case before you spend anything. Sign up at firecrawl.dev.

2. Wire it into a skill, rather than a one-off script. The value compounds when Firecrawl is invoked automatically as part of an agent task, rather than called manually. The deep-research skill structure we use is generic enough that you can adapt it to your domain in an afternoon.

3. Set max-pages limits on crawl jobs. This catches the credit-spike scenario before it bills you. Default to the tightest crawl scope you can. Loosen it only when you know the target.

4. Treat schema-driven JSON extraction as a strong, focused tool. Use it for shallow structured data. Keep markdown extraction as your fallback for nested or unpredictable shapes.

5. Save your validation runs. Write the output of each Firecrawl call into .firecrawl/ in your repo. That gives you a reproducible record of which sources you checked, when you checked them, and what they actually said. A minimal sketch of this step follows the list.
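The sketch for point 5, assuming Node and the helpers above; the file-naming scheme is ours:

```ts
// Persist each Firecrawl response in the repo so every validation run
// leaves a reproducible audit trail.
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

interface ValidationRecord {
  url: string;        // source we checked
  checkedAt: string;  // ISO timestamp of the run
  markdown: string;   // what the source actually said
}

function saveValidationRun(record: ValidationRecord, dir = ".firecrawl"): string {
  mkdirSync(dir, { recursive: true });
  // One file per source per run: hostname + timestamp keeps names unique
  // and greppable when you audit an article later.
  const slug = new URL(record.url).hostname.replace(/\./g, "-");
  const stamp = record.checkedAt.replace(/[:.]/g, "-");
  const file = join(dir, `${slug}-${stamp}.json`);
  writeFileSync(file, JSON.stringify(record, null, 2));
  return file;
}
```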

