Data Intelligence Pipeline

Automated Workflow Discovery and AI Rating at Scale

Teams manually scouting automation templates spent hours on discovery and assessed quality inconsistently. We built a durable pipeline that discovers workflows daily, scores them with AI, archives raw artifacts, and indexes everything for semantic search.

Business Impact

Executive Outcomes

250 workflows/day

Replaces hours of daily manual research and scouting

$100/mo ceiling

Hard budget cap prevents AI cost surprises

100% automated

Scrape, score, archive, and search run end-to-end

$70K+ saved/year

Replaces a full-time research analyst's daily scouting output

The Challenge

The team needed high-quality workflow intelligence but was spending too much manual time on discovery, evaluation, and technical retrieval. There was no structured way to assess workflow quality at scale or search historical patterns.

Discovering and evaluating automation workflows required hours of manual scouting every day

No structured way to assess workflow quality at scale, leading to inconsistent recommendations

Historical workflows were not archived or searchable, making pattern discovery across projects impossible

LLM scoring costs were unpredictable with no per-run, daily, or monthly budget controls

External API failures during long-running batch jobs could lose hours of processing with no recovery path

The Transformation

What changed after we built the system

Before

Discovering and evaluating automation workflows required hours of manual scouting every day

After

Automated three-phase pipeline scrapes 250 workflows daily on a predictable morning schedule

Before

No structured way to assess workflow quality at scale, leading to inconsistent recommendations

After

AI-powered quality scoring with structured JSON validation produces consistent evaluations at scale

Before

Historical workflows were not archived or searchable, making pattern discovery across projects impossible

After

Raw JSON artifacts archived in Google Drive and semantically indexed in Qdrant for instant search

Before

LLM scoring costs were unpredictable with no per-run, daily, or monthly budget controls

After

Three-tier cost caps ($0.10 per workflow, $10 per day, $100 per month) enforce total budget predictability

Before

External API failures during long-running batch jobs could lose hours of processing with no recovery path

After

Task composition architecture with metadata tracking and fault isolation enables clean recovery from failures
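The fault-isolation idea above can be sketched in a few lines. This is an illustrative sketch, not the production code: `runPhase` and its result shape are hypothetical names, showing how each item's outcome is recorded in metadata so one failure never aborts the whole batch.

```typescript
// Sketch of per-item fault isolation (hypothetical helper, not the real pipeline).
// Each phase records every item's outcome; a failure is logged and the batch
// continues, so hours of processing are never lost to one bad item.

type ItemResult = { id: string; status: "ok" | "failed"; error?: string };

async function runPhase(
  ids: string[],
  work: (id: string) => Promise<void>,
): Promise<ItemResult[]> {
  const results: ItemResult[] = [];
  for (const id of ids) {
    try {
      await work(id);
      results.push({ id, status: "ok" });
    } catch (e) {
      // Fault isolation: record the failure and move on to the next item.
      results.push({ id, status: "failed", error: String(e) });
    }
  }
  return results;
}
```

Failed items can then be re-queued on the next run from the recorded metadata, which is what makes recovery "clean" rather than all-or-nothing.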

Why three-tier cost caps changed the economics

LLM-based evaluation is powerful but expensive at scale. Scoring 250 workflows daily with no guardrails could easily produce surprise bills if model pricing changes or output volume spikes.

The first cap is per-workflow: $0.10 maximum. If a single evaluation exceeds that, the run terminates and the workflow gets flagged for manual review. This prevents any one item from burning through budget.

The second and third caps are daily ($10) and monthly ($100). When either limit is reached, remaining workflows queue for the next period. This makes the pipeline's cost completely predictable. The team knows exactly what the maximum bill will be before the month starts.
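The three-tier check described above can be sketched as a single guard function. This is a minimal illustration using the caps stated in the case study ($0.10 per workflow, $10 per day, $100 per month); the function and state names are hypothetical, not the production API.

```typescript
// Illustrative three-tier cost-cap guard (a sketch, not the production code).
// Caps mirror the case study: $0.10/workflow, $10/day, $100/month.

type CapDecision = "proceed" | "flag_for_review" | "queue_for_next_period";

interface BudgetState {
  spentToday: number;     // USD spent so far today
  spentThisMonth: number; // USD spent so far this month
}

const PER_WORKFLOW_CAP = 0.10;
const DAILY_CAP = 10;
const MONTHLY_CAP = 100;

function checkCaps(estimatedCost: number, state: BudgetState): CapDecision {
  // Tier 1: a single evaluation may never exceed $0.10 — flag for manual review.
  if (estimatedCost > PER_WORKFLOW_CAP) return "flag_for_review";
  // Tiers 2 and 3: when a period cap would be exceeded, queue the
  // remaining workflows for the next day or month.
  if (state.spentToday + estimatedCost > DAILY_CAP) return "queue_for_next_period";
  if (state.spentThisMonth + estimatedCost > MONTHLY_CAP) return "queue_for_next_period";
  return "proceed";
}
```

Because the guard runs before each evaluation, the worst-case monthly bill is known in advance, which is the predictability claim above.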

How We Built It

Technical architecture for the curious

Scraping

Morning scrape pipeline with proxy rotation and explicit status tracking for quota exhaustion and missing resources.

Proxy Rotation · Direct-connection Fallback · Status Stamping
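The rotation-with-fallback strategy can be sketched as follows. The fetcher functions are hypothetical stand-ins for real proxied HTTP clients; the point is the control flow: rotate through proxies on failure, then fall back to a direct connection only once every proxy is exhausted.

```typescript
// Sketch of proxy rotation with direct-connection fallback (illustrative;
// the fetcher signatures are assumptions, not the production client).

type FetchFn = (url: string) => Promise<string>;

async function fetchWithRotation(
  url: string,
  proxies: FetchFn[],
  direct: FetchFn,
): Promise<{ body: string; via: string }> {
  // Try each proxy in turn; rotate on failure or rate limiting.
  for (let i = 0; i < proxies.length; i++) {
    try {
      return { body: await proxies[i](url), via: `proxy-${i}` };
    } catch {
      continue; // rotate to the next proxy
    }
  }
  // All proxies exhausted: fall back to a direct connection, accepting
  // the higher block risk noted in the tradeoffs section below.
  return { body: await direct(url), via: "direct" };
}
```

Status stamping would record which path succeeded (or that quota was exhausted) on each workflow record, so downstream phases know why a scrape is missing.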

AI Rating

Structured scoring bounded at $0.10 per workflow. Validation with repair handles inconsistent model output.

OpenRouter · Cost-capped Evaluation · JSON Repair Heuristics
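A sense of what "validation with repair" means in practice: models frequently wrap JSON in markdown fences, add surrounding prose, or leave trailing commas. The sketch below handles those three common failure modes; it is an illustration of the heuristic approach, not the production validator.

```typescript
// Illustrative JSON-repair heuristics for inconsistent model output
// (a sketch, not the production code).

function repairModelJson(raw: string): unknown {
  let text = raw.trim();
  // Strip ```json ... ``` fences the model sometimes adds.
  text = text.replace(/^```(?:json)?\s*/i, "").replace(/```\s*$/, "");
  // Keep only the outermost {...} object, dropping surrounding prose.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end === -1) throw new Error("no JSON object found");
  text = text.slice(start, end + 1);
  // Remove trailing commas before } or ].
  text = text.replace(/,\s*([}\]])/g, "$1");
  return JSON.parse(text);
}
```

Output that still fails to parse after repair would count against the per-workflow cap and be flagged for manual review rather than silently dropped.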

Archive

Every workflow's raw JSON is stored in Drive. Backfill pipeline processes up to 500 artifacts daily.

Google Drive · Raw JSON Storage · Daily Backfill Pipeline
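The backfill's daily cap of 500 artifacts is just a bounded batch selection. A minimal sketch of that selection, with hypothetical names, keeps each run's size and cost predictable while the backlog drains:

```typescript
// Sketch of the daily backfill batch selection (hypothetical types; the
// real pipeline reads artifact state from its own store). At most 500
// unarchived artifacts are picked per run, so a large backlog drains
// over several days instead of overloading a single run.

interface Artifact {
  id: string;
  archived: boolean;
}

const DAILY_BACKFILL_LIMIT = 500;

function selectBackfillBatch(artifacts: Artifact[]): Artifact[] {
  return artifacts.filter((a) => !a.archived).slice(0, DAILY_BACKFILL_LIMIT);
}
```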

Search

Summary and technical vector collections enable semantic search. Full and incremental reindex keep the index current.

Qdrant Dual Collections · OpenAI Embeddings · Incremental Reindex
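The distinction between full and incremental reindexing comes down to which documents are selected for re-embedding. A sketch of that selection, with hypothetical field names: a full reindex passes a zero watermark, an incremental run passes the previous run's timestamp so only changed workflows are re-embedded and upserted.

```typescript
// Sketch of full vs. incremental reindex selection (illustrative names;
// the real pipeline embeds the selected docs and upserts them into the
// summary and technical Qdrant collections).

interface WorkflowDoc {
  id: string;
  updatedAt: number; // epoch ms of last modification
}

function docsToReindex(docs: WorkflowDoc[], lastIndexedAt: number): WorkflowDoc[] {
  // Full reindex: lastIndexedAt = 0 selects everything.
  // Incremental: pass the previous run's timestamp to select only changes.
  return docs.filter((d) => d.updatedAt > lastIndexedAt);
}
```

Re-embedding only what changed keeps embedding costs proportional to churn rather than corpus size.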

Operations

Sheets as operational record for non-engineering stakeholders. Trigger.dev middleware for singleton service initialization.

Google Sheets · Trigger.dev v4 Locals/Middleware

Tech Stack: Trigger.dev v4 · Google Sheets · Google Drive · OpenRouter · Qdrant · OpenAI Embeddings · TypeScript
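The singleton-initialization pattern mentioned under Operations can be shown without the framework. This sketch is illustrative, stripped of Trigger.dev specifics: an expensive client (here a stand-in for a Google Sheets client) is constructed once per process and reused by every task run, which is what the middleware/locals layer arranges in the real pipeline.

```typescript
// Sketch of singleton service initialization (illustrative; the production
// code wires this through Trigger.dev v4 locals/middleware).

interface Services {
  sheetsClient: string; // stand-in for a real Google Sheets client
}

let cached: Services | null = null;
let initCount = 0; // exposed here only to demonstrate single initialization

function getServices(): Services {
  if (!cached) {
    initCount++; // expensive client construction happens exactly once
    cached = { sheetsClient: "sheets-client" };
  }
  return cached;
}
```

Every task run calls `getServices()` and gets the same instance, avoiding repeated auth handshakes and connection setup on each of the 250 daily workflows.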

Engineering Decisions

Tradeoffs we made and why

Three-tier cost caps instead of uncapped LLM scoring

Benefit

Budget predictability at per-run ($0.10), daily ($10), and monthly ($100) levels with no surprises

Cost

Some high-value workflows may not get scored if the daily cap is reached early

Proxy rotation with direct-connection fallback

Benefit

Maintains scraping throughput when the primary proxy is rate-limited or down

Cost

Direct connections are more visible and more likely to be blocked by target sites

Google Sheets as the operational record instead of a database dashboard

Benefit

Non-engineering stakeholders can monitor pipeline health without any database access

Cost

Sheet size limits and slower performance compared to a dedicated monitoring tool at high volume

Scheduled three-phase pipeline instead of event-driven processing

Benefit

Predictable resource usage, clear phase boundaries, and independent failure domains

Cost

Fixed schedule cannot respond to real-time content spikes or priority changes during the day

Certain client names, proprietary workflows, screenshots, and internal assets referenced in this case study are protected under a non-disclosure agreement and have been anonymized or omitted to comply with our confidentiality obligations.

Need intelligence pipelines that stay within budget?

Book a free 30-minute call. We will assess your data pipeline needs, identify where costs are unpredictable, and design a system with built-in budget controls.

30 minutes with Apurva. Not a sales call.

Book Your Free Audit