Turning Instagram DMs Into Qualified Pipeline
A digital agency needed to process thousands of Instagram DM conversations daily, qualify leads with AI, and route qualified prospects into their CRM. Their n8n workflow was breaking under volume. We rebuilt it as a typed, code-first automation system.
Business Impact
Prospects qualified and routed to CRM without manual triage
Executive Outcomes
From inbox chaos to a structured, qualified pipeline
Deduplication ensures every qualified prospect reaches the sales team
Replaced the equivalent of 2 full-time staff manually sorting DMs
Automated qualification drops acquisition cost to near zero
The Challenge
“The agency's Instagram DM volume had outgrown their visual workflow tool. Repeated processing of conversation batches, username quality issues from chat metadata, and queue pressure during enrichment spikes were causing daily failures.”
n8n workflow broke daily under volume, reprocessing the same conversation batches repeatedly
Username quality issues from chat metadata caused enrichment to fail silently on invalid profiles
Queue pressure during enrichment spikes caused cascading failures across the entire pipeline
Duplicate CRM tasks piled up because no deduplication existed at the username level
No recovery mechanism when data was lost or CRM entries were cleared accidentally
The Transformation
What changed after we built the system
n8n workflow broke daily under volume, reprocessing the same conversation batches repeatedly
Cursor-based pagination with BigQuery checkpoints processes only new conversations each run
Username quality issues from chat metadata caused enrichment to fail silently on invalid profiles
Invalid usernames and low-signal leads are filtered before enrichment, reducing API costs and failures
Queue pressure during enrichment spikes caused cascading failures across the entire pipeline
Batched enrichment with configurable caps prevents queue saturation during volume spikes
Duplicate CRM tasks piled up because no deduplication existed at the username level
Dual deduplication at message and username levels eliminates both reprocessing and duplicate CRM tasks
No recovery mechanism when data was lost or CRM entries were cleared accidentally
Manual reset and CSV backfill jobs support operational recovery without ad hoc scripts
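The checkpoint-based ingestion described above can be sketched as follows. This is a minimal illustration, not the production code: the real system stores its cursor in BigQuery, while here an in-memory dictionary stands in for the checkpoint table, and the function names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for the BigQuery checkpoint table.
# In production this would be a SELECT/UPDATE against a checkpoints table.
CHECKPOINTS = {"instagram_dms": datetime(2024, 1, 1, tzinfo=timezone.utc)}

def fetch_conversations_since(cursor, conversations):
    """Return only conversations newer than the stored cursor."""
    return [c for c in conversations if c["updated_at"] > cursor]

def run_ingestion(conversations, source="instagram_dms"):
    """One scheduled run: process new conversations, then advance the cursor."""
    cursor = CHECKPOINTS[source]
    new = fetch_conversations_since(cursor, conversations)
    if new:
        # Advance the checkpoint to the newest timestamp actually processed,
        # so a rerun never reprocesses the same conversation batch.
        CHECKPOINTS[source] = max(c["updated_at"] for c in new)
    return new
```

Because the cursor only moves forward after a successful run, a second run over the same data is a no-op, which is what eliminated the daily reprocessing failures.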
Why deduplication needed to happen at two different levels
The first instinct was to deduplicate at the message level: if we have already seen this message ID, skip it. That prevented reprocessing but did not solve the CRM problem.
A single lead sends multiple messages across multiple conversations. Each conversation would pass message-level dedup because the messages were genuinely new. But the lead had already been qualified and added to ClickUp. The result: five CRM tasks for the same person.
The second dedup layer checks at the username level in ClickUp before creating any task. This two-level approach means we never reprocess messages (saving compute) and never create duplicate CRM entries (saving the sales team from chasing the same lead five times).
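The two-level check can be sketched in a few lines. This is an illustrative skeleton, assuming set-backed lookups; in production, level 1 runs against BigQuery message IDs and level 2 queries ClickUp by username before task creation.

```python
def process_batch(messages, seen_message_ids, crm_usernames, qualify):
    """Two-level dedup: skip seen messages, then skip usernames already in the CRM."""
    created = []
    for msg in messages:
        if msg["id"] in seen_message_ids:       # level 1: message-level dedup
            continue
        seen_message_ids.add(msg["id"])
        user = msg["username"]
        if user in crm_usernames:               # level 2: username-level dedup
            continue
        if qualify(msg):
            crm_usernames.add(user)
            created.append(user)                # stand-in for ClickUp task creation
    return created
```

A lead who sends five genuinely new messages passes level 1 five times but level 2 only once, so exactly one CRM task is created.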
How We Built It
Technical architecture for the curious
Ingestion
Incremental ingestion using timestamp checkpoints. Only new conversations are processed each run.
Deduplication
Two-layer dedup prevents both reprocessing of messages and creation of duplicate CRM tasks.
Enrichment
Profile and web data pulled for valid leads only. Batching caps at 400 jobs per run prevent queue overload.
AI Qualification
AI evaluates conversation context, profile data, and web presence to produce structured qualification scores.
CRM
Qualified leads become ClickUp tasks. Dedup checks prevent duplicates before any task creation.
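The filtering and batching stages above can be sketched together. The exact username-quality heuristics are not disclosed here, so the regex below is an illustrative assumption (letters, digits, periods, underscores, 3 to 30 characters); the 400-job cap is the real limit from the pipeline.

```python
import re

MAX_JOBS_PER_RUN = 400  # enrichment cap per run, per the architecture above

# Assumed username-quality rule; the production filter is more involved.
VALID_USERNAME = re.compile(r"^[a-z0-9._]{3,30}$")

def filter_and_batch(usernames, cap=MAX_JOBS_PER_RUN):
    """Drop invalid usernames before enrichment, then cap the batch size.

    Returns (batch, remainder): `batch` is enriched this run, `remainder`
    waits for the next cycle instead of saturating the queue.
    """
    valid = [u for u in usernames if VALID_USERNAME.fullmatch(u.lower())]
    return valid[:cap], valid[cap:]
```

Filtering before batching matters: invalid handles never consume a slot in the 400-job budget, let alone an Apify or Tavily call.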
Engineering Decisions
Tradeoffs we made and why
BigQuery as both state store and dedup engine
Benefit
Single system handles timestamp checkpoints, message dedup, and analytical queries
Cost
Higher query latency compared to a dedicated cache or Redis for dedup lookups
400-job batch cap per enrichment run
Benefit
Prevents queue overload and keeps downstream services (Apify, Tavily) within rate limits
Cost
High-volume days may require multiple runs to process all pending conversations
Pre-enrichment filtering on username quality
Benefit
Significant cost savings by skipping Apify and Tavily calls for leads that will not qualify
Cost
Some edge-case valid leads with unusual usernames may be filtered prematurely
Scheduled 4-hour ingestion cycles instead of real-time webhooks
Benefit
Simpler architecture with predictable resource usage and natural batching for dedup
Cost
Up to 4 hours of delay between a DM arriving and a CRM task being created
Certain client names, proprietary workflows, screenshots, and internal assets referenced in this case study are protected under a non-disclosure agreement and have been anonymized or omitted to comply with our confidentiality obligations.
Drowning in DMs with no system to catch leads?
Book a free 30-minute call. We will map your current lead flow, find where prospects are falling through the cracks, and design an automation that qualifies them for you.
30 minutes with Apurva. Not a sales call.
Book Your Free Audit