Automated Lead List Building

GTM Engineering

Context

A sales team needed 6,500 new contacts per month to hit their outbound target. List building was manual: SDRs researched companies, looked up contacts on LinkedIn, enriched phone numbers through Lusha, created records in HubSpot, then loaded batches into a dialer by hand. About 30% of phone numbers were invalid. Every hour spent on admin was an hour not spent calling.

Scope

Build a pipeline that automates the full list building workflow: company discovery, contact search, enrichment, validation, and CRM delivery. The deliverable is a codebase the team can own, extend, and run on a schedule.

Python, HubSpot, Lusha, Apollo

Approach

I built the pipeline in Python with a pluggable architecture. Each stage of the workflow — sourcing, enrichment, validation, CRM — is an abstract interface. Swapping a data provider means implementing one class, not rewiring the pipeline.
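The pluggable design can be illustrated with a minimal sketch. The names `Contact`, `Source`, `StaticSource`, and `run_pipeline` are hypothetical and chosen for illustration; the real codebase's classes and fields are not shown here and may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contact:
    # Hypothetical record shape, not the pipeline's actual schema.
    email: Optional[str] = None
    phone: Optional[str] = None
    title: Optional[str] = None
    company: Optional[str] = None


class Source(ABC):
    """Stage interface: anything that can produce contacts."""

    @abstractmethod
    def fetch(self) -> List[Contact]:
        ...


class StaticSource(Source):
    """Stand-in provider backed by an in-memory list (illustration only)."""

    def __init__(self, contacts: List[Contact]) -> None:
        self._contacts = contacts

    def fetch(self) -> List[Contact]:
        return list(self._contacts)


def run_pipeline(source: Source) -> List[Contact]:
    # The pipeline depends only on the Source interface,
    # so any provider that implements fetch() can be passed in.
    return source.fetch()


seed = StaticSource([Contact(email="a@example.com", title="CTO")])
contacts = run_pipeline(seed)
```

Under this assumption, adding a provider would mean writing one more `Source` subclass; `run_pipeline` stays unchanged.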

Three data sources are supported: Lusha's Prospecting API for direct contact search (combines company and role filters in a single call, avoiding the N+1 problem of one contact query per company), Apollo for company-first discovery with separate contact lookup, and CSV for seed lists. Enrichment runs in batches of up to 100 contacts, filling in email, phone, and LinkedIn fields using the request context carried over from the search phase.
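The batching step might look roughly like the sketch below. The helper `batched` and the constant `BATCH_SIZE` are illustrative names, and no provider call is made; only the chunking logic is shown.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

BATCH_SIZE = 100  # upper bound per enrichment request, as described above


def batched(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 placeholder IDs split into enrichment-sized batches.
contact_ids = [f"contact-{i}" for i in range(250)]
batch_sizes = [len(batch) for batch in batched(contact_ids)]
# batch_sizes is [100, 100, 50]
```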

Validation catches bad data before it hits the CRM: email format checks, phone format normalization, and batch deduplication by email. The HubSpot integration follows a "fill empty fields only" strategy — if an SDR has already corrected a phone number, the pipeline won't overwrite it.
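A simplified sketch of these checks follows. The regex, the 7 to 15 digit phone rule, and all function names are assumptions made for illustration, not the pipeline's actual rules. `fill_empty_only` shows the merge idea on plain dictionaries and does not call the HubSpot API.

```python
import re
from typing import Dict, List, Optional

# Deliberately loose pattern: one "@", no whitespace, a dot in the domain.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(email: str) -> bool:
    return bool(EMAIL_RE.match(email.strip()))


def normalize_phone(raw: str) -> Optional[str]:
    """Strip formatting; keep numbers with 7 to 15 digits, else return None."""
    has_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)
    if not 7 <= len(digits) <= 15:
        return None
    return ("+" if has_plus else "") + digits


def dedupe_by_email(contacts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first contact seen for each case-insensitive email."""
    seen = set()
    unique = []
    for contact in contacts:
        key = contact["email"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(contact)
    return unique


def fill_empty_only(existing: Dict[str, str], incoming: Dict[str, str]) -> Dict[str, str]:
    """Return only the incoming values whose existing field is empty."""
    return {
        field: value
        for field, value in incoming.items()
        if value and not existing.get(field)
    }


crm_record = {"phone": "+4930111222", "email": ""}
enriched = {"phone": "+4930999888", "email": "a@example.com"}
update = fill_empty_only(crm_record, enriched)
# update contains only the email; the existing phone number is left alone
```

Returning just the fields to write keeps the "never overwrite" rule in one place, so whatever CRM client sends the update can stay unaware of it.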

Automated lead generation pipeline

Contacts are split into two persona tiers: senior decision makers (buyers) and operational implementers (champions), each with its own title patterns and seniority filters, including German-language titles for the DACH market. Every run produces a structured summary with counts, errors, and tier breakdowns.
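Tier assignment by title pattern could be sketched as follows. The patterns in `TIER_PATTERNS` are hypothetical examples, not the patterns the pipeline uses. Seniority filters and error counts are left out for brevity.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical title patterns, including a few German-language titles.
TIER_PATTERNS = {
    "buyer": re.compile(
        r"\b(chief|ceo|cto|vp|vice president|head of|director"
        r"|geschäftsführer(in)?|leiter(in)?)\b",
        re.IGNORECASE,
    ),
    "champion": re.compile(
        r"\b(manager|specialist|lead|engineer|referent(in)?|sachbearbeiter(in)?)\b",
        re.IGNORECASE,
    ),
}


def classify(title: str) -> Optional[str]:
    """Return the first tier whose pattern matches; buyers are checked first."""
    for tier, pattern in TIER_PATTERNS.items():
        if pattern.search(title):
            return tier
    return None


def summarize(titles: Iterable[str]) -> Dict[str, int]:
    """Count contacts per tier, grouping non-matching titles as 'unmatched'."""
    return dict(Counter(classify(title) or "unmatched" for title in titles))


summary = summarize(["Geschäftsführer", "Head of Sales", "Sales Manager", "Intern"])
# summary == {"buyer": 2, "champion": 1, "unmatched": 1}
```

Word-boundary matching as used here would miss German compound titles such as "Vertriebsleiter", so a production pattern set would need to account for those.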

Outcome

The pipeline replaces hours of daily manual research with a single command. It sources contacts, enriches them, validates data quality, deduplicates against existing CRM records, and pushes clean leads to HubSpot with full metadata. SDRs wake up to a loaded queue instead of spending their morning on admin.