Automated Lead List Building
GTM Engineering
Context
A sales team needed 6,500 new contacts per month to hit their outbound target. List building was manual: SDRs researched companies, looked up contacts on LinkedIn, enriched phone numbers through Lusha, created records in HubSpot, then loaded batches into a dialer by hand. About 30% of phone numbers were invalid. Every hour spent on admin was an hour not spent calling.
Scope
Build a pipeline that automates the full list-building workflow: company discovery, contact search, enrichment, validation, and CRM delivery. The deliverable is a codebase the team can own, extend, and run on a schedule.

Python, HubSpot, Lusha, Apollo
Approach
I built the pipeline in Python with a pluggable architecture. Each stage of the workflow (sourcing, enrichment, validation, CRM delivery) is defined by an abstract interface, so swapping a data provider means implementing one class, not rewiring the pipeline.
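To make the pluggable design concrete, here is a minimal sketch of how stage interfaces like these could look. All class and method names (`Contact`, `Source`, `Enricher`, `Validator`, `CRM`, `Pipeline`, and the in-memory stand-ins) are illustrative assumptions, not names from the project's codebase.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    """Minimal contact record; field names are illustrative."""
    full_name: str
    company: str
    email: Optional[str] = None
    phone: Optional[str] = None


class Source(ABC):
    @abstractmethod
    def fetch(self) -> list[Contact]: ...


class Enricher(ABC):
    @abstractmethod
    def enrich(self, contacts: list[Contact]) -> list[Contact]: ...


class Validator(ABC):
    @abstractmethod
    def validate(self, contacts: list[Contact]) -> list[Contact]: ...


class CRM(ABC):
    @abstractmethod
    def push(self, contacts: list[Contact]) -> int: ...


class Pipeline:
    """Runs the four stages in order; it only knows the interfaces."""

    def __init__(self, source: Source, enricher: Enricher,
                 validator: Validator, crm: CRM) -> None:
        self.source, self.enricher = source, enricher
        self.validator, self.crm = validator, crm

    def run(self) -> int:
        contacts = self.source.fetch()
        contacts = self.enricher.enrich(contacts)
        contacts = self.validator.validate(contacts)
        return self.crm.push(contacts)


# Hypothetical in-memory stand-ins, used here instead of real provider clients.
class StaticSource(Source):
    def __init__(self, contacts: list[Contact]) -> None:
        self._contacts = list(contacts)

    def fetch(self) -> list[Contact]:
        return list(self._contacts)


class PassthroughEnricher(Enricher):
    def enrich(self, contacts: list[Contact]) -> list[Contact]:
        return contacts


class RequireEmail(Validator):
    def validate(self, contacts: list[Contact]) -> list[Contact]:
        return [c for c in contacts if c.email]


class InMemoryCRM(CRM):
    def __init__(self) -> None:
        self.records: list[Contact] = []

    def push(self, contacts: list[Contact]) -> int:
        self.records.extend(contacts)
        return len(contacts)


crm = InMemoryCRM()
pipeline = Pipeline(
    StaticSource([
        Contact("Ada Example", "Acme GmbH", email="ada@example.com"),
        Contact("No Email", "Acme GmbH"),
    ]),
    PassthroughEnricher(),
    RequireEmail(),
    crm,
)
delivered = pipeline.run()
```

In a layout like this, supporting a new provider would mean writing one more `Source` or `Enricher` subclass and passing it to `Pipeline`; the orchestration code stays unchanged.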
Three data sources are supported: Lusha's Prospecting API for direct contact search (it combines company and role filters in a single call, avoiding the N+1 problem), Apollo for company-first discovery with a separate contact lookup, and CSV for seed lists. Enrichment runs in batches of up to 100 contacts and fills in email, phone, and LinkedIn, reusing the request context from the search phase.
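The batching step can be sketched as follows. This is an illustration under assumptions: `chunked`, `enrich_all`, and the `enrich_batch` callable are hypothetical names, and the callable stands in for whatever provider request performs the enrichment. Only the ceiling of 100 comes from the description above.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH = 100  # batch ceiling described in the text


def chunked(items: Sequence[T], size: int = MAX_BATCH) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enrich_all(contacts: Sequence[T],
               enrich_batch: Callable[[Sequence[T]], list]) -> list:
    """Call `enrich_batch` once per chunk and concatenate the results."""
    enriched: list = []
    for batch in chunked(contacts):
        enriched.extend(enrich_batch(batch))
    return enriched


# Usage with a fake enrichment call that records each batch size.
batch_sizes = []


def fake_enrich(batch):
    batch_sizes.append(len(batch))
    return list(batch)


result = enrich_all(list(range(250)), fake_enrich)
```

With 250 contacts, this makes three calls (100, 100, and 50 contacts) rather than 250 individual requests.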
Validation catches bad data before it reaches the CRM: email format checks, phone format normalization, and batch deduplication by email. The HubSpot integration follows a "fill empty fields only" strategy: if an SDR has already corrected a phone number, the pipeline won't overwrite it.
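A rough sketch of these checks, assuming contacts are plain dictionaries. The function names, the email regex, the default country code of 49, and the digit-length bounds are all assumptions made for illustration; the real rules may be stricter. The merge helper only computes which fields to write and does not call the HubSpot API.

```python
import re
from typing import Optional

# Deliberately simple format check (an assumption, not full RFC 5322 validation).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(email: Optional[str]) -> bool:
    return bool(email) and EMAIL_RE.match(email.strip()) is not None


def normalize_phone(raw: Optional[str], default_country_code: str = "49") -> Optional[str]:
    """Return a '+<digits>' string, or None if the number looks unusable."""
    if not raw:
        return None
    raw = raw.strip()
    digits = re.sub(r"\D", "", raw)
    if raw.startswith("+"):
        pass  # already international
    elif digits.startswith("00"):
        digits = digits[2:]  # 00 international prefix
    elif digits.startswith("0"):
        digits = default_country_code + digits[1:]  # national format
    else:
        return None  # ambiguous: no country information
    if not 8 <= len(digits) <= 15:  # E.164 allows at most 15 digits
        return None
    return "+" + digits


def dedupe_by_email(contacts: list) -> list:
    """Keep the first contact per case-insensitive email; skip missing emails."""
    seen, unique = set(), []
    for contact in contacts:
        key = (contact.get("email") or "").strip().lower()
        if not key or key in seen:
            continue
        seen.add(key)
        unique.append(contact)
    return unique


def fill_empty_only(existing: dict, incoming: dict) -> dict:
    """Return only the incoming values whose CRM field is currently empty."""
    updates = {}
    for field, new_value in incoming.items():
        if new_value in (None, ""):
            continue
        current = existing.get(field)
        if current is None or str(current).strip() == "":
            updates[field] = new_value
    return updates


# An SDR-corrected phone number survives; the empty email gets filled.
updates = fill_empty_only(
    {"phone": "+4930111222", "email": ""},
    {"phone": "+4930999888", "email": "ada@example.com", "linkedin": None},
)
```

Returning the update set separately from sending it keeps the overwrite rule easy to test without touching the CRM.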
Figure: Automated lead generation pipeline
Contacts are split into two persona tiers: senior decision makers (buyers) and operational implementers (champions). Each tier has its own title patterns and seniority filters, including German-language titles for the DACH market. Every run produces a structured summary with counts, errors, and tier breakdowns.
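Tier assignment by job title could be sketched like this. The title patterns below are invented examples, not the project's actual patterns, and `classify_tier` and `summarize` are hypothetical names. Checking the buyer tier first is also an assumption, made so that a senior title containing an operational keyword still counts as a buyer.

```python
import re
from collections import Counter
from typing import Optional

# Illustrative patterns only, with a few German titles for the DACH market.
TIER_PATTERNS = {
    "buyer": re.compile(
        r"\b(ceo|cro|chief|vp|vice president|head of|director|"
        r"geschäftsführer(in)?|vertriebsleiter(in)?|leiter(in)?)\b",
        re.IGNORECASE,
    ),
    "champion": re.compile(
        r"\b(manager|specialist|coordinator|operations|"
        r"referent(in)?|sachbearbeiter(in)?)\b",
        re.IGNORECASE,
    ),
}


def classify_tier(title: str) -> Optional[str]:
    """Return the first tier whose pattern matches, or None if none does."""
    for tier, pattern in TIER_PATTERNS.items():  # dict order: buyer first
        if pattern.search(title):
            return tier
    return None


def summarize(titles: list, errors: Optional[list] = None) -> dict:
    """Build a small run summary with totals, tier counts, and errors."""
    tiers = Counter(classify_tier(t) or "unmatched" for t in titles)
    return {"total": len(titles), "tiers": dict(tiers), "errors": errors or []}


summary = summarize([
    "Geschäftsführer",
    "Head of Sales Operations",
    "Sales Operations Manager",
    "Software Engineer",
])
```

Keeping the patterns in one mapping means adding a tier or a language variant is a data change rather than a code change.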
Outcome
The pipeline replaces hours of daily manual research with a single command. It sources contacts, enriches them, validates data quality, deduplicates against existing CRM records, and pushes clean leads to HubSpot with full metadata. SDRs wake up to a loaded queue instead of spending their morning on admin.