Lead Intelligence Engine

This project is a CLI-first workflow for safer lead research and outreach preparation. Instead of scraping aggressively or sending messages automatically, it produces reviewable CSV artifacts at each stage so a human can inspect the lead data, fit signals, and draft messaging before any contact happens.

Project Snapshot

MVP implemented through Phase 4, with manual review kept as the final control.

The current implementation covers Google Places search, website enrichment, deterministic scoring, OpenAI Responses API analysis for qualified leads, structured JSON validation, and outreach review CSV generation.

Project Type

Python CLI automation workflow

Use Case

Lead research and outreach preparation

Core Workflow

Search - Enrich - Score - Analyze - Review

Status

MVP implemented through Phase 4

The Problem

Raw lead lists are easy to collect and hard to trust.

Business listings can provide names, ratings, websites, and phone numbers, but they do not explain whether a company is a good fit, what services may be relevant, or whether outreach should happen at all. Manual research is slow, while fully automated outreach is risky and low-trust.

Design goal

Prepare structured lead intelligence for human review by combining public business search, website enrichment, scoring rules, and bounded LLM analysis.

Terminal-First Workflow

$ lead-engine search --vertical marketing_agency --city "Dallas, TX" --max-results 10
Wrote 10 deduped raw leads to data/raw/places.csv

$ lead-engine enrich --input data/raw/places.csv --output data/processed/websites.csv --max-websites 10
Wrote website enrichment rows to data/processed/websites.csv

$ lead-engine score --input data/processed/websites.csv --output data/processed/leads_scored.csv
Wrote scored lead rows to data/processed/leads_scored.csv

$ lead-engine analyze --input data/processed/leads_scored.csv --output outputs/outreach_review.csv --limit 10 --dry-run
Dry run: qualified leads would be analyzed.

$ lead-engine analyze --input data/processed/leads_scored.csv --output outputs/outreach_review.csv --limit 10
Wrote outreach review rows to outputs/outreach_review.csv

The workflow is CLI-first and CSV-first. Each stage writes a concrete artifact so the process stays inspectable and easy to debug locally.

Pipeline

Google Places Search - Raw Leads CSV - Website Enrichment - Heuristic Scoring - Qualified Leads Only - OpenAI Analysis - Human Review CSV

OpenAI is used only after heuristic filtering marks a lead as worth deeper analysis, which helps control cost and avoids spending on poor-fit rows.
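
The gating step can be sketched as a plain filter over scored rows. This is an illustration only, not the project's code: the column name heuristic_fit_score comes from the sample review output, while the threshold of 70 and the function name select_qualified are assumptions.

```python
import csv
import io

# Hypothetical cutoff; the project's real threshold is not documented here.
QUALIFY_THRESHOLD = 70


def select_qualified(rows, threshold=QUALIFY_THRESHOLD, limit=None):
    """Return rows whose heuristic score meets the threshold, up to `limit`."""
    qualified = []
    for row in rows:
        try:
            score = float(row.get("heuristic_fit_score"))
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric score
        if score >= threshold:
            qualified.append(row)
    return qualified if limit is None else qualified[:limit]


# Tiny in-memory stand-in for data/processed/leads_scored.csv.
SAMPLE = """company_name,heuristic_fit_score
Acme Digital Studio,82
Low Fit Co,35
Local Growth Agency,71
No Score LLC,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
to_analyze = select_qualified(rows, limit=10)
print(f"{len(to_analyze)} of {len(rows)} rows would go to LLM analysis")
```

Only the rows returned by the filter would ever reach a paid API call, which is where the cost control comes from.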

What It Does

Searches Google Places Text Search by vertical and city and exports raw business rows.
Visits public websites and extracts visible text, page metadata, contact signals, and important page links.
Scores each lead with deterministic rules before any LLM call is allowed.
Classifies likely vertical, size, pain signals, and service match from the enriched data.
Analyzes only qualified leads with the OpenAI Responses API using strict structured JSON validation.
Exports outreach review rows for human review rather than automated sending.

Key Features

Google Places Search

Finds businesses by city and vertical with normalized CSV output.
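
One way to normalize and dedupe raw search rows before writing the CSV is sketched below. The column names place_id, name, and website, and the fallback key of name plus website, are assumptions for illustration; the project's actual schema may differ.

```python
def normalize_row(row):
    """Trim surrounding whitespace from every value; treat None as empty."""
    return {key: (value or "").strip() for key, value in row.items()}


def dedupe_leads(rows):
    """Keep the first row per place_id, falling back to name + website."""
    seen = set()
    unique = []
    for row in map(normalize_row, rows):
        # Compare case-insensitively and ignore a trailing slash on the URL.
        fallback = (
            row.get("name", "").lower(),
            row.get("website", "").lower().rstrip("/"),
        )
        key = row.get("place_id") or fallback
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique


raw = [
    {"place_id": "abc", "name": "Acme Digital Studio", "website": "https://example.com"},
    {"place_id": "abc", "name": "Acme Digital Studio ", "website": "https://example.com/"},
    {"place_id": "", "name": "Local Growth Agency", "website": "https://example.org"},
    {"place_id": "", "name": "local growth agency", "website": "https://example.org/"},
]
leads = dedupe_leads(raw)
print(f"Kept {len(leads)} of {len(raw)} rows")
```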

Public Website Enrichment

Extracts visible text, contact signals, page links, and public website metadata.
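
The extraction idea can be sketched with the standard library alone. The project itself uses Requests and BeautifulSoup; html.parser is swapped in here so the example runs without dependencies, and the class name, the returned keys, and the email pattern are all assumptions.

```python
import re
from html.parser import HTMLParser

# Deliberately simple pattern; real-world email matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SKIPPED_TAGS = {"script", "style", "noscript"}


class VisibleTextParser(HTMLParser):
    """Collect visible text, the page title, and link targets."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self.title = ""
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return  # ignore whitespace and script/style contents
        if self._in_title:
            self.title = text
        else:
            self.text_parts.append(text)


def extract_signals(html):
    """Return title, visible text, emails found in that text, and links."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    visible = " ".join(parser.text_parts)
    return {
        "title": parser.title,
        "visible_text": visible,
        "emails": sorted(set(EMAIL_RE.findall(visible))),
        "links": parser.links,
    }


PAGE = """<html><head><title>Acme Digital Studio</title>
<style>body { color: red; }</style></head>
<body><h1>Reporting for agencies</h1>
<script>var x = 1;</script>
<p>Write to hello@example.com</p>
<a href="/contact">Contact</a></body></html>"""

signals = extract_signals(PAGE)
print(signals["title"], signals["emails"], signals["links"])
```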

Deterministic Lead Scoring

Uses rules, thresholds, and service matching before any LLM analysis step.
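
A rules-and-threshold scorer of this kind might look like the sketch below. Every rule, point value, keyword, and the threshold of 70 is hypothetical; the only claim is that the result is fully determined by the input row, with no model call involved.

```python
# Hypothetical keywords standing in for the project's service matching.
SERVICE_KEYWORDS = ("reporting", "automation")

# Hypothetical rules: (reason, predicate, points).
RULES = [
    ("has a website", lambda lead: bool(lead.get("website")), 30),
    ("has a contact email", lambda lead: bool(lead.get("emails")), 20),
    ("rating of 4.0 or higher", lambda lead: lead.get("rating", 0) >= 4.0, 20),
    (
        "mentions a matching service",
        lambda lead: any(
            word in lead.get("visible_text", "").lower()
            for word in SERVICE_KEYWORDS
        ),
        30,
    ),
]
THRESHOLD = 70  # assumed cutoff for "worth deeper analysis"


def score_lead(lead):
    """Sum the points of every matching rule and record why they matched."""
    matched = [(reason, points) for reason, test, points in RULES if test(lead)]
    score = sum(points for _, points in matched)
    return {
        "score": score,
        "qualified": score >= THRESHOLD,
        "reasons": [reason for reason, _ in matched],
    }


lead = {
    "website": "https://example.com",
    "emails": [],
    "rating": 4.5,
    "visible_text": "We build reporting dashboards for agencies.",
}
result = score_lead(lead)
print(result["score"], result["qualified"], result["reasons"])
```

Keeping the matched reasons next to the score makes each row easier to audit during human review.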

Structured OpenAI Analysis

Validates qualified-lead analysis through strict structured JSON output.
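
Strict validation of a model response can be illustrated as follows. The project uses Pydantic for this; the sketch substitutes hand-written checks from the standard library so it runs without dependencies. The field names are borrowed from the sample review output, and the allowed lead_quality values (including "low") and the 0 to 100 score range are assumptions.

```python
import json

ALLOWED_QUALITY = {"high", "medium", "low"}  # "low" is assumed
EXPECTED_FIELDS = {
    "llm_fit_score": int,
    "lead_quality": str,
    "pain_signals": list,
    "best_service": str,
}


def validate_analysis(raw_json):
    """Parse a JSON string and reject anything outside the expected shape."""
    data = json.loads(raw_json)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("analysis must be a JSON object")
    missing = EXPECTED_FIELDS.keys() - data.keys()
    extra = data.keys() - EXPECTED_FIELDS.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for field, expected_type in EXPECTED_FIELDS.items():
        # Exact type match, so True/False cannot pass as an integer score.
        if type(data[field]) is not expected_type:
            raise ValueError(f"{field} must be {expected_type.__name__}")
    if not 0 <= data["llm_fit_score"] <= 100:
        raise ValueError("llm_fit_score must be between 0 and 100")
    if data["lead_quality"] not in ALLOWED_QUALITY:
        raise ValueError("lead_quality has an unexpected value")
    if not all(isinstance(item, str) for item in data["pain_signals"]):
        raise ValueError("pain_signals must be a list of strings")
    return data


RESPONSE = json.dumps({
    "llm_fit_score": 79,
    "lead_quality": "high",
    "pain_signals": ["reporting", "content QA"],
    "best_service": "workflow automation",
})
analysis = validate_analysis(RESPONSE)
print(analysis["lead_quality"], analysis["llm_fit_score"])
```

A response that fails validation would be rejected rather than written to the review CSV, so malformed model output never reaches the reviewer as if it were valid.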

Example Review Output

The final artifact is a decision-support CSV, not an outreach bot.

The implemented output includes columns such as company name, website, city, fit scores, pain signals, best service match, personalization angle, draft subject/body, recommended next action, status, and notes. The sample rows below show a subset of these columns.

company_name,website,city,heuristic_fit_score,llm_fit_score,lead_quality,pain_signals,best_service,recommended_next_action,status
Acme Digital Studio,https://example.com,"San Diego, CA",82,79,high,"reporting; content QA",workflow automation,review,needs_review
Local Growth Agency,https://example.org,"Chula Vista, CA",71,74,medium,"lead follow-up; reporting",reporting automation,review,needs_review
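
Because the artifact is an ordinary CSV, a reviewer can load it with the standard library. The sketch below parses the two sample rows shown above; splitting pain_signals on semicolons follows the sample's formatting and is otherwise an assumption.

```python
import csv
import io

# The sample rows from above, inlined so the example is self-contained.
REVIEW_CSV = '''company_name,website,city,heuristic_fit_score,llm_fit_score,lead_quality,pain_signals,best_service,recommended_next_action,status
Acme Digital Studio,https://example.com,"San Diego, CA",82,79,high,"reporting; content QA",workflow automation,review,needs_review
Local Growth Agency,https://example.org,"Chula Vista, CA",71,74,medium,"lead follow-up; reporting",reporting automation,review,needs_review
'''

review_rows = list(csv.DictReader(io.StringIO(REVIEW_CSV)))

for row in review_rows:
    # Quoted fields keep the comma inside the city value intact.
    signals = [part.strip() for part in row["pain_signals"].split(";")]
    print(f'{row["company_name"]} ({row["city"]}): {signals} -> {row["status"]}')
```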

Designed Boundaries

Safe, human-reviewed outreach preparation by design.

The repository explicitly avoids aggressive automation and treats human review as a required final step before any contact.

Guardrails in the project docs

No automated email sending
No contact form submission
No LinkedIn scraping
No CAPTCHA bypassing
Public website data only
Human review required

Technical Stack

Python 3.11+

CLI runtime and workflow orchestration.

Typer

Command-line interface for search, enrich, score, and analyze commands.
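
The command layout shown in the terminal session can be mirrored in a short sketch. It uses argparse from the standard library instead of Typer so that it runs without dependencies; the flag names come from the session above, while the types, defaults, and which flags are required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser with the four subcommands from the workflow."""
    parser = argparse.ArgumentParser(prog="lead-engine")
    sub = parser.add_subparsers(dest="command", required=True)

    search = sub.add_parser("search")
    search.add_argument("--vertical", required=True)
    search.add_argument("--city", required=True)
    search.add_argument("--max-results", type=int, default=10)

    enrich = sub.add_parser("enrich")
    enrich.add_argument("--input", required=True)
    enrich.add_argument("--output", required=True)
    enrich.add_argument("--max-websites", type=int, default=10)

    score = sub.add_parser("score")
    score.add_argument("--input", required=True)
    score.add_argument("--output", required=True)

    analyze = sub.add_parser("analyze")
    analyze.add_argument("--input", required=True)
    analyze.add_argument("--output", required=True)
    analyze.add_argument("--limit", type=int, default=10)
    analyze.add_argument("--dry-run", action="store_true")

    return parser


# Parse the dry-run invocation from the terminal session.
args = build_parser().parse_args([
    "analyze",
    "--input", "data/processed/leads_scored.csv",
    "--output", "outputs/outreach_review.csv",
    "--limit", "10",
    "--dry-run",
])
print(args.command, args.limit, args.dry_run)
```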

Requests + BeautifulSoup

Public website fetching and content extraction.

Pydantic + OpenAI

Structured analysis validation through the Responses API.

Why It Matters

A narrow workflow tool, not a generic AI platform.

This project demonstrates a practical approach to automation work: keep the workflow narrow, make every step inspectable, use simple files before heavier infrastructure, control API usage, and use AI only where added judgment helps.

What this demonstrates

Workflow automation over hype
Structured outputs and validation
Cost-aware LLM usage
Human-in-the-loop review

About This Demo Page

This webpage is a frontend-only showcase.

The actual project runs locally as a Python CLI and requires user-provided Google Places and OpenAI API keys. This public page does not expose backend services, API keys, private data, or real lead records.

Scope of the website page

Static HTML only, focused on explaining the workflow, boundaries, and technical implementation without implying a live SaaS product.

Contact

Need a custom internal workflow like this?

I build lightweight Python automation tools and AI-assisted workflows for teams with repetitive operational processes.
