Built for startups,
scaled for unicorns
Successfully submitted!
Error! Please try again
Your data warehouse is where the deepest referral program insights live. While real-time dashboards show you what's happening now, warehouse-powered analytics reveal long-term patterns: which referral channels produce the highest-LTV customers, how referral quality changes over time, and what your true referral program ROI looks like when you account for all costs.
This guide covers configuring Segment to load GrowSurf referral data into your data warehouse (BigQuery, Snowflake, Redshift, or Databricks), structuring referral tables for efficient querying, and building SQL-based analytics that answer the strategic questions your referral program team needs answered.
Connect your data warehouse to Segment so referral events are automatically loaded.
Segment creates tables in your warehouse automatically based on your events and identifies.
tracks β all Track events (includes referral events)identifies β all Identify calls (includes referral user traits)referral_program_joined β auto-created table for this specific eventreferral_made β auto-created table for referral eventsreferral_converted β auto-created table for conversion eventsuser_id, timestamp, and all event properties as columnsWrite SQL queries against your warehouse tables to answer key referral program questions.
Build cohort queries that track referred customer behavior over time compared to non-referred customers.
Create a data model that attributes downstream revenue to the original referral for complete ROI tracking.
Connect a BI tool to your warehouse to create referral program dashboards and scheduled reports.
-- Core Referral Analytics SQL Queries
-- 1. Referral Funnel by Month
SELECT
DATE_TRUNC('month', rpj.timestamp) AS month,
COUNT(DISTINCT rpj.user_id) AS participants_joined,
COUNT(DISTINCT rm.user_id) AS made_referral,
COUNT(DISTINCT rc.user_id) AS referral_converted,
ROUND(COUNT(DISTINCT rm.user_id)::decimal /
NULLIF(COUNT(DISTINCT rpj.user_id), 0) * 100, 1) AS referral_rate_pct,
ROUND(COUNT(DISTINCT rc.user_id)::decimal /
NULLIF(COUNT(DISTINCT rm.user_id), 0) * 100, 1) AS conversion_rate_pct
FROM referral_program_joined rpj
LEFT JOIN referral_made rm ON rpj.user_id = rm.user_id
LEFT JOIN referral_converted rc ON rm.user_id = rc.referrer_email
GROUP BY 1
ORDER BY 1 DESC;
-- 2. Top Referrers by Revenue Impact
SELECT
rm.referrer_email,
COUNT(DISTINCT rm.user_id) AS total_referrals,
COUNT(DISTINCT rc.user_id) AS converted_referrals,
SUM(p.amount) AS total_referred_revenue,
SUM(p.amount) / NULLIF(COUNT(DISTINCT rc.user_id), 0) AS avg_revenue_per_referral
FROM referral_made rm
LEFT JOIN referral_converted rc ON rm.user_id = rc.user_id
LEFT JOIN purchases p ON rc.user_id = p.user_id
GROUP BY 1
ORDER BY total_referred_revenue DESC
LIMIT 20;
-- 3. Referred vs Organic Customer LTV Comparison
SELECT
CASE WHEN i.referred_by IS NOT NULL THEN 'Referred' ELSE 'Organic' END AS acquisition_type,
COUNT(DISTINCT i.user_id) AS customers,
AVG(customer_ltv.total_spend) AS avg_ltv,
AVG(customer_ltv.months_active) AS avg_months_active
FROM identifies i
LEFT JOIN (
SELECT user_id, SUM(amount) AS total_spend,
COUNT(DISTINCT DATE_TRUNC('month', timestamp)) AS months_active
FROM purchases GROUP BY 1
) customer_ltv ON i.user_id = customer_ltv.user_id
GROUP BY 1;Partition your referral event tables by date (month or day) in your warehouse. Referral analytics queries almost always filter by time period, and partitioned tables dramatically reduce query costs and execution time β especially as your data grows.
Join your referral events into a single denormalized table that combines the referrer, referred person, conversion status, and revenue. This makes ad-hoc querying much faster and simpler than joining multiple tables every time you run an analysis.
If you use dbt for data transformation, create incremental models for your referral tables. This processes only new events since the last run, keeping transformation costs low even as your referral program scales to millions of events.
Referral events are relatively low-volume compared to product usage events. A program generating 1,000 referrals/month would add roughly 10-50 MB of warehouse data per month. Even at scale, referral data is a tiny fraction of your overall warehouse storage.
Segment syncs to most warehouses every 1-2 hours by default. For BigQuery and Snowflake, you can configure more frequent syncing. This means warehouse-based referral dashboards have a 1-2 hour lag β use Segment's real-time destinations (like Mixpanel or Amplitude) for up-to-the-minute metrics.
Yes. Use GrowSurf's API to export historical participant and referral data, then use Segment's HTTP Source API or direct warehouse loading to backfill historical events. Structure the backfilled data to match your Segment event schema so it integrates seamlessly with ongoing event data.
Trusted by marketing and product teams at fast-growing B2C, fintech, and SaaS companies
