Context
PromptFlow, a SaaS product analytics company, wants a unified event pipeline to measure user behavior originating from ChatGPT, Claude, and Gemini referrals. Today, referral traffic is captured inconsistently across web SDK events, backend API logs, and CRM attribution tables, which makes it difficult to answer basic questions about referral conversion, downstream retention, and duplicate session counts.
You need to design the event schema and pipeline that standardize referral tracking across these sources and land analytics-ready data in the warehouse with low latency.
Scale Requirements
- Traffic: 120K events/second peak, 25K average
- Sources: Web SDK, mobile app events, backend API logs, UTM parameters, CRM lead updates
- Event size: 1.5-3 KB JSON per event
- Latency target: < 3 minutes from event generation to warehouse availability
- Daily volume: ~2.5B events/day, ~5 TB raw JSON/day
- Retention: 180 days raw, 3 years curated attribution tables
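
As a sanity check on the figures above, the following sketch derives daily volume, peak ingest bandwidth, and retained raw storage from the stated rates. It assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and no compression; the constant and function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400

# Figures taken from the scale requirements.
AVG_EVENTS_PER_SEC = 25_000
PEAK_EVENTS_PER_SEC = 120_000
EVENT_SIZE_KB = (1.5, 3.0)   # stated JSON payload range
STATED_TB_PER_DAY = 5.0
RAW_RETENTION_DAYS = 180


def events_per_day(rate_per_sec: int) -> int:
    """Events per day at a constant per-second rate."""
    return rate_per_sec * SECONDS_PER_DAY


def tb_per_day(events: int, size_kb: float) -> float:
    """Raw volume in TB, assuming decimal units (an assumption, not from the brief)."""
    return events * size_kb * 1e3 / 1e12


def mb_per_sec(rate_per_sec: int, size_kb: float) -> float:
    """Ingest bandwidth in MB/s at a given event rate and payload size."""
    return rate_per_sec * size_kb * 1e3 / 1e6


daily_events = events_per_day(AVG_EVENTS_PER_SEC)
low_tb, high_tb = (tb_per_day(daily_events, kb) for kb in EVENT_SIZE_KB)
peak_low_mb, peak_high_mb = (mb_per_sec(PEAK_EVENTS_PER_SEC, kb) for kb in EVENT_SIZE_KB)
raw_retained_tb = STATED_TB_PER_DAY * RAW_RETENTION_DAYS

print(f"events/day at average rate: {daily_events:,}")
print(f"raw TB/day range: {low_tb:.2f} to {high_tb:.2f}")
print(f"peak ingest MB/s range: {peak_low_mb:.0f} to {peak_high_mb:.0f}")
print(f"uncompressed raw storage over retention: {raw_retained_tb:.0f} TB")
```

At a steady 25K events/second the day totals about 2.16B events, so the stated ~2.5B/day implies an effective average closer to 29K/second. The stated ~5 TB/day sits inside the range implied by the 1.5-3 KB payload size.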
Requirements
- Define a canonical event schema for referral attribution across ChatGPT, Claude, and Gemini, including source, medium, campaign, session, user, device, and downstream conversion fields.
- Support schema evolution without breaking downstream consumers; versioning must be explicit.
- Ingest events from browser, mobile, and backend systems into a single streaming pipeline.
- Deduplicate events across client-side and server-side emitters using deterministic keys.
- Enrich events with identity resolution, sessionization, and referral classification logic.
- Produce warehouse tables for raw events, conformed referral events, sessions, and attribution facts.
- Implement data quality checks for missing referral metadata, malformed payloads, and source misclassification.
- Design orchestration, monitoring, replay, and backfill strategies.
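
One possible starting point for the canonical schema, referral classification, and deterministic deduplication requirements is sketched below. Everything here is an assumption made for illustration: the field names, the `action_id` idempotency token shared by client and server emitters, the hostname and `utm_source` lookup tables, and the rule that an explicit UTM value takes precedence over the HTTP referrer.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

SCHEMA_VERSION = 1  # explicit version carried on every event (hypothetical numbering)

# Assumed lookup tables; a real deployment would maintain these as reference data.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "claude.ai": "claude",
    "gemini.google.com": "gemini",
}
AI_UTM_SOURCES = {
    "chatgpt": "chatgpt", "chatgpt.com": "chatgpt",
    "claude": "claude", "claude.ai": "claude",
    "gemini": "gemini", "gemini.google.com": "gemini",
}


def classify_referral(referrer_url: Optional[str], utm_source: Optional[str]) -> str:
    """Return 'chatgpt', 'claude', 'gemini', or 'other'."""
    if utm_source:
        hit = AI_UTM_SOURCES.get(utm_source.strip().lower())
        if hit:
            return hit
    host = (urlparse(referrer_url or "").hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host, "other")


def dedup_key(anonymous_id: str, session_id: str, event_name: str, action_id: str) -> str:
    """Deterministic key built only from fields both emitters are assumed to share."""
    raw = "\x1f".join([anonymous_id, session_id, event_name, action_id])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ReferralEvent:
    schema_version: int
    event_id: str                 # output of dedup_key
    event_name: str
    occurred_at: str              # ISO 8601 UTC timestamp
    emitter: str                  # 'web', 'mobile', or 'server'
    source: str                   # output of classify_referral
    medium: Optional[str]
    campaign: Optional[str]
    session_id: str
    anonymous_id: str
    user_id: Optional[str] = None
    device_type: Optional[str] = None
    conversion_type: Optional[str] = None
    conversion_value: Optional[float] = None


def build_event(emitter: str, event_name: str, occurred_at: str, anonymous_id: str,
                session_id: str, action_id: str, referrer_url: Optional[str] = None,
                utm_source: Optional[str] = None, utm_medium: Optional[str] = None,
                utm_campaign: Optional[str] = None) -> ReferralEvent:
    """Assemble a canonical event from raw emitter fields."""
    return ReferralEvent(
        schema_version=SCHEMA_VERSION,
        event_id=dedup_key(anonymous_id, session_id, event_name, action_id),
        event_name=event_name,
        occurred_at=occurred_at,
        emitter=emitter,
        source=classify_referral(referrer_url, utm_source),
        medium=utm_medium,
        campaign=utm_campaign,
        session_id=session_id,
        anonymous_id=anonymous_id,
    )
```

Because `event_id` ignores the emitter, a browser event and its server-side twin collapse to the same key, so a downstream step can keep one row per `event_id`. Fields such as timestamps are deliberately left out of the key here, since client and server clocks rarely agree.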
Constraints
- Cloud environment is AWS; existing warehouse is Snowflake
- Team has strong SQL/dbt skills but limited Flink expertise
- Incremental infrastructure budget is capped at $30K/month
- Must support GDPR/CCPA deletion requests within 72 hours
- Event producers across product teams cannot all migrate simultaneously, so backward compatibility is required
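
The backward-compatibility constraint, together with the explicit-versioning requirement, could be met by upcasting older payloads at ingestion so that downstream consumers only ever see the current version. The sketch below is a minimal illustration under stated assumptions: the version numbers, the legacy `ref` field, and the rule that a payload without `schema_version` is treated as version 1 are all hypothetical.

```python
from typing import Any, Callable, Dict

Event = Dict[str, Any]

CURRENT_VERSION = 2  # hypothetical current schema version


def _v1_to_v2(event: Event) -> Event:
    """Hypothetical change: v1 had a flat 'ref' string; v2 splits it into source and medium."""
    out = dict(event)
    ref = out.pop("ref", None)
    out.setdefault("source", ref)
    out.setdefault("medium", "referral" if ref else None)
    out["schema_version"] = 2
    return out


# One upcaster per version step; adding v3 later means adding a single entry.
UPCASTERS: Dict[int, Callable[[Event], Event]] = {1: _v1_to_v2}


def upcast(event: Event) -> Event:
    """Return a copy of the event migrated step by step to CURRENT_VERSION."""
    # Assumption: legacy producers that predate versioning omit the field.
    version = event.get("schema_version", 1)
    if not isinstance(version, int) or not 1 <= version <= CURRENT_VERSION:
        raise ValueError(f"unsupported schema_version: {version!r}")
    current = dict(event)
    while version < CURRENT_VERSION:
        current = UPCASTERS[version](current)
        version = current["schema_version"]
    return current
```

Producers can then migrate on their own schedule: old SDKs keep sending version 1 payloads, and events carrying an unknown newer version are rejected explicitly (for example, to a dead-letter queue) rather than silently misread.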