Some brands have multiple event sources (GA4, Snowplow, Heap, Elevar, Blotout, etc.). Those sources can record the same touchpoint within seconds of each other, creating the risk of double-counting. For multi-source customers, SourceMedium uses deduplication rules to merge streams into a canonical view that preserves richness while preventing over-counting.
High-level approach
For multi-source data, we consider touchpoints duplicates when core attributes match within a small time window. The dedupe fingerprint typically includes:
- Timestamp (rounded to seconds, in UTC)
- Standardized event name
- UTM source / medium / campaign (with NULL normalized)
utm_content, utm_term, and click IDs are preserved as enrichment but don’t always participate in the fingerprint (so you can still do creative-level analysis without inflating counts).
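The fingerprint idea can be sketched in a few lines of Python. This is an illustrative sketch only, not SourceMedium's actual implementation: the field names (`event_ts`, `event_name`, `source_system`, `click_id`) are hypothetical, timestamps are truncated to the second rather than compared across a window, and NULL normalization is assumed to mean treating `None` and empty strings as the same value.

```python
from datetime import datetime, timezone

# Enrichment fields: kept on the merged row, but not part of the fingerprint.
ENRICHMENT_FIELDS = ("utm_content", "utm_term", "click_id")


def _norm(value):
    """Treat None and blank strings as the same 'missing' value (assumption)."""
    return "" if value is None else value.strip()


def fingerprint(tp):
    """Key = UTC timestamp truncated to the second + event name + core UTMs."""
    ts = tp["event_ts"].astimezone(timezone.utc).replace(microsecond=0)
    return (
        ts,
        tp["event_name"],
        _norm(tp.get("utm_source")),
        _norm(tp.get("utm_medium")),
        _norm(tp.get("utm_campaign")),
    )


def dedupe(touchpoints):
    """Keep the first touchpoint per fingerprint; backfill enrichment from duplicates."""
    merged = {}
    for tp in touchpoints:
        key = fingerprint(tp)
        if key not in merged:
            merged[key] = dict(tp)
            continue
        kept = merged[key]
        for field in ENRICHMENT_FIELDS:
            if not kept.get(field) and tp.get(field):
                kept[field] = tp[field]
    return list(merged.values())


# Hypothetical sample: GA4 and Snowplow record the same page view within one second.
events = [
    {"source_system": "ga4", "event_name": "page_view",
     "event_ts": datetime(2024, 1, 5, 12, 0, 3, 120000, tzinfo=timezone.utc),
     "utm_source": "google", "utm_medium": "cpc", "utm_campaign": None,
     "utm_content": None},
    {"source_system": "snowplow", "event_name": "page_view",
     "event_ts": datetime(2024, 1, 5, 12, 0, 3, 900000, tzinfo=timezone.utc),
     "utm_source": "google", "utm_medium": "cpc", "utm_campaign": "",
     "utm_content": "video_a"},
    {"source_system": "ga4", "event_name": "add_to_cart",
     "event_ts": datetime(2024, 1, 5, 12, 5, 0, tzinfo=timezone.utc),
     "utm_source": "google", "utm_medium": "cpc", "utm_campaign": None,
     "utm_content": None},
]
canonical = dedupe(events)
```

In this sketch the two page views collapse into one row, and the surviving row picks up `utm_content` from the Snowplow record, which mirrors the "preserve enrichment without inflating counts" behavior described above.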
Where this shows up
These patterns are most relevant when you query MTA / journey tables, like:
- your_project.sm_experimental.obt_purchase_journeys_with_mta_models
Recommended analysis patterns
Verify you’re not double-counting purchases
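One possible way to run this check, sketched in Python over rows already fetched from a journey table. The field names `order_id` and `event_name`, and the `"purchase"` event label, are hypothetical placeholders; substitute the columns your table actually exposes.

```python
from collections import Counter


def orders_with_repeated_purchases(rows, purchase_event="purchase"):
    """Return {order_id: count} for orders that appear with more than one purchase row."""
    counts = Counter(
        row["order_id"] for row in rows if row["event_name"] == purchase_event
    )
    return {order_id: n for order_id, n in counts.items() if n > 1}


# Hypothetical sample rows: order 1002 has two purchase rows and should be flagged.
journey_rows = [
    {"order_id": "1001", "event_name": "page_view"},
    {"order_id": "1001", "event_name": "purchase"},
    {"order_id": "1002", "event_name": "purchase"},
    {"order_id": "1002", "event_name": "purchase"},
]
suspect_orders = orders_with_repeated_purchases(journey_rows)
```

An empty result suggests each order contributes a single purchase row; any order that does appear is worth inspecting by source system.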
Compare touchpoint volumes by source system
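A minimal Python sketch of this comparison, again over rows already fetched from a journey table. The `source_system` field name is an assumption made for illustration, and rows without a value are grouped under `"unknown"`.

```python
from collections import Counter


def touchpoints_by_source(rows):
    """Count touchpoint rows per source system; missing values fall under 'unknown'."""
    return Counter(row.get("source_system") or "unknown" for row in rows)


# Hypothetical sample rows from a multi-source customer.
touchpoint_rows = [
    {"source_system": "ga4", "event_name": "page_view"},
    {"source_system": "ga4", "event_name": "add_to_cart"},
    {"source_system": "snowplow", "event_name": "page_view"},
    {"source_system": None, "event_name": "page_view"},
]
volume_by_source = touchpoints_by_source(touchpoint_rows)
```

A large gap between sources that should be tracking the same traffic can indicate either a tracking problem or touchpoints that were merged during dedupe.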
Single-source customers should remain unchanged; multi-source dedupe is only applied when multiple sources exist for the same order/journey.

