fct_order_attribution_signals

version: 2

models:
  - name: fct_order_attribution_signals
    description: >
      Row-level order attribution signals table for custom modeling, debugging, and
      traceability. Grain: one row per (sm_store_id, sm_order_key, evidence_type,
      evidence_source, evidence_row_id). Date field: order_processed_at_local_datetime.
      Critical filters: evidence_type and evidence_source for isolating traffic,
      zero-party, or discount signals; sm_utm_final_source_priority = 1 for the
      primary traffic winner. Key joins: dim_orders via sm_order_key (many:1).
    columns:
      - name: sm_order_attribution_signal_key
        description: >
          Stable surrogate key for the signal row. Unique per evidence record.

      - name: sm_store_id
        description: >
          SourceMedium's unique store identifier. For Shopify stores, derived from the
          myshopify.com domain; for other platforms (Amazon, TikTok Shop, Walmart.com),
          uses platform-specific identifiers.

      - name: source_system
        description: >
          Originating platform for the record (e.g., Shopify, Amazon, TikTok Shop,
          Chargebee). Used for platform-specific behavior and coverage.

      - name: sm_order_key
        description: >
          Stable SourceMedium order key. Unique per order in order-grain tables, but
          repeated in this model: multiple rows per order are expected because each
          order can keep several attribution signals. Key join from this model:
          `dim_orders` (many:1 via `sm_order_key`). Order-grain joins:
          `obt_order_lines` (1:many via `sm_order_key`); `dim_customers` (many:1 via
          `sm_customer_key`). Platform caveat: TikTok Shop coverage may be limited.

      - name: order_id
        description: >
          Platform order identifier. Not globally unique across stores; pair with
          `sm_store_id` and `source_system` when needed for scoping.

      - name: order_processed_at
        description: >
          UTC timestamp when the order was processed.

      - name: order_processed_at_local_datetime
        description: >
          Order processed timestamp converted to reporting timezone (from
          order_processed_at UTC). Primary date field for order analytics and
          time-based filtering.

      - name: evidence_type
        description: >
          High-level class of evidence on the row. Current values are
          `traffic_source_candidate`, `zero_party_candidate`, and
          `order_discount_code`. `traffic_source_candidate` includes both primary
          traffic candidates and MTA supporting-context rows.

      - name: evidence_source
        description: >
          Specific source that produced the evidence row, such as
          `shopify_custom_attribute_override`, `website_event_tracking_purchase`,
          `pps_response`, or `order_discount_code`.

      - name: evidence_row_id
        description: >
          Source-specific identifier for the evidence record within an order. Examples
          include `shopify_landing_site:1`, `ga4:2`,
          `segment_website_event_tracking:evt_123`, an MTA touch id, or a normalized
          discount code such as `welcome10`.

      - name: sm_utm_source
        description: >
          Canonicalized source value for traffic evidence after SourceMedium inference
          and normalization. Null for non-traffic rows.

      - name: sm_utm_medium
        description: >
          Canonicalized medium value for traffic evidence after SourceMedium inference
          and normalization. Null for non-traffic rows.

      - name: sm_utm_campaign
        description: >
          Canonicalized campaign value for the traffic evidence row. Null for
          non-traffic rows.

      - name: sm_utm_content
        description: >
          Canonicalized content value for the traffic evidence row. Null for
          non-traffic rows.

      - name: sm_utm_term
        description: >
          Canonicalized term value for the traffic evidence row. Null for non-traffic
          rows.

      - name: sm_utm_id
        description: >
          Canonicalized `utm_id` value for the traffic evidence row. Null for
          non-traffic rows.

      - name: sm_utm_source_medium
        description: >
          Combined canonical source / medium pair used by SourceMedium channel mapping
          and attribution analysis.

      - name: raw_utm_source
        description: >
          Raw source captured on the evidence row before inference and normalization.

      - name: raw_utm_medium
        description: >
          Raw medium captured on the evidence row before inference and normalization.

      - name: raw_utm_campaign
        description: >
          Raw campaign captured on the evidence row before inference and normalization.

      - name: raw_utm_content
        description: >
          Raw content captured on the evidence row before inference and normalization.

      - name: raw_utm_term
        description: >
          Raw term captured on the evidence row before inference and normalization.

      - name: raw_utm_id
        description: >
          Raw `utm_id` captured on the evidence row before inference and normalization.

      - name: raw_referrer
        description: >
          Raw referrer value captured with the evidence row before canonicalization.

      - name: raw_landing_page_url
        description: >
          Raw landing page URL captured with the evidence row when available.

      - name: raw_referring_site
        description: >
          Raw referring-site URL captured from the order record when available.

      - name: sm_attribution_inference_method
        description: >
          How SourceMedium produced the canonicalized attribution fields on the row.
          Current values include `raw_utm`, `referrer_domain_inferred`,
          `gclid_inferred`, `fbclid_inferred`, `fbclid_inferred_both_present`,
          `mta_touch`, and `not_applicable`.

      - name: sm_utm_final_source_priority
        description: >
          Per-order rank among `traffic_source_candidate` rows. A value of `1` means
          this row is the winning primary traffic source for the order. Higher values
          are lower-ranked fallbacks. Null means the row is not part of the primary
          traffic ranking. Values can be sparse, such as `1`, `3`, `4`, `5`, `7`, and
          `8`.

      - name: attribution_signal_raw
        description: >
          Raw non-UTM attribution signal captured on the row, such as a post-purchase
          survey answer, zero-party tag, or discount code. Null for standard traffic
          rows.

      - name: attribution_signal_parsed
        description: >
          Parsed or normalized version of `attribution_signal_raw` used for grouped
          analysis. Null for standard traffic rows.

      - name: attribution_signal_type
        description: >
          Type of non-traffic signal on the row. Current values are `zero_party` and
          `discount_code`.

      - name: sm_marketing_channel
        description: >
          Standardized SourceMedium marketing channel mapping for traffic evidence after
          canonicalization. Null for non-traffic rows.

      - name: _shard_id
        description: >
          Internal shard identifier used for sharded model execution and debugging. Not
          intended for customer-facing analysis.

fct_order_attribution_signals is the customer-facing row-level table for custom attribution modeling, order-level traceability, and debugging. It keeps the winning traffic source and the supporting signal rows side by side so you can rebuild SourceMedium’s rules, inspect explicit overrides, and trace source hierarchy without dropping into raw staging models.

Use this table when you need to answer “why did this order end up attributed this way?” rather than “how much revenue did a channel get?” For aggregated attribution reporting, use the report and MTA tables instead.

What this table is for

Trace every evidence row that SourceMedium considered for a single order.
See the difference between raw captured values and canonicalized sm_utm_* values.
Identify which traffic source won the primary ranking for each order.
Inspect zero-party and discount-code context without leaving the order-grain workflow.

Current evidence model

evidence_type tells you what class of evidence you are looking at:

| `evidence_type` | What it means | Typical sources |
| --- | --- | --- |
| `traffic_source_candidate` | Marketing-source evidence that can contribute to the primary traffic winner or provide traffic context. | `shopify_custom_attribute_override`, `shopify_landing_site`, `shopify_note`, `website_event_tracking_purchase`, `google_analytics_transaction`, `shopify_order_referring_site_utms`, `mta_first_touch`, `mta_last_touch` |
| `zero_party_candidate` | Self-reported or tagged attribution context. | `pps_response`, `order_tags_zero_party`, `customer_tags_zero_party` |
| `order_discount_code` | Order-level discount-code evidence. | `order_discount_code` |

mta_first_touch and mta_last_touch use evidence_type = traffic_source_candidate, but they are supporting-context rows. Their sm_utm_final_source_priority is always NULL (filter with IS NULL, because = NULL never matches in SQL), and an order can have at most one row of each type when that order appears in MTA journey data.
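
The two MTA rules above can be spot-checked with a query along the following lines. This is a sketch rather than an official validation query: it reuses the placeholder project, dataset, and store id from the debugging queries on this page, and the aliases `row_count` and `ranked_rows` are illustrative names. An empty result would mean both rules hold for the store.

```sql
-- Sketch: list MTA context rows that break either documented rule.
-- Project, dataset, and store id are placeholders; aliases are illustrative.
SELECT
  sm_order_key,
  evidence_source,
  COUNT(*) AS row_count,
  COUNTIF(sm_utm_final_source_priority IS NOT NULL) AS ranked_rows
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND evidence_source IN ('mta_first_touch', 'mta_last_touch')
GROUP BY sm_order_key, evidence_source
HAVING COUNT(*) > 1                                           -- more than one row of a type
    OR COUNTIF(sm_utm_final_source_priority IS NOT NULL) > 0; -- row carries a primary rank
```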

How primary traffic ranking works

SourceMedium keeps multiple traffic candidates on the order, then assigns a per-order rank in sm_utm_final_source_priority.

1 means “this was the winning primary traffic source for the order.”
Higher numbers are lower-ranked fallbacks kept for debugging.
NULL means the row is contextual rather than part of the primary traffic ranking.

For the current customer-facing evidence sources, the preferred traffic-source order is:

shopify_custom_attribute_override
shopify_landing_site
shopify_note
website_event_tracking_purchase
google_analytics_transaction
shopify_order_referring_site_utms

The numeric priority values in the table are ordering signals, not a reusable source code system. Some numbers are skipped, so you might see values such as 1, 3, 4, 5, 7, and 8 instead of a compact 1..6 sequence. The relative ordering matters, not the specific numeric values.

mta_first_touch and mta_last_touch rows are kept in the signals table as traffic context, but they do not compete for the primary traffic winner. Their sm_utm_final_source_priority is always NULL, and an order has at most one of each when it appears in MTA journey data.
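
Because only the relative ordering is meaningful, it can help to view each order's candidates with a compact rank next to the stored priority. The query below is a minimal sketch, assuming BigQuery Standard SQL and the same placeholder project, dataset, and store id used elsewhere on this page; `compact_rank` is an illustrative alias, not a column in the table.

```sql
-- Sketch: show stored (possibly sparse) priorities next to a gap-free rank.
-- `compact_rank` is a hypothetical alias computed at query time.
SELECT
  sm_order_key,
  evidence_source,
  sm_utm_final_source_priority,
  DENSE_RANK() OVER (
    PARTITION BY sm_store_id, sm_order_key
    ORDER BY sm_utm_final_source_priority
  ) AS compact_rank
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND evidence_type = 'traffic_source_candidate'
  AND sm_utm_final_source_priority IS NOT NULL  -- excludes NULL-priority context rows
  AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
ORDER BY sm_order_key, compact_rank;
```

For an order whose winner is stored as priority 1, `compact_rank = 1` marks that same row, and the remaining fallbacks appear as 2, 3, and so on regardless of gaps in the stored values.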

Column guide

Identifiers

sm_order_attribution_signal_key: Stable key for the evidence row.
sm_store_id: Customer-facing store identifier for tenant scoping.
source_system: Commerce platform that produced the order record.
sm_order_key: Stable SourceMedium order key for joins back to order-grain tables.
order_id: Platform-native order id for stakeholder-friendly lookups.

Timestamps

order_processed_at: UTC order processed timestamp.
order_processed_at_local_datetime: Reporting-timezone datetime for customer-facing analysis and date filters.

Evidence classification

evidence_type: Broad class of evidence, such as traffic_source_candidate, zero_party_candidate, or order_discount_code.
evidence_source: The specific capture mechanism or upstream source.
evidence_row_id: Source-specific identifier that distinguishes one evidence record from another on the same order. Examples include shopify_landing_site:1, ga4:2, segment_website_event_tracking:evt_123, an MTA touch id, or a normalized discount code such as welcome10.
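
The documented grain of this model is one row per (sm_store_id, sm_order_key, evidence_type, evidence_source, evidence_row_id). A duplicate check like the sketch below is one way to confirm that against your own data; the project, dataset, and store id are the placeholders used elsewhere on this page, and `row_count` is an illustrative alias. No rows returned would mean the grain holds.

```sql
-- Sketch: find grain violations, i.e. evidence records that appear more than once.
SELECT
  sm_store_id,
  sm_order_key,
  evidence_type,
  evidence_source,
  evidence_row_id,
  COUNT(*) AS row_count
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
GROUP BY 1, 2, 3, 4, 5
HAVING COUNT(*) > 1
ORDER BY row_count DESC;
```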

Canonicalized attribution fields

sm_utm_source, sm_utm_medium, sm_utm_campaign, sm_utm_content, sm_utm_term, sm_utm_id: Final normalized traffic fields for the row.
sm_utm_source_medium: Canonical source / medium pair that usually makes debugging faster.
sm_marketing_channel: Final channel grouping for traffic rows after normalization. See Channel Mapping if the final grouped value is what you need to debug.

Raw captured fields

raw_utm_source, raw_utm_medium, raw_utm_campaign, raw_utm_content, raw_utm_term, raw_utm_id: Values captured before inference or cleaning.
raw_referrer, raw_landing_page_url, raw_referring_site: Supporting raw URL context that often explains (none) and (other) outcomes.
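
To see where canonicalization actually changes a value, you can compare a raw field with its sm_utm_* counterpart. The following is a sketch under the same placeholder assumptions (project, dataset, store id); it looks only at the source field, treats NULL and an empty string as equal for comparison purposes, and uses the illustrative alias `row_count`.

```sql
-- Sketch: most common raw -> canonical source rewrites, with the method that produced them.
SELECT
  sm_attribution_inference_method,
  raw_utm_source,
  sm_utm_source,
  COUNT(*) AS row_count
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND evidence_type = 'traffic_source_candidate'
  -- Keep only rows where the canonical source differs from the captured one.
  AND IFNULL(raw_utm_source, '') != IFNULL(sm_utm_source, '')
GROUP BY 1, 2, 3
ORDER BY row_count DESC
LIMIT 50;
```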

Inference metadata

sm_attribution_inference_method: Explains how SourceMedium derived the canonicalized values.
sm_utm_final_source_priority: Per-order ordering among traffic candidates. 1 is the winner for that order.

Zero-party and discount signals

attribution_signal_raw: Raw non-traffic signal captured on the row.
attribution_signal_parsed: Parsed or normalized version of that non-traffic signal.
attribution_signal_type: Non-traffic signal category, currently zero_party or discount_code.

Inference methods you will see

sm_attribution_inference_method explains how SourceMedium produced the canonicalized fields on the row.

| Method | What it means |
| --- | --- |
| `raw_utm` | Source and medium were taken directly from captured UTM values. |
| `referrer_domain_inferred` | The source came from a referrer-domain fallback when direct UTM source was missing. |
| `gclid_inferred` | Google click-id logic inferred a paid Google source / medium. |
| `fbclid_inferred` | Meta click-id logic inferred a paid Meta source / medium. |
| `fbclid_inferred_both_present` | Both Meta and Google click ids were present, and Meta won the inference rule. |
| `mta_touch` | The row came from MTA touch context rather than the primary UTM ranking workflow. |
| `not_applicable` | The row is non-traffic evidence such as zero-party or discount-code context. |
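
To gauge how often each method sits behind the winning traffic row, a breakdown like the one below may help. It is a sketch that assumes the same placeholder project, dataset, and store id as the other queries on this page, with an arbitrary 30-day window and the illustrative alias `winning_orders`.

```sql
-- Sketch: inference methods behind priority-1 traffic rows over the last 30 days.
SELECT
  sm_attribution_inference_method,
  COUNT(DISTINCT sm_order_key) AS winning_orders
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND evidence_type = 'traffic_source_candidate'
  AND sm_utm_final_source_priority = 1
  AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY 1
ORDER BY winning_orders DESC;
```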

Common debugging workflows

Trace a single order

Use this when a stakeholder asks “show me everything SourceMedium knew about this order.”

SELECT
  order_id,
  evidence_type,
  evidence_source,
  evidence_row_id,
  sm_utm_final_source_priority,
  sm_attribution_inference_method,
  sm_utm_source_medium,
  sm_marketing_channel,
  attribution_signal_type
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND order_id = 'ORDER_ID_HERE'
ORDER BY
  evidence_type,
  -- Ranked traffic candidates first; rows with a NULL priority sort last.
  CASE WHEN sm_utm_final_source_priority IS NULL THEN 1 ELSE 0 END,
  sm_utm_final_source_priority,
  evidence_source,
  evidence_row_id;

Why did this order become `(none)` or `(other)`?

Start with the priority-1 traffic row, then compare the raw captured fields against the canonicalized fields and the final channel mapping.

SELECT
  evidence_source,
  sm_utm_final_source_priority,
  sm_attribution_inference_method,
  raw_utm_source,
  raw_utm_medium,
  raw_referrer,
  raw_landing_page_url,
  sm_utm_source,
  sm_utm_medium,
  sm_utm_source_medium,
  sm_marketing_channel
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND order_id = 'ORDER_ID_HERE'
  AND evidence_type = 'traffic_source_candidate'
ORDER BY
  -- Priority-1 winner first, then fallbacks; NULL-priority MTA context rows last.
  CASE WHEN sm_utm_final_source_priority IS NULL THEN 1 ELSE 0 END,
  sm_utm_final_source_priority,
  evidence_source;

Which evidence source usually wins?

This is the fastest way to understand which capture mechanism is actually driving the primary traffic winner for a store.

SELECT
  evidence_source,
  COUNT(DISTINCT sm_order_key) AS winning_orders
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND evidence_type = 'traffic_source_candidate'
  AND sm_utm_final_source_priority = 1
  AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY 1
ORDER BY 2 DESC;

What non-traffic signals exist for this order?

Use this to separate survey, tag, and discount-code context from the primary traffic ranking.

SELECT
  evidence_type,
  evidence_source,
  evidence_row_id,
  attribution_signal_type,
  attribution_signal_raw,
  attribution_signal_parsed
FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
WHERE sm_store_id = 'your-sm_store_id'
  AND order_id = 'ORDER_ID_HERE'
  AND evidence_type IN ('zero_party_candidate', 'order_discount_code')
ORDER BY evidence_type, evidence_source, evidence_row_id;

Related workflows

Attribution Source Hierarchy

Learn how SourceMedium orders the traffic candidates that feed this signals table.

MTA Models Reference

See how row-level order auditing complements purchase-journey and MTA reporting.

Attribution Health

Move from a store-level attribution-health alert into order-level debugging.

dim_orders

Join back to order-level attributes once you identify the evidence row you care about.
