> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sourcemedium.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Attribution & Data Health

> Tracking quality, attribution coverage, and data health queries.

<Tip>
  Use the same query standards from the [SQL Query Library overview](/data-activation/template-resources/sql-query-library): `is_order_sm_valid = TRUE` for order analyses, `sm_store_id` scoping for multi-store setups, and `your_project` placeholders.
</Tip>

These templates help you assess attribution coverage and basic data health before doing deeper analysis.

If you want table-level freshness/coverage metadata, start with: [`dim_data_dictionary`](/data-activation/data-tables/sm_metadata/dim_data_dictionary).

<AccordionGroup>
  <Accordion title="Which tables are stale or missing data?">
    **What you'll learn:** Which tables in your data warehouse haven't been updated recently or are missing data entirely. Run this first to identify pipeline issues before diving into analysis.

    ```sql theme={null}
    -- Assumptions: timeframe=all_time | metric=table_freshness | grain=dataset+table | scope=sm_metadata
    SELECT
      dataset_name,
      table_name,
      MAX(CAST(table_has_data AS INT64)) > 0 AS table_has_data,
      MAX(CAST(table_has_fresh_data_14d AS INT64)) > 0 AS table_has_fresh_data_14d,
      MAX(table_last_data_date) AS table_last_data_date,
      ANY_VALUE(table_description) AS table_description
    FROM `your_project.sm_metadata.dim_data_dictionary`
    WHERE dataset_name IN ('sm_transformed_v2', 'sm_experimental')
      AND dataset_name IS NOT NULL
      AND table_name IS NOT NULL
    GROUP BY 1, 2
    ORDER BY table_has_fresh_data_14d ASC, table_has_data ASC, table_last_data_date ASC, dataset_name, table_name
    LIMIT 200;
    ```
  </Accordion>

  <Accordion title="Attribution column coverage on orders">
    **What you'll learn:** How complete your attribution data is—what percentage of orders have UTM source, zero-party attribution, discount codes, landing pages, and referrer domains. Low coverage in key columns signals tracking gaps.

    ```sql theme={null}
    -- Assumptions: timeframe=all_time | metric=column_coverage | grain=column | scope=sm_metadata_obt_orders
    WITH cols AS (
      SELECT 'sm_utm_source' AS column_name UNION ALL
      SELECT 'sm_utm_medium' UNION ALL
      SELECT 'sm_utm_source_medium' UNION ALL
      SELECT 'sm_zero_party_attribution_source' UNION ALL
      SELECT 'order_discount_codes_csv' UNION ALL
      SELECT 'sm_order_landing_page' UNION ALL
      SELECT 'sm_order_referrer_domain'
    )
    SELECT
      d.table_name,
      d.column_name,
      ROUND(100 - d.column_null_percentage, 1) AS non_null_pct,
      d.column_distinct_count,
      (
        SELECT STRING_AGG(
          CONCAT(v.value, ' (', FORMAT('%.1f', v.pct), '%)'),
          ', '
          ORDER BY v.pct DESC
          LIMIT 8
        )
        FROM UNNEST(IFNULL(d.categorical_value_distribution, [])) AS v
        WHERE v.value IS NOT NULL AND v.pct IS NOT NULL
      ) AS top_values
    FROM `your_project.sm_metadata.dim_data_dictionary` d
    INNER JOIN cols c
      ON d.column_name = c.column_name
    WHERE d.dataset_name = 'sm_transformed_v2'
      AND d.table_name = 'obt_orders'
    ORDER BY non_null_pct DESC, d.column_distinct_count DESC, d.column_name;
    ```
  </Accordion>

  <Accordion title="When UTMs are missing, what other attribution signals exist?">
    **What you'll learn:** For orders without UTM tracking, what fallback attribution data is available (zero-party surveys, discount codes, landing pages, referrer domains). Helps you understand how much attribution you can recover.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=fallback_coverage | grain=overall | scope=valid_orders_only_missing_utms
    WITH missing_utms AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        sm_zero_party_attribution_source,
        order_discount_codes_csv,
        sm_order_landing_page,
        sm_order_referrer_domain
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)'
    )
    SELECT
      COUNT(DISTINCT sm_order_key) AS orders_missing_utms,
      SUM(order_net_revenue) AS order_net_revenue_missing_utms,
      COUNTIF(sm_zero_party_attribution_source IS NOT NULL AND TRIM(sm_zero_party_attribution_source) NOT IN ('', '(none)')) AS orders_with_zero_party,
      SAFE_DIVIDE(
        COUNTIF(sm_zero_party_attribution_source IS NOT NULL AND TRIM(sm_zero_party_attribution_source) NOT IN ('', '(none)')),
        NULLIF(COUNT(DISTINCT sm_order_key), 0)
      ) AS pct_with_zero_party,
      COUNTIF(order_discount_codes_csv IS NOT NULL AND TRIM(order_discount_codes_csv) NOT IN ('', '(none)')) AS orders_with_discount_code,
      SAFE_DIVIDE(
        COUNTIF(order_discount_codes_csv IS NOT NULL AND TRIM(order_discount_codes_csv) NOT IN ('', '(none)')),
        NULLIF(COUNT(DISTINCT sm_order_key), 0)
      ) AS pct_with_discount_code,
      COUNTIF(sm_order_landing_page IS NOT NULL AND TRIM(sm_order_landing_page) NOT IN ('', '(none)')) AS orders_with_landing_page,
      SAFE_DIVIDE(
        COUNTIF(sm_order_landing_page IS NOT NULL AND TRIM(sm_order_landing_page) NOT IN ('', '(none)')),
        NULLIF(COUNT(DISTINCT sm_order_key), 0)
      ) AS pct_with_landing_page,
      COUNTIF(sm_order_referrer_domain IS NOT NULL AND TRIM(sm_order_referrer_domain) NOT IN ('', '(none)')) AS orders_with_referrer_domain,
      SAFE_DIVIDE(
        COUNTIF(sm_order_referrer_domain IS NOT NULL AND TRIM(sm_order_referrer_domain) NOT IN ('', '(none)')),
        NULLIF(COUNT(DISTINCT sm_order_key), 0)
      ) AS pct_with_referrer_domain
    FROM missing_utms;
    ```
  </Accordion>

  <Accordion title="Top referrer domains for orders missing UTMs">
    **What you'll learn:** Which external sites are sending you traffic that isn't tagged with UTMs. Use this to identify partners, affiliates, or other untracked sources that need proper tracking or attribution rules.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=referrer_domain | scope=valid_orders_only_missing_utms
    WITH base AS (
      SELECT
        LOWER(TRIM(sm_order_referrer_domain)) AS referrer_domain,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)'
        AND sm_order_referrer_domain IS NOT NULL
        AND TRIM(sm_order_referrer_domain) NOT IN ('', '(none)')
    )
    SELECT
      referrer_domain,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue
    FROM base
    GROUP BY 1
    HAVING orders >= 25
    ORDER BY order_net_revenue DESC
    LIMIT 50;
    ```
  </Accordion>

  <Accordion title="Key join-key completeness (customers + SKU coverage)">
    **What you'll learn:** The percentage of orders missing customer keys and order lines missing SKUs. Critical for data integrity—high null rates here break customer-level analysis and product reporting.

    ```sql theme={null}
    -- Assumptions: timeframe=last_30_days | metric=null_rate_checks | grain=overall | scope=valid_orders_only
    WITH orders AS (
      SELECT
        COUNT(*) AS orders_total,
        COUNTIF(sm_customer_key IS NULL) AS orders_missing_customer_key
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    ),
    lines AS (
      SELECT
        COUNT(*) AS lines_total,
        COUNTIF(sku IS NULL OR TRIM(sku) = '' OR LOWER(sku) = 'missing sku') AS lines_missing_sku
      FROM `your_project.sm_transformed_v2.obt_order_lines`
      WHERE is_order_sm_valid = TRUE
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    )
    SELECT
      orders_total,
      orders_missing_customer_key,
      SAFE_DIVIDE(orders_missing_customer_key, NULLIF(orders_total, 0)) AS pct_orders_missing_customer_key,
      lines_total,
      lines_missing_sku,
      SAFE_DIVIDE(lines_missing_sku, NULLIF(lines_total, 0)) AS pct_lines_missing_sku
    FROM orders
    CROSS JOIN lines;
    ```
  </Accordion>

  <Accordion title="Attribution health trend (weekly)">
    **What you'll learn:** How your attribution coverage has changed week-over-week—UTM coverage, unattributed orders, and direct traffic share. Spot tracking regressions or improvements over time.

    ```sql theme={null}
    -- Assumptions: timeframe=last_26_weeks | metric=utm_coverage+direct_share+unattributed_share | grain=week | scope=valid_orders_only
    WITH base AS (
      SELECT
        DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start,
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY)
    ),
    weekly AS (
      SELECT
        week_start,
        COUNT(DISTINCT sm_order_key) AS orders,
        SUM(order_net_revenue) AS order_net_revenue,
        COUNTIF(source_medium = '(none) / (none)') AS unattributed_orders,
        SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END) AS unattributed_revenue,
        COUNTIF(source_medium IN ('(direct) / (none)', 'direct / (none)')) AS direct_orders,
        SUM(CASE WHEN source_medium IN ('(direct) / (none)', 'direct / (none)') THEN order_net_revenue ELSE 0 END) AS direct_revenue,
        COUNTIF(source_medium != '(none) / (none)') AS orders_with_utm_source_medium
      FROM base
      GROUP BY 1
    )
    SELECT
      week_start,
      orders,
      order_net_revenue,
      SAFE_DIVIDE(orders_with_utm_source_medium, NULLIF(orders, 0)) AS pct_orders_with_utm_source_medium,
      SAFE_DIVIDE(unattributed_orders, NULLIF(orders, 0)) AS pct_orders_unattributed,
      SAFE_DIVIDE(unattributed_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_unattributed,
      SAFE_DIVIDE(direct_orders, NULLIF(orders, 0)) AS pct_orders_direct,
      SAFE_DIVIDE(direct_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_direct
    FROM weekly
    ORDER BY week_start;
    ```
  </Accordion>

  <Accordion title="Attribution health by store and sales channel">
    **What you'll learn:** How attribution coverage varies across your stores and sales channels (online vs. POS vs. wholesale). Some channels naturally have lower attribution—this helps set expectations.

    ```sql theme={null}
    -- Assumptions: timeframe=last_30_days | metric=unattributed_share | grain=sm_store_id+sm_channel | scope=valid_orders_only
    WITH base AS (
      SELECT
        sm_store_id,
        COALESCE(NULLIF(LOWER(TRIM(sm_channel)), ''), '(unknown)') AS sm_channel,
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
        AND NULLIF(LOWER(TRIM(sm_channel)), '') IS NOT NULL
    )
    SELECT
      sm_store_id,
      sm_channel,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue,
      SAFE_DIVIDE(COUNTIF(source_medium = '(none) / (none)'), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_unattributed,
      SAFE_DIVIDE(
        SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END),
        NULLIF(SUM(order_net_revenue), 0)
      ) AS pct_revenue_unattributed
    FROM base
    GROUP BY 1, 2
    HAVING orders >= 50
    ORDER BY pct_revenue_unattributed DESC, orders DESC
    LIMIT 100;
    ```
  </Accordion>

  <Accordion title="Discount code parsing (top codes by revenue)">
    **What you'll learn:** Your top discount codes ranked by revenue, with order counts and AOV. Use this to evaluate promo effectiveness and identify codes that might be over-used or under-attributed.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=discount_code | scope=valid_orders_only
    -- Note: If an order has multiple discount codes, its revenue will be counted under each code (this is a code-usage view, not strict attribution).
    WITH orders_with_codes AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        order_discount_codes_csv
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND order_discount_codes_csv IS NOT NULL
        AND TRIM(order_discount_codes_csv) != ''
    ),
    exploded AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        TRIM(code_raw) AS discount_code
      FROM orders_with_codes,
      UNNEST(SPLIT(order_discount_codes_csv, ',')) AS code_raw
      WHERE TRIM(code_raw) != ''
    )
    SELECT
      discount_code,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue,
      SAFE_DIVIDE(SUM(order_net_revenue), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS aov
    FROM exploded
    GROUP BY 1
    HAVING orders >= 25
    ORDER BY order_net_revenue DESC
    LIMIT 50;
    ```
  </Accordion>

  <Accordion title="Top landing pages for orders missing UTMs">
    **What you'll learn:** Which pages customers land on when they arrive without UTM tracking. Useful for identifying untracked entry points and pages that need better tracking implementation.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=landing_host+landing_path | scope=valid_orders_only_missing_utms
    WITH base AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        sm_order_landing_page
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)'
        AND sm_order_landing_page IS NOT NULL
        AND TRIM(sm_order_landing_page) NOT IN ('', '(none)')
    ),
    parsed AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?([^/?#]+)') AS landing_host,
        REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?[^/?#]+(/[^?#]*)') AS landing_path
      FROM base
    )
    SELECT
      COALESCE(NULLIF(LOWER(TRIM(landing_host)), ''), '(unknown)') AS landing_host,
      COALESCE(NULLIF(landing_path, ''), '/') AS landing_path,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue
    FROM parsed
    GROUP BY 1, 2
    HAVING orders >= 25
    ORDER BY order_net_revenue DESC
    LIMIT 50;
    ```
  </Accordion>

  <Accordion title="Click-id coverage vs UTM coverage (gclid/fbclid)">
    **What you'll learn:** The overlap between UTM tracking and ad-platform click IDs (gclid for Google, fbclid for Meta). Reveals orders where click IDs exist but UTMs don't—potential attribution recovery opportunities.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=utm_coverage+click_id_coverage | grain=week | scope=valid_orders_only
    WITH base AS (
      SELECT
        DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start,
        sm_order_key,
        order_net_revenue,
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_gclid,
        sm_fbclid
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
    )
    SELECT
      week_start,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue,
      COUNTIF(source_medium != '(none) / (none)') AS orders_with_utm_source_medium,
      SAFE_DIVIDE(COUNTIF(source_medium != '(none) / (none)'), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_with_utm_source_medium,
      COUNTIF(sm_gclid IS NOT NULL AND sm_gclid NOT IN ('', '(none)')) AS orders_with_gclid,
      SAFE_DIVIDE(COUNTIF(sm_gclid IS NOT NULL AND sm_gclid NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_with_gclid,
      COUNTIF(sm_fbclid IS NOT NULL AND sm_fbclid NOT IN ('', '(none)')) AS orders_with_fbclid,
      SAFE_DIVIDE(COUNTIF(sm_fbclid IS NOT NULL AND sm_fbclid NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_with_fbclid,
      COUNTIF(source_medium = '(none) / (none)' AND sm_gclid IS NOT NULL AND sm_gclid NOT IN ('', '(none)')) AS utm_missing_but_gclid_orders,
      COUNTIF(source_medium = '(none) / (none)' AND sm_fbclid IS NOT NULL AND sm_fbclid NOT IN ('', '(none)')) AS utm_missing_but_fbclid_orders
    FROM base
    GROUP BY 1
    ORDER BY week_start;
    ```
  </Accordion>

  <Accordion title="Which attribution evidence source is winning?">
    **What you'll learn:** Which capture mechanism is most often selected as the primary traffic source for valid orders. Use this to see whether attribution is coming from checkout attributes, landing-site parsing, website events, GA transactions, or other traffic evidence.

    ```sql theme={null}
    -- Assumptions: timeframe=last_30_days | metric=winning_orders | grain=evidence_source+inference_method+channel | scope=valid_orders_only
    SELECT
      s.evidence_source,
      s.sm_attribution_inference_method,
      COALESCE(NULLIF(LOWER(TRIM(s.sm_marketing_channel)), ''), '(unknown)') AS sm_marketing_channel,
      COUNT(DISTINCT s.sm_order_key) AS winning_orders
    FROM `your_project.sm_transformed_v2.fct_order_attribution_signals` AS s
    INNER JOIN `your_project.sm_transformed_v2.obt_orders` AS o
      ON s.sm_order_key = o.sm_order_key
      AND s.sm_store_id = o.sm_store_id
    WHERE o.is_order_sm_valid = TRUE
      AND o.order_cancelled_at IS NULL
      AND DATE(o.order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
      AND s.evidence_type = 'traffic_source_candidate'
      AND s.sm_utm_final_source_priority = 1
    GROUP BY 1, 2, 3
    ORDER BY winning_orders DESC;
    ```
  </Accordion>

  <Accordion title="Orders missing UTMs but carrying fallback attribution signals">
    **What you'll learn:** For valid orders where the final source/medium is missing, which zero-party, discount-code, or lower-level evidence rows are still available in `fct_order_attribution_signals`. This helps decide whether the issue is capture, normalization, or final source selection.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=evidence_type+evidence_source | scope=valid_orders_only_missing_utms
    WITH missing_utm_orders AS (
      SELECT
        sm_order_key,
        sm_store_id,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)'
    ),
    signal_groups AS (
      SELECT
        sm_order_key,
        sm_store_id,
        evidence_type,
        evidence_source,
        LOGICAL_OR(attribution_signal_type = 'zero_party') AS has_zero_party_signal,
        LOGICAL_OR(attribution_signal_type = 'discount_code') AS has_discount_code_signal,
        LOGICAL_OR(raw_referrer IS NOT NULL OR raw_landing_page_url IS NOT NULL) AS has_traffic_context
      FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
      WHERE evidence_type IN ('traffic_source_candidate', 'zero_party_candidate', 'order_discount_code')
      GROUP BY 1, 2, 3, 4
    )
    SELECT
      s.evidence_type,
      s.evidence_source,
      COUNT(DISTINCT m.sm_order_key) AS orders,
      SUM(m.order_net_revenue) AS order_net_revenue,
      COUNTIF(s.has_zero_party_signal) AS orders_with_zero_party_signal,
      COUNTIF(s.has_discount_code_signal) AS orders_with_discount_code_signal,
      COUNTIF(s.has_traffic_context) AS orders_with_traffic_context
    FROM missing_utm_orders AS m
    INNER JOIN signal_groups AS s
      ON m.sm_order_key = s.sm_order_key
      AND m.sm_store_id = s.sm_store_id
    GROUP BY 1, 2
    HAVING orders >= 10
    ORDER BY order_net_revenue DESC, orders DESC;
    ```
  </Accordion>

  <Accordion title="Trace attribution evidence for one order">
    **What you'll learn:** Every attribution evidence row SourceMedium considered for one order, including raw captured values, canonicalized UTM fields, inference method, and the primary traffic winner. Use this when a stakeholder asks why one order landed in a specific channel.

    ```sql theme={null}
    -- Assumptions: grain=single_order_evidence | scope=one_order
    SELECT
      order_id,
      evidence_type,
      evidence_source,
      evidence_row_id,
      sm_utm_final_source_priority,
      sm_attribution_inference_method,
      raw_utm_source,
      raw_utm_medium,
      raw_referrer,
      raw_landing_page_url,
      sm_utm_source,
      sm_utm_medium,
      sm_utm_source_medium,
      sm_marketing_channel,
      attribution_signal_type,
      attribution_signal_parsed
    FROM `your_project.sm_transformed_v2.fct_order_attribution_signals`
    WHERE sm_store_id = 'your-sm_store_id'
      AND order_id = 'ORDER_ID_HERE'
    ORDER BY
      evidence_type,
      CASE WHEN sm_utm_final_source_priority IS NULL THEN 1 ELSE 0 END,
      sm_utm_final_source_priority,
      evidence_source,
      evidence_row_id;
    ```
  </Accordion>
</AccordionGroup>

<Info>
  These are deeper-dive investigations for when attribution looks “weird” (too much direct/unattributed), or when downstream metrics are being skewed by edge-case orders.
</Info>

<AccordionGroup>
  <Accordion title="$0 / negative net-revenue order share by source/medium (last 90 days)">
    **What you'll learn:** Which source/mediums have an unusually high share of valid orders with `order_net_revenue = 0` (or negative). This often indicates replacements/comp orders or heavy discounts that can skew repeat/retention metrics.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=paid_vs_zero_vs_negative_order_share | grain=source_medium | scope=valid_orders_only
    WITH base AS (
      SELECT
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
    )
    SELECT
      source_medium,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue,
      COUNTIF(order_net_revenue > 0) AS paid_orders,
      SAFE_DIVIDE(COUNTIF(order_net_revenue > 0), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_paid_orders,
      COUNTIF(order_net_revenue = 0) AS zero_net_revenue_orders,
      SAFE_DIVIDE(COUNTIF(order_net_revenue = 0), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_zero_net_revenue_orders,
      COUNTIF(order_net_revenue < 0) AS negative_net_revenue_orders,
      SAFE_DIVIDE(COUNTIF(order_net_revenue < 0), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_negative_net_revenue_orders
    FROM base
    GROUP BY 1
    HAVING orders >= 200
    ORDER BY pct_zero_net_revenue_orders DESC, orders DESC
    LIMIT 50;
    ```
  </Accordion>

  <Accordion title="Unattributed share by source system and sales channel (last 90 days)">
    **What you'll learn:** Where unattributed orders are coming from by commerce platform (`source_system`) and sales channel (`sm_channel`). Some channels (e.g., marketplaces or POS) naturally have lower UTM coverage—this helps separate “expected” vs “broken tracking.”

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=unattributed_share | grain=source_system+sm_channel | scope=valid_orders_only
    WITH base AS (
      SELECT
        COALESCE(NULLIF(LOWER(TRIM(source_system)), ''), '(unknown)') AS source_system,
        COALESCE(NULLIF(LOWER(TRIM(sm_channel)), ''), '(unknown)') AS sm_channel,
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND NULLIF(LOWER(TRIM(source_system)), '') IS NOT NULL
        AND NULLIF(LOWER(TRIM(sm_channel)), '') IS NOT NULL
    )
    SELECT
      source_system,
      sm_channel,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue,
      COUNTIF(source_medium = '(none) / (none)') AS unattributed_orders,
      SAFE_DIVIDE(COUNTIF(source_medium = '(none) / (none)'), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_unattributed,
      SAFE_DIVIDE(
        SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END),
        NULLIF(SUM(order_net_revenue), 0)
      ) AS pct_revenue_unattributed
    FROM base
    GROUP BY 1, 2
    HAVING orders >= 200
    ORDER BY pct_revenue_unattributed DESC, orders DESC
    LIMIT 100;
    ```
  </Accordion>

  <Accordion title="Top landing pages for direct traffic orders (last 90 days)">
    **What you'll learn:** Which landing pages are most associated with “direct” orders (based on `sm_utm_source_medium`)—and whether landing page capture is missing. This helps diagnose tracking gaps (e.g., missing UTMs or missing landing-page capture on key entry flows).

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=landing_host+landing_path | scope=valid_orders_only_direct
    WITH base AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        NULLIF(TRIM(sm_order_landing_page), '') AS sm_order_landing_page
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') IN ('(direct) / (none)', 'direct / (none)')
    ),
    parsed AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?([^/?#]+)') AS landing_host,
        REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?[^/?#]+(/[^?#]*)') AS landing_path
      FROM base
    )
    SELECT
      COALESCE(NULLIF(LOWER(TRIM(landing_host)), ''), '(missing_landing_page)') AS landing_host,
      COALESCE(NULLIF(landing_path, ''), '(missing_landing_page)') AS landing_path,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue
    FROM parsed
    GROUP BY 1, 2
    HAVING orders >= 25
    ORDER BY order_net_revenue DESC
    LIMIT 50;
    ```
  </Accordion>

  <Accordion title="Attribution health trend with week-over-week deltas (weekly)">
    **What you'll learn:** A “tracking regression detector”: week-over-week changes in unattributed/direct order share and revenue share. Sudden jumps typically indicate tagging/measurement changes.

    ```sql theme={null}
    -- Assumptions: timeframe=last_26_weeks | metric=utm_coverage+direct_share+unattributed_share+wow_deltas | grain=week | scope=valid_orders_only
    WITH base AS (
      SELECT
        DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start,
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY)
    ),
    weekly AS (
      SELECT
        week_start,
        COUNT(DISTINCT sm_order_key) AS orders,
        SUM(order_net_revenue) AS order_net_revenue,
        COUNTIF(source_medium = '(none) / (none)') AS unattributed_orders,
        SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END) AS unattributed_revenue,
        COUNTIF(source_medium IN ('(direct) / (none)', 'direct / (none)')) AS direct_orders,
        SUM(CASE WHEN source_medium IN ('(direct) / (none)', 'direct / (none)') THEN order_net_revenue ELSE 0 END) AS direct_revenue,
        COUNTIF(source_medium != '(none) / (none)') AS orders_with_utm_source_medium
      FROM base
      GROUP BY 1
    ),
    metrics AS (
      SELECT
        week_start,
        orders,
        order_net_revenue,
        SAFE_DIVIDE(orders_with_utm_source_medium, NULLIF(orders, 0)) AS pct_orders_with_utm_source_medium,
        SAFE_DIVIDE(unattributed_orders, NULLIF(orders, 0)) AS pct_orders_unattributed,
        SAFE_DIVIDE(unattributed_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_unattributed,
        SAFE_DIVIDE(direct_orders, NULLIF(orders, 0)) AS pct_orders_direct,
        SAFE_DIVIDE(direct_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_direct
      FROM weekly
    )
    SELECT
      week_start,
      orders,
      order_net_revenue,
      pct_orders_with_utm_source_medium,
      pct_orders_unattributed,
      pct_orders_unattributed - LAG(pct_orders_unattributed) OVER (ORDER BY week_start) AS delta_pct_orders_unattributed,
      pct_revenue_unattributed,
      pct_revenue_unattributed - LAG(pct_revenue_unattributed) OVER (ORDER BY week_start) AS delta_pct_revenue_unattributed,
      pct_orders_direct,
      pct_orders_direct - LAG(pct_orders_direct) OVER (ORDER BY week_start) AS delta_pct_orders_direct,
      pct_revenue_direct,
      pct_revenue_direct - LAG(pct_revenue_direct) OVER (ORDER BY week_start) AS delta_pct_revenue_direct
    FROM metrics
    ORDER BY week_start;
    ```
  </Accordion>

  <Accordion title="UTM source/medium discovery (top normalized values, last 90 days)">
    **What you'll learn:** What your UTM `source/medium` values actually look like in practice (normalized with `LOWER(TRIM())`). Use this to discover the exact strings you should filter on—without guessing.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=source_medium | scope=valid_orders_only
    WITH base AS (
      SELECT
        COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium,
        sm_order_key,
        order_net_revenue
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
    )
    SELECT
      source_medium,
      COUNT(DISTINCT sm_order_key) AS orders,
      SUM(order_net_revenue) AS order_net_revenue,
      SAFE_DIVIDE(COUNT(DISTINCT sm_order_key), NULLIF(SUM(COUNT(DISTINCT sm_order_key)) OVER (), 0)) AS pct_orders,
      SAFE_DIVIDE(SUM(order_net_revenue), NULLIF(SUM(SUM(order_net_revenue)) OVER (), 0)) AS pct_revenue
    FROM base
    GROUP BY 1
    ORDER BY orders DESC
    LIMIT 50;
    ```
  </Accordion>

  <Accordion title="Join-key coverage trend (weekly missing customer keys + missing SKUs)">
    **What you'll learn:** Whether key join fields are getting worse over time. Spikes in missing `sm_customer_key` (orders) or missing `sku` (order lines) will break customer-level and product-level analysis.

    ```sql theme={null}
    -- Assumptions: timeframe=last_26_weeks | metric=missing_key_trends | grain=week | scope=valid_orders_only
    WITH orders_weekly AS (
      SELECT
        DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start,
        COUNT(*) AS orders_total,
        COUNTIF(sm_customer_key IS NULL) AS orders_missing_customer_key
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY)
      GROUP BY 1
    ),
    lines_weekly AS (
      SELECT
        DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start,
        COUNT(*) AS lines_total,
        COUNTIF(sku IS NULL OR TRIM(sku) = '' OR LOWER(sku) = 'missing sku') AS lines_missing_sku
      FROM `your_project.sm_transformed_v2.obt_order_lines`
      WHERE is_order_sm_valid = TRUE
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY)
      GROUP BY 1
    )
    SELECT
      o.week_start,
      o.orders_total,
      o.orders_missing_customer_key,
      SAFE_DIVIDE(o.orders_missing_customer_key, NULLIF(o.orders_total, 0)) AS pct_orders_missing_customer_key,
      l.lines_total,
      l.lines_missing_sku,
      SAFE_DIVIDE(l.lines_missing_sku, NULLIF(l.lines_total, 0)) AS pct_lines_missing_sku
    FROM orders_weekly o
    LEFT JOIN lines_weekly l
      USING (week_start)
    ORDER BY o.week_start;
    ```
  </Accordion>

  <Accordion title="Multiple discount codes prevalence (double-counting risk, last 90 days)">
    **What you'll learn:** How often orders have multiple discount codes applied. This matters because any “revenue by discount code” view will double-count revenue across codes when multiple codes exist.

    ```sql theme={null}
    -- Assumptions: timeframe=last_90_days | metric=discount_code_multiplicity | grain=code_count_bucket | scope=valid_orders_only
    WITH base AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        order_discount_codes_csv
      FROM `your_project.sm_transformed_v2.obt_orders`
      WHERE is_order_sm_valid = TRUE
        AND order_cancelled_at IS NULL
        AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
        AND order_discount_codes_csv IS NOT NULL
        AND TRIM(order_discount_codes_csv) NOT IN ('', '(none)')
    ),
    code_counts AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        ARRAY_LENGTH(
          ARRAY(
            SELECT TRIM(code_raw)
            FROM UNNEST(SPLIT(order_discount_codes_csv, ',')) AS code_raw
            WHERE TRIM(code_raw) != ''
          )
        ) AS code_count
      FROM base
    ),
    bucketed AS (
      SELECT
        sm_order_key,
        order_net_revenue,
        LEAST(code_count, 5) AS code_count_bucket
      FROM code_counts
    )
    SELECT
      CASE code_count_bucket
        WHEN 5 THEN '5+'
        ELSE CAST(code_count_bucket AS STRING)
      END AS code_count_bucket,
      COUNT(DISTINCT sm_order_key) AS orders,
      SAFE_DIVIDE(COUNT(DISTINCT sm_order_key), NULLIF(SUM(COUNT(DISTINCT sm_order_key)) OVER (), 0)) AS pct_orders,
      SUM(order_net_revenue) AS order_net_revenue,
      SAFE_DIVIDE(SUM(order_net_revenue), NULLIF(SUM(SUM(order_net_revenue)) OVER (), 0)) AS pct_revenue
    FROM bucketed
    GROUP BY 1, code_count_bucket
    ORDER BY code_count_bucket;
    ```
  </Accordion>
</AccordionGroup>

***

## Related Categories

Back to the [SQL Query Library](/data-activation/template-resources/sql-query-library).

<CardGroup cols={2}>
  <Card title="Marketing & Ads" icon="chart-line" href="/data-activation/template-resources/sql-query-library/marketing-and-ads">
    Use clean attribution signals to evaluate paid channel efficiency.
  </Card>

  <Card title="Journeys & Lead Capture" icon="route" href="/data-activation/template-resources/sql-query-library/journeys-and-lead-capture">
    Analyze touchpoint sequencing once data coverage is validated.
  </Card>

  <Card title="Funnel" icon="filter" href="/data-activation/template-resources/sql-query-library/funnel">
    Investigate funnel changes with confidence in tracking quality.
  </Card>
</CardGroup>
