
2026-04-23 / 9 MIN READ

Reconciling Shopify, GA4, and Meta: the forensic workflow

Field notes on the forensic workflow for reconciling Shopify, GA4, and Meta. Diff windows, join keys, the five bugs that hide in plain sight.

The most common question I get at a new DTC engagement is some version of "why do our numbers not match." Shopify says one thing, GA4 says another, Meta says a third. Nobody fully trusts any of them. The team has been defaulting to whichever number makes the quarterly look best, which is a scary way to run a business.

These are field notes on the forensic workflow I actually run. It is not a theoretical framework; it is the order of operations I have walked through at least a dozen times. At the end you have a defensible picture of how far off each platform is and which one is drifting.

For the conceptual side of why the numbers disagree, the sibling article reconciling GA4, Meta, and Shopify purchase counts covers the five sources of variance. This piece is the hands-on workflow. Fits into the warehouse-first analytics rebuild hub as the diagnostic step before any rebuild.

[Chart: GA4 vs Shopify daily gap %, 90 days, plotted against a healthy band. The gap steps from ~9% to ~24% on day 58; the team traces the regression to a tag manager update shipped that day.]

The drift date is the artifact. Absolute numbers are less useful than the timeline.

Step 1: pick the window

Before pulling any data, pick a window that is free of known anomalies. Not the week you ran a flash sale. Not the week the Meta API had an outage. Not the week you launched a new product. A boring week two to three weeks back is ideal.

30 days is the default window I use: long enough to smooth out single-day weirdness, short enough that attribution windows are fully closed by the time you pull. Shopify orders complete within minutes; GA4 event processing closes within 24 to 48 hours; Meta attribution closes at 7 days under the default 7-day click, 1-day view setting.

Pull data with an explicit start and end timestamp in UTC. Platform default timezones are a surprisingly common cause of reconciliation drift; GA4 and Meta both show data in the property's timezone by default, which may or may not match Shopify's.
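A minimal sketch (plain Node, no libraries) of pinning both window boundaries to UTC, so the pull is identical no matter what timezone the machine or the platform UI is in:

```javascript
// Build an explicit UTC window for all three pulls.
// Dates here match the article's example window; adjust as needed.
function utcWindow(startDate, days) {
  // Parse as UTC midnight, not local time.
  const start = new Date(`${startDate}T00:00:00Z`);
  const end = new Date(start.getTime() + days * 24 * 60 * 60 * 1000);
  return { startUtc: start.toISOString(), endUtc: end.toISOString() };
}

const win = utcWindow("2026-03-24", 30);
// win.startUtc === "2026-03-24T00:00:00.000Z"
// win.endUtc   === "2026-04-23T00:00:00.000Z"
```

Millisecond arithmetic on UTC timestamps also sidesteps DST edge cases that local-date math trips over.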

Step 2: pull Shopify, first

Shopify is the anchor. It is the only platform that reports what actually happened (a payment processed). Pull the orders file with explicit filters.

Export: Orders
Date range: 2026-03-24 00:00 UTC to 2026-04-23 00:00 UTC
Financial status: paid
Source name: web (exclude POS, draft, wholesale)
Fulfillment status: any

Save to a spreadsheet with columns: order_id, order_number, created_at, total_price, currency, source_name, customer_id. Add a derived date_utc column that normalizes created_at to UTC.

Count the rows: that is your Shopify number. A typical mid-market window looks like 1,483 orders, $142,089 in revenue, 972 unique customers.
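The date_utc derivation is one line; a sketch, assuming Shopify's created_at carries the shop's timezone offset (it does in the orders export):

```javascript
// Derive a date_utc column from Shopify's created_at, which carries
// the shop's timezone offset (e.g. "2026-03-24T19:05:00-05:00").
function dateUtc(createdAt) {
  // Date parses the offset; toISOString renders in UTC.
  return new Date(createdAt).toISOString().slice(0, 10);
}

// An order placed late evening shop-time lands on the next UTC day:
dateUtc("2026-03-24T19:05:00-05:00"); // → "2026-03-25"
```

Orders near midnight are exactly where daily counts drift between platforms, so this column matters more than it looks.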

Step 3: pull GA4 purchase events

Pull GA4 purchase events for the same window. Use the GA4 API (Google Analytics Data API), not the UI. The UI applies sampling above certain thresholds and you want the unsampled export.

// Sample runReport call via the Node client (@google-analytics/data)
const { BetaAnalyticsDataClient } = require("@google-analytics/data");
const client = new BetaAnalyticsDataClient();

const request = {
  property: "properties/XXXXXXXXX",
  dimensions: [
    { name: "date" },
    { name: "sessionSource" },
    { name: "sessionMedium" },
  ],
  metrics: [
    { name: "purchases" },
    { name: "purchaseRevenue" },
    { name: "totalUsers" },
  ],
  dateRanges: [{ startDate: "2026-03-24", endDate: "2026-04-22" }],
};

const [response] = await client.runReport(request);

Save to a spreadsheet and sum the purchases column: that is your GA4 count. For the same window, expect roughly 85 to 95 percent of the Shopify count. If it is materially lower (say 75 percent), something is eating GA4 events: consent mode gating, a broken pixel, or a double-count that Shopify sees and GA4 does not.
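Summing from the API response directly is safer than re-typing into a sheet. A sketch, assuming purchases is the first metric in the request (the Data API returns metric values as strings):

```javascript
// Total the purchases metric across runReport response rows.
// Assumes "purchases" is the first metric in the request;
// the Data API returns metric values as strings.
function totalPurchases(rows) {
  return rows.reduce(
    (sum, row) => sum + Number(row.metricValues[0].value),
    0
  );
}

const rows = [
  { metricValues: [{ value: "41" }] },
  { metricValues: [{ value: "38" }] },
];
totalPurchases(rows); // → 79
```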

Step 4: pull Meta purchases

Pull Meta purchase conversions via the Marketing API, filtered to your pixel.

Marketing API: /act_ACCOUNT_ID/insights
Fields: actions (filter to purchase), action_values (purchase), impressions, clicks, spend
Date preset: custom, matching the window
Attribution: 7-day click + 1-day view (default)
Breakdown: by campaign

Save to a spreadsheet. Sum the purchase count and value: that is your Meta count. For the same window, expect Meta-attributed purchases to be 10 to 30 percent of the Shopify total (Meta only gets credit for Meta-attributed orders, a subset of all orders).
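Extracting the purchase count from an insights row takes one helper. A sketch: the actions array mixes many action_types, and depending on setup the purchase event may appear as "purchase" or "offsite_conversion.fb_pixel_purchase", so check your own payload before trusting the default here:

```javascript
// Pull the purchase count out of a Marketing API insights row.
// The action_type name varies by setup ("purchase" vs
// "offsite_conversion.fb_pixel_purchase"); pick ONE to avoid
// double counting if both appear.
function purchaseCount(row, type = "purchase") {
  const match = (row.actions || []).find((a) => a.action_type === type);
  return match ? Number(match.value) : 0;
}

const row = {
  actions: [
    { action_type: "purchase", value: "12" },
    { action_type: "link_click", value: "340" },
  ],
};
purchaseCount(row); // → 12
```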

Step 5: diff them side by side

Put four columns in one spreadsheet: date, Shopify count, GA4 count, Meta count (Meta-attributed only). Compute the percentage variance for each platform. Sort by date.

The exact query I run in BigQuery when the warehouse exists:

SELECT
  date,
  shopify_orders,
  ga4_purchases,
  meta_purchases,
  ROUND(100.0 * (shopify_orders - ga4_purchases)
    / NULLIF(shopify_orders, 0), 1) AS ga4_gap_pct,
  ROUND(100.0 * (shopify_orders - meta_purchases)
    / NULLIF(shopify_orders, 0), 1) AS meta_gap_pct
FROM `project.marts.daily_reconciliation`
WHERE date BETWEEN '2026-03-24' AND '2026-04-22'
ORDER BY date;

What to look for:

  • GA4 gap 5-15 percent: healthy, expected cookie loss and consent gating.
  • GA4 gap above 20 percent: broken implementation, likely double-fire or missing tag.
  • GA4 gap below 2 percent: a dedup failure somewhere; GA4 should never equal Shopify exactly.
  • Meta gap 60-90 percent: healthy, Meta only sees Meta-attributed orders.
  • Meta gap below 60 percent: Meta is over-attributing, probably a CAPI dedup issue.
  • Meta count above Shopify: definitely a CAPI dedup issue. See the field guide to Meta CAPI for the event_id fix.
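The bands above can be encoded as a quick triage function. A sketch; the thresholds are this article's rules of thumb, not universal constants, and the in-between zones (2-5% and 15-20% for GA4) are judgment calls:

```javascript
// Triage the daily gap percentages against the article's bands.
// Thresholds are rules of thumb, not universal constants.
function classifyGa4Gap(gapPct) {
  if (gapPct < 2) return "dedup failure: GA4 should never match Shopify exactly";
  if (gapPct <= 15) return "healthy: expected cookie loss and consent gating";
  if (gapPct <= 20) return "borderline: watch the trend";
  return "broken implementation: likely double-fire or missing tag";
}

function classifyMetaGap(gapPct) {
  if (gapPct < 0) return "CAPI dedup issue: Meta counts exceed Shopify";
  if (gapPct < 60) return "over-attributing: probably a CAPI dedup issue";
  if (gapPct <= 90) return "healthy: Meta only sees Meta-attributed orders";
  return "low attribution: check pixel coverage";
}
```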

Step 6: find the drift date

The most useful step in the workflow. Plot the daily gap over time and look for the day the variance pattern changed.

If GA4 was landing at 10 percent below Shopify for six months and suddenly dropped to 25 percent below, something broke on that day. Find what changed: a theme update, a tag manager change, a new consent banner deployment, an iOS release.

The date is usually not a secret. Someone on the team remembers shipping the thing. The reconciliation output surfaces exactly when, which makes the conversation productive instead of vague.
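Finding the step programmatically is a few lines. A crude sketch: flag the first day whose gap jumps more than a threshold (in percentage points, an assumption you should tune) above the trailing 7-day average:

```javascript
// Find the first day whose gap jumps more than `threshold`
// percentage points above the trailing average. A crude step
// detector, but enough to surface the regression date.
function findDriftDay(gaps, threshold = 8, lookback = 7) {
  for (let i = lookback; i < gaps.length; i++) {
    const window = gaps.slice(i - lookback, i);
    const avg = window.reduce((a, b) => a + b, 0) / lookback;
    if (gaps[i] - avg > threshold) return i;
  }
  return -1; // no step found
}

// Gap holds near 10%, then steps to ~24% at index 8:
findDriftDay([10, 9, 11, 10, 10, 9, 10, 11, 24, 25]); // → 8
```

Eyeballing the chart usually works too; the function is for the nightly automated version of the check.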

Step 7: chase the root cause

By this point the workflow has narrowed the problem to one or two of the five variance sources. The fixes depend on which source.

GA4 gap too big: audit the tag manager. Run the parallel-tag approach from the GA4 migration playbook. Check consent mode v2 wait-for-update timing.

Meta over-attributing: the event_id setup is broken. See the CAPI field guide. Check Test Events for duplicate entries.

Shopify counts POS or subscription rebills: filter to source_name = 'web' and exclude subscription rebill orders. Tag the staging model appropriately.

Currency or refund handling drift: reconcile the revenue column before the count column. Currency conversion drift usually shows up as a consistent 1-3 percent variance that never resolves; refund handling shows up as a long tail of orders appearing in Shopify but not in the platform reports.

The reconciliation as a recurring mart model

Once the warehouse exists, this entire workflow becomes a single mart model that runs nightly.

-- models/marts/daily_reconciliation.sql
WITH shopify AS (
  SELECT
    DATE(created_at) AS date,
    COUNT(DISTINCT order_id) AS shopify_orders,
    SUM(total_price) AS shopify_revenue
  FROM {{ ref('stg_shopify__orders') }}
  WHERE financial_status = 'paid'
    AND source_name = 'web'
  GROUP BY 1
),
ga4 AS (
  SELECT
    DATE(event_date) AS date,
    COUNTIF(event_name = 'purchase') AS ga4_purchases,
    SUM(CASE WHEN event_name = 'purchase' THEN event_value END)
      AS ga4_revenue
  FROM {{ ref('stg_ga4__events') }}
  GROUP BY 1
),
meta AS (
  SELECT
    date,
    SUM(purchases) AS meta_purchases,
    SUM(purchase_value) AS meta_revenue
  FROM {{ ref('stg_meta_ads__daily') }}
  GROUP BY 1
)
SELECT
  s.date,
  s.shopify_orders,
  g.ga4_purchases,
  m.meta_purchases,
  s.shopify_revenue,
  g.ga4_revenue,
  m.meta_revenue
FROM shopify s
LEFT JOIN ga4 g USING (date)
LEFT JOIN meta m USING (date)
ORDER BY s.date DESC

Pin this mart into a Looker Studio page. Add alert rules on the variance columns. A 7-day moving gap outside the healthy band triggers a Slack notification. That is the whole ongoing workflow after the initial rebuild.
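The alert rule itself is small. A sketch of the moving-gap check, with the healthy band from the bullets above as the default and the Slack call left out:

```javascript
// Compute the 7-day moving gap and flag it when it leaves the
// healthy band. Band values are the article's rules of thumb.
function gapAlert(dailyGaps, band = { min: 5, max: 15 }) {
  if (dailyGaps.length < 7) return null; // not enough history yet
  const last7 = dailyGaps.slice(-7);
  const avg = last7.reduce((a, b) => a + b, 0) / 7;
  return { movingGap: avg, breach: avg < band.min || avg > band.max };
}

gapAlert([9, 10, 11, 10, 9, 10, 11]); // → { movingGap: 10, breach: false }
```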

The first reconciliation is a forensic exercise; after that it is a monitoring job. Operators who do not automate the second step are doomed to repeat the first every quarter.

FAQ

How long does the first reconciliation take?

The first time, about a full day of spreadsheet work. The second time, with the warehouse in place, it is a 10-minute SQL query against a pre-built mart. The leverage is all in automating the ongoing check, not in doing the one-time diff.

What if my Meta Ads Manager and my pixel disagree with each other?

That is a different problem from reconciliation. Meta's Ads Manager shows pixel + CAPI events; the Marketing API sometimes exposes them separately. If Ads Manager and the API disagree materially, you have a reporting-side bug in how Meta is serving its own data. Contact Meta support; this is rare but happens.

Should I reconcile monthly or weekly?

Weekly during initial rebuild and anything requiring an implementation change. Monthly after the stack is stable. The variance bands move around a bit week over week; monthly averages are more stable and a week-over-week chase can become a time sink.

How do I handle multi-currency reconciliation?

Reconcile in a single base currency, converting at the order's ledger rate (the rate Shopify used when the order processed). Do not convert using a spot rate at query time; you will introduce a variance that looks like a reconciliation bug and is actually a bookkeeping artifact.
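In code the distinction is just which rate you multiply by. A sketch with illustrative field names (total_price, ledger_rate are placeholders for however your staging model stores them):

```javascript
// Convert order revenue at the ledger rate stored with the order,
// never a spot rate at query time. Field names are illustrative.
function toBaseCurrency(order) {
  // order.ledger_rate is the rate applied when the payment
  // processed, captured at ingestion time.
  return order.total_price * order.ledger_rate;
}

toBaseCurrency({ total_price: 100, ledger_rate: 1.25 }); // → 125
```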

What does 'healthy' look like for a brand under $500K/yr revenue?

Samples get noisy below a few hundred orders per month. A brand doing 50 orders per month will see 10-20 percent variance from random variation alone, even on a clean stack. The workflow above works best for brands doing 500+ orders per month; below that, trust the direction more than the exact numbers.

What to try this week

Block two hours. Pull Shopify, GA4, and Meta for the same 30-day window. Put them in one spreadsheet. Compute the variance per platform. If the numbers fit the healthy bands, your stack is broadly working and you can pick which deeper dashboard to rebuild next. If the numbers are outside the bands, you have the specific diagnostic to act on.

If you do not have the time or context to run this manually, a DTC Stack Audit runs this exact workflow plus 20+ other checks and delivers a report naming the drift date and the root cause.


Let us talk

If something in here connected, feel free to reach out. No pitch deck, no intake form. Just a direct conversation.

Get in touch