A DTC operator I worked with wanted Q1 2025 versus Q1 2024 in their board deck. Q1 2024 lived in Universal Analytics. Q1 2025 lived in GA4. The two systems counted different things in different shapes, and the deck slipped a week while the team tried to make the numbers match. They could not, because the comparison they wanted does not exist inside GA4.
By the end of this walkthrough you will have a year-over-year report that survives the GA4 migration, anchored on Shopify orders, with UA history and GA4 history harmonized in a BigQuery view you control. You will also have a clear list of dimensions that did not survive and a defensible answer for the CEO when they ask why bounce rate disappeared.
I built this pattern most recently at a mid-market DTC operator in Q1 2026, as part of the multi-service pipeline that fed a handful of production dashboards. The same shape works for any Shopify store with the GA4 migration year-over-year data problem to solve before the next quarterly deck.
Why GA4 broke your year-over-year report
Universal Analytics and GA4 are not the same product with a new logo. They have different data models, and the difference is structural.
UA was session-and-hit scoped. Every interaction was a hit, attached to a session, attached to a user. GA4 is event-scoped. Everything is an event with parameters, and sessions are derived after the fact via session_start and ga_session_id. A SQL query that worked on the UA BigQuery export breaks on the GA4 export because the table shape changed.
UA stopped processing new data on 2024-07-01. Any year-over-year question crossing that date is asking about two different counting systems, not two periods of one system. Consent mode v2 widens the gap further: GA4 observes a smaller share of conversions every reporting cycle as more browsers default to consent gates. A flat YoY number in the GA4 UI can hide actual growth that consent gating ate.
There is no good ending inside the GA4 UI. You cannot import UA history into GA4. The GA4 property starts when you created it, which for most operators was sometime in 2023 in a panic. The comparison you want has to happen outside both products.
Prerequisites for the warehouse bridge
Before any SQL runs, four things need to be in place.
A warehouse. BigQuery is the easiest answer because GA4's native BigQuery export is free for the standard property and the on-demand pricing keeps quarterly costs under $100 for a typical mid-market DTC store. Snowflake or Postgres work too with minor dialect changes. I cover the BigQuery setup that works on a sub-$1,000 monthly budget if you do not have a warehouse yet.
A UA archive. This is the time-pressured prerequisite. If you still have a 360 BigQuery export, that is the gold standard. If you only had a standard UA property, the data is locked behind whatever export you took before the cutoff. If you took no export at all, you have CSV downloads and Looker Studio reports, which constrains how granular YoY can be.
Shopify orders already in the warehouse. The whole pattern anchors on Shopify as the financial source of truth. The warehouse first reporting pattern that anchors on Shopify orders is the parent piece of this tutorial.
A decision about which metrics survive the boundary. Sessions, users, conversion rate, and source/medium roll-ups can be approximated. Hit-level engagement metrics, demographics, and bounce rate cannot. Make the call now so you do not spend a sprint trying to bridge a metric you are going to abandon anyway.
Step 1: Pull the UA archive while you still can
Even now, in mid-2026, I run into operators who never took a UA export. If you have not pulled it yet, do that this week.
The 360 BigQuery export is the gold standard. If you had a 360 property, you have a daily table with hit-level granularity. Confirm the export still exists in your linked GCP project and copy it to a permanent dataset in your own project so it cannot be revoked.
CREATE TABLE `your_project.ua_archive.sessions_2023_2024` AS
SELECT *
FROM `original_project.123456789.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20240630';
If you only have a standard property and pre-cutoff CSV exports or Supermetrics-style scrapes, land those into a ua_archive.daily_summary table with date, source, medium, sessions, transactions, and revenue. Session granularity is gone; daily roll-ups are what you have. GA4 will need to roll up to that grain to match.
Whichever source you have, normalize the column names early. The UA archive table should have a stable schema you control, not whatever the original export shape was.

CREATE OR REPLACE VIEW `your_project.ua_archive.daily_normalized` AS
SELECT
DATE(visitStartTime, 'America/Los_Angeles') AS day,
trafficSource.source AS source,
trafficSource.medium AS medium,
channelGrouping AS channel_grouping,
COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitId AS STRING))) AS sessions,
COUNT(DISTINCT IF(totals.transactions > 0, CONCAT(fullVisitorId, CAST(visitId AS STRING)), NULL)) AS converting_sessions,
SUM(totals.totalTransactionRevenue) / 1000000 AS revenue
FROM `your_project.ua_archive.sessions_2023_2024`
GROUP BY day, source, medium, channel_grouping;
Anchor on dates in your reporting timezone, not UTC. UA stored everything in the property's reporting timezone, and if you forget to declare it explicitly the YoY query will silently shift by hours.
Step 2: Land GA4 BigQuery export and harmonize
GA4's BigQuery export is free for the standard property as of 2026, with a 1 million event per day cap. Most DTC stores under $10M annual revenue stay under that.
The export lands in analytics_<property_id>.events_YYYYMMDD partitioned tables. Each row is one event, and event parameters are nested in the event_params array. The first thing every GA4 query does is unnest those params into flat columns.
CREATE OR REPLACE VIEW `your_project.ga4.daily_normalized` AS
SELECT
PARSE_DATE('%Y%m%d', event_date) AS day,
COALESCE(
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source'),
traffic_source.source
) AS source,
COALESCE(
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium'),
traffic_source.medium
) AS medium,
COUNT(DISTINCT CONCAT(
user_pseudo_id,
CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)
)) AS sessions,
COUNTIF(event_name = 'purchase') AS purchases,
SUM(IF(event_name = 'purchase',
(SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'value'),
0
)) AS revenue
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY day, source, medium;
Three things to notice. GA4 sessions are derived from ga_session_id per user_pseudo_id. The COUNT(DISTINCT CONCAT(...)) pattern reconstructs the session count, but it will not match GA4's UI session count exactly because the UI applies engagement rules the raw export does not. Document the difference, do not chase it.
Source and medium fall back to traffic_source.* for first_open events and edge cases. The event_params lookup gets the click-time source for the event itself, which is what UA was reporting. Use both with a COALESCE, in that order.
event_date is in the property's reporting timezone, but event_timestamp is UTC microseconds. Mixing them without converting will drift your YoY day boundaries by up to a day at year ends. Use event_date for daily roll-ups.
Step 3: Build the bridge view in SQL
Now both sides exist as daily_normalized views with the same column names. The bridge is a UNION ALL with a system column that tags each row, plus a few harmonization touches.
CREATE OR REPLACE VIEW `your_project.warehouse.traffic_unified` AS
SELECT
day,
'UA' AS system,
source,
medium,
-- Bring UA channel grouping into a GA4-style label so the report rolls up cleanly
CASE channel_grouping
WHEN 'Paid Search' THEN 'Paid Search'
WHEN 'Display' THEN 'Display'
WHEN 'Organic Search' THEN 'Organic Search'
WHEN 'Direct' THEN 'Direct'
WHEN 'Social' THEN 'Organic Social'
WHEN 'Paid Social' THEN 'Paid Social'
WHEN 'Email' THEN 'Email'
WHEN 'Referral' THEN 'Referral'
ELSE 'Unassigned'
END AS channel_grouping_unified,
sessions,
revenue
FROM `your_project.ua_archive.daily_normalized`
UNION ALL
SELECT
day,
'GA4' AS system,
source,
medium,
-- GA4 channel grouping rules applied at query time; this is the canonical mapping
CASE
WHEN medium = 'cpc' AND source LIKE '%google%' THEN 'Paid Search'
WHEN medium = 'cpc' THEN 'Paid Search'
WHEN medium IN ('display', 'banner', 'expandable') THEN 'Display'
WHEN medium = 'organic' THEN 'Organic Search'
WHEN medium = '(none)' OR source = '(direct)' THEN 'Direct'
WHEN source IN ('facebook', 'instagram', 'tiktok', 'twitter') AND medium NOT IN ('cpc', 'paid') THEN 'Organic Social'
WHEN source IN ('facebook', 'instagram', 'tiktok') AND medium IN ('cpc', 'paid') THEN 'Paid Social'
WHEN medium = 'email' THEN 'Email'
WHEN medium = 'referral' THEN 'Referral'
ELSE 'Unassigned'
END AS channel_grouping_unified,
sessions,
revenue
FROM `your_project.ga4.daily_normalized`;
That view is the bridge. It declares which system each row came from so any consumer can opt into one side or both, and the system column makes it explicit when displaying YoY that a 2024 UA number is structurally different from a 2025 GA4 number. It harmonizes channel grouping into one label across both sides; UA called organic and paid social the same, GA4 splits them, and the CASE statements reconcile that. It does not pretend the session counts mean the same thing, because they do not. The bridge exposes both honestly. The next step is what makes the YoY actually trustworthy.
Step 4: Anchor YoY on Shopify, not on either GA4 property
This is the move that makes the report defensible. Stop asking GA4 to be the source of truth for revenue. Ask it to be a directional shape for traffic and let Shopify carry the financial number.
Shopify orders are date-comparable across the UA-to-GA4 boundary because Shopify did not change its data model in mid-2024. An order from June 2024 has the same shape as an order from May 2025. Joining traffic shape onto Shopify revenue gives you a comparison where the revenue side is rock-solid and the traffic side is transparent about its limits. The full version of this pattern lives in the warehouse first reporting layer that this anchors on.
CREATE OR REPLACE VIEW `your_project.warehouse.yoy_report` AS
WITH shopify_daily AS (
SELECT
DATE(processed_at, 'America/Los_Angeles') AS day,
COUNT(DISTINCT order_id) AS orders,
SUM(total_price) AS revenue,
SUM(IF(financial_status = 'refunded', total_price, 0)) AS refunded_revenue
FROM `your_project.shopify.orders`
WHERE financial_status IN ('paid', 'partially_refunded', 'refunded')
GROUP BY day
),
traffic_daily AS (
SELECT
day,
system,
SUM(sessions) AS sessions
FROM `your_project.warehouse.traffic_unified`
GROUP BY day, system
)
SELECT
s.day,
EXTRACT(YEAR FROM s.day) AS year,
EXTRACT(QUARTER FROM s.day) AS quarter,
s.orders,
s.revenue - s.refunded_revenue AS net_revenue,
t.sessions AS sessions_reported,
t.system AS traffic_system
FROM shopify_daily s
LEFT JOIN traffic_daily t USING (day);
Shopify supplies the revenue side cleanly. The unified traffic view supplies a sessions number with an explicit traffic_system flag so the consumer knows whether it came from UA or GA4. A board chart can show net revenue YoY with full confidence and traffic YoY with a caveat that the counting system changed.
Conversion rate (orders divided by sessions) is the most fragile metric across the boundary, because the denominator changed shape. If you must report it, present it as two series rather than one continuous line, and label the break at the system change. If the YoY conversation is really about retention rather than acquisition, the BigQuery joins for cohort retention against Shopify orders survive the migration cleanly because they live entirely on the Shopify side.

“Shopify orders are date-comparable across the UA-to-GA4 boundary because Shopify did not change its data model in mid-2024.
”
What to abandon entirely
This is the part most operators resist, and it is the part that separates a real fix from a year of fighting the wrong battle.
Bounce rate is gone. GA4 reports engagement rate, which is built on a different definition (engaged sessions over total sessions, where engaged means more than 10 seconds, multiple pageviews, or a conversion). The two are not interconvertible. Stop showing bounce rate YoY. Redirect the team to engagement rate as a fresh baseline starting at the GA4 launch date.
Hit-level engagement metrics that depended on UA's session structure (avg session duration, pages per session in the old shape) do not translate cleanly. GA4 has its own engaged-session metrics, but they count differently. Pick one, declare the start date, and stop trying to bridge.
Demographics drift hard. UA's demographic data was based on Google's ad-network signals, which changed substantially in 2024 with the privacy and consent shifts. A YoY demographic comparison is comparing different datasets, not different periods.
Goal completions versus conversions. UA goals were a configured construct; GA4 key events are events flagged as important. If your team had a "Newsletter Signup" goal in UA and a sign_up event in GA4, audit the definitions before any YoY claim about conversions.
User counts. UA users were based on clientId; GA4 users are based on user_pseudo_id. Both are cookie-based, but they count differently across cross-domain and cross-device. User-level YoY is the metric most likely to look stable when it is actually drifting. Treat it as a ratio over orders, not as an absolute count.
The pattern across all of these is the same: when the metric has a different definition in the two systems, declaring a clean start date in GA4 is more honest than synthesizing a bridge.

Common mistakes
Five mistakes show up on every project where I am cleaning up someone else's GA4 migration.
Treating GA4 sessions as comparable to UA sessions. The session-derivation rules differ, and the consent gating differs. Overlay the window where both ran simultaneously and the session counts will not match, sometimes by 15-30%. That gap is structural.
Forgetting the overlap window. Most properties created GA4 in 2023 alongside UA, then UA stopped processing on 2024-07-01. Pick which side counts in the overlap (I default to UA, because it was the operator's primary report at the time) and document the choice.
Joining traffic to revenue without timezone normalization. UA stored data in the property's reporting timezone. GA4's event_date is too, but if you mistakenly reach for event_timestamp (UTC microseconds) and convert wrong, you will see a 6-12% YoY swing that is purely a timezone artifact.
Trying to backfill GA4 with UA data. There is no supported import path. Tools that claim to do this are reconstructing from CSV exports and producing a synthetic stream, not actually populating GA4. The bridge view is the supported pattern.
Forgetting that consent mode v2 widens the gap silently. A flat YoY revenue number can hide actual growth that consent gating ate. Cross-check against Shopify orders any time the GA4 number looks suspiciously stable.

What to try next
Once the bridge view is in place and the deck has a defensible YoY chart, extend the warehouse to cover the rest of the operator question set.
Run the same warehouse-first reconciliation against Meta and the rest of the ad platforms, since the next deck question after "did revenue grow?" is always "where did the growth come from?" and that lives in cross-platform reconciliation.
If you would rather have someone audit your current setup before doing any of this, the DTC Stack Audit covers the GA4 migration year-over-year data state alongside the rest of the analytics and tracking stack.
FAQ
Can I import my UA data into GA4 directly?
No. Google has not provided a supported path to import Universal Analytics history into a GA4 property. Any tool claiming to do this is reconstructing from CSV exports and producing a synthetic stream, not actually populating the GA4 property. The supported pattern is the warehouse bridge, where UA and GA4 data sit side by side in BigQuery and a unified view layers over both.
What if I never took a UA export and the data is gone?
You still have whatever Looker Studio reports were saved and whatever CSV downloads team members took at the time. Land those into a daily summary table with date, source, medium, sessions, transactions, and revenue. You will lose session-level granularity, but you can still build a YoY view at the daily-channel level. For anything more granular than that, the data is genuinely gone and the honest move is to declare a fresh baseline starting at the GA4 launch date.
How do I handle the overlap window where UA and GA4 were both running?
Most properties have a window in 2023 to mid-2024 where both UA and GA4 were active. Pick which side counts in the overlap and document the choice. I default to UA for that period because it was the operator's primary system at the time, and switch to GA4 starting 2024-07-02 (the day after UA stopped processing). Hardcode that boundary into the bridge view's CASE logic so consumers cannot accidentally double-count.
Why not just use GA4's UI year-over-year comparison feature?
GA4's UI lets you compare two date ranges, but it can only compare two ranges that both have GA4 data. If your 2024 first half lived in UA, GA4's UI cannot reach it. The comparison view in GA4 will show empty data for the UA-only period, which makes the report look broken rather than informative. The warehouse bridge gives you a single chart that spans both eras with the system flag declared.
How long does this whole rebuild take in practice?
For a single Shopify store with the warehouse already in place, the bridge views take 1-2 days of SQL work plus a half-day of validation against the few sanity-check periods where both UA and GA4 ran simultaneously. If the warehouse itself does not exist yet, add a sprint for the BigQuery setup, the GA4 export linkage, and the Shopify orders ingestion. Most stores I work with go from "the deck is broken" to "the deck has defensible YoY" in 2-3 weeks.
What about Looker Studio templates that promise to do this for me?
Looker Studio templates can connect to UA archives and GA4 in the same report, but they cannot harmonize the schemas. You end up with two separate charts side by side with different definitions of session and conversion, which is exactly the problem you started with. A template that does not write SQL is a template that does not solve the structural difference. The bridge view in BigQuery is what makes the Looker Studio chart on top of it actually work.
Sources and specifics
- Universal Analytics stopped processing new data on 2024-07-01 for standard properties; 360 BigQuery export windows extended for paying customers but the underlying UA data is frozen at the cutoff.
- GA4 BigQuery export is event-scoped (one row per event with parameters as nested arrays), while UA BigQuery export was session-scoped (one row per session with hits as nested arrays). The schemas are not compatible without a harmonization layer.
- The bridge views in this walkthrough were built on a mid-market DTC operator's BigQuery warehouse in Q1 2026 as part of a multi-service analytics pipeline serving a handful of production dashboards.
- Consent mode v2 widens the GA4-versus-actual gap quarter over quarter as more browsers default to consent gates, which means the YoY drift compounds. Cross-checking against Shopify orders is the corrective.
- BigQuery on-demand pricing as of April 2026 is $6.25 per TB scanned. A typical mid-market DTC store using these views runs under $100 per quarter on warehouse cost when the tables are partitioned by
event_dateand clustered onsourceandmedium.
