Klaviyo is the customer memory layer for most DTC brands. It holds email sends, clicks, opens, flow memberships, segment membership changes, and profile-level attributes you do not get from Shopify alone. All of that data belongs in your warehouse, not locked behind Klaviyo's dashboards.
This is a pattern library for the three ways to get Klaviyo data into BigQuery in 2026, with the trade-offs, the dbt models on top, and a reconciliation check that keeps the sync honest. It slots into stage 2 (ingest) of the warehouse-first analytics rebuild hub.
For the Klaviyo flow-design side of the stack, the sibling hub is the Klaviyo lifecycle playbook for DTC.
| path | cost | latency | backfill | effort |
|---|---|---|---|---|
| klaviyo native | $0 | 1-6h | 90d | 30 min setup |
| fivetran | $300-700/mo | 5min-1h | full history | 1 hour |
| airbyte | $150-400/mo Cloud, $0 OSS | 15-60min | full history | 1-2 days OSS |
| diy api | $40-100/mo | 5-15min | full history | 3-5 days build |
Path 1: Klaviyo's native BigQuery connector
Klaviyo released a native BigQuery connector in 2025. It is the cheapest option (free, included in any Klaviyo paid plan) and the thinnest. Worth starting here if your volume is modest and your Klaviyo account is not old-and-crusty.
Setup is three steps. In Klaviyo → Integrations → Data → BigQuery, authenticate with a GCP service account that has the BigQuery Data Editor role. Pick a dataset. Pick a sync cadence (6 hours is the default; 1 hour is the fastest).
What it syncs: events, metrics, profiles, lists, segments, campaigns, flows. Roughly the top-level entities. It does not sync: flow email content, A/B test results, deliverability reports, or per-message engagement at granular levels. For those you fall through to Path 3.
What it looks like in BigQuery:
klaviyo_raw/
events -- every metric occurrence
metrics -- metric definitions (opened_email, clicked_email, etc.)
profiles -- subscriber-level attributes
lists -- list definitions
segments -- segment definitions (not memberships)
campaigns -- campaign metadata
flows -- flow definitions (not membership events)
The schemas are JSON-heavy. Events come with a properties blob that needs flattening per metric. A placed_order event has OrderId, Total, Items inside properties; a clicked_email event has CampaignId, Url, ClickId. The dbt staging models have to know which metric a row is to extract the right fields.
Cost: $0 for the connector. BigQuery storage for a typical mid-market DTC Klaviyo account runs 2 to 5 GB, under $1 per month. Query costs depend on your usage.
Latency: 1 to 6 hours depending on sync cadence. Fine for daily analytics; not fine for real-time.
Path 2: Fivetran or Airbyte
Fivetran has the most mature Klaviyo connector on the market and Airbyte has the most flexible open-source one. Both handle the same scope as Klaviyo's native connector plus longer backfill and better reliability at scale.
Fivetran's Klaviyo connector pricing as of April 2026 is based on monthly active rows (MAR); for a mid-market DTC brand with 150K profiles and 2M events per month, expect $300 to $700 per month just for Klaviyo. That adds up when you also need Shopify, Meta, and Google Ads.
Airbyte Cloud is roughly half that price for the same volume. Airbyte OSS self-hosted is free at the subscription level but you pay in the engineering time to run a Kubernetes or Docker deployment and debug the occasional stuck sync.
Both deliver the same Klaviyo data to BigQuery with slightly different schema shapes. Fivetran's schema is more normalized; Airbyte's preserves more of the raw API response. The dbt staging models have to be written per connector.
When to pick which:
- Native Klaviyo connector: starter, up to ~100K profiles, okay with 6-hour latency, okay with 90-day history.
- Fivetran: mid-market, priority on reliability, have the budget, want to move fast.
- Airbyte Cloud: mid-market, want flexibility without running infrastructure.
- Airbyte OSS: have a data engineer, care about cost, can debug sync issues.
- Path 3 (DIY): specific requirements the connectors do not handle, or you are the data engineer and writing an ingestion service is cheaper than paying for a connector.
Path 3: DIY API ingestion
Klaviyo has a well-documented REST API at developers.klaviyo.com. Writing your own ingestion service is the most work upfront and the cheapest at steady state. This is the path I have shipped twice in production, including the analytics engine case study from Q1 2026.
The pattern:
Klaviyo API → ingestion service → BigQuery stream inserts
→ Shopify reconciliation
→ GA4 Measurement Protocol forwarder
Every event from Klaviyo goes through the same pipeline. The ingestion service validates the shape, writes to BigQuery via the Storage Write API, and forwards a subset to GA4 for the optimization signal.
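That validate-and-route step can be sketched as below. The field names and the GA4 allowlist are illustrative assumptions, not Klaviyo's schema, and the sinks are plain lists standing in for the Storage Write API and the Measurement Protocol forwarder:

```python
# Hypothetical required fields, mirroring the columns the staging model expects.
REQUIRED_FIELDS = ("id", "profile_id", "metric_id", "datetime")

# Illustrative subset of metrics worth forwarding to GA4 as optimization signal.
GA4_FORWARD_METRICS = {"Placed Order", "Clicked Email"}

def route_events(events):
    """Validate each event's shape, then route it to the warehouse batch,
    the GA4 forward batch, or the dead-letter batch."""
    to_bigquery, to_ga4, dead_letter = [], [], []
    for event in events:
        if not all(event.get(f) for f in REQUIRED_FIELDS):
            dead_letter.append(event)  # malformed: never reaches the warehouse
            continue
        to_bigquery.append(event)      # every valid event lands in BigQuery
        if event.get("metric_name") in GA4_FORWARD_METRICS:
            to_ga4.append(event)       # subset forwarded for the ad-platform signal
    return to_bigquery, to_ga4, dead_letter
```

The point of the split is that a malformed event never silently disappears: it goes to a dead-letter batch you can inspect, while the GA4 subset stays a strict filter over what already passed validation.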
For Klaviyo specifically, the three API endpoints that matter:
GET /api/events -- incremental, cursor-paginated
GET /api/profiles -- for profile metadata changes
GET /api/metrics -- metric definitions, sync once daily
Events are the meat. Pull with cursor pagination, apply an incremental filter such as filter=greater-than(datetime,"2026-04-23T00:00:00Z"), page through until the cursor runs out, and insert into BigQuery. Run every 5 to 15 minutes depending on how fresh you need it.
Klaviyo rate limits are the gotcha. The API is limited to 75 requests per second sustained, 150 burst. For a high-volume DTC brand pulling full event history, you will hit rate limits during backfill. Build in exponential backoff from the start.
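The pull-and-backoff loop can be sketched as below. fetch_page is an injected callable (e.g. a requests wrapper that adds the Klaviyo API key header) so the pagination logic stays testable; URL construction is simplified (no URL-encoding of the filter), and the response envelope assumed here is the JSON:API shape Klaviyo's current API uses ({"data": [...], "links": {"next": ...}}):

```python
import time

API_BASE = "https://a.klaviyo.com/api/events"

def pull_events(fetch_page, since_iso, max_retries=5):
    """Page through /api/events with cursor pagination and exponential backoff.

    fetch_page(url) must return (status_code, body_dict). A 429 triggers a
    backoff-and-retry; any other status is treated as a good page here
    (a production service would also handle 5xx and auth errors).
    """
    url = f'{API_BASE}/?filter=greater-than(datetime,"{since_iso}")'
    events = []
    while url:
        for attempt in range(max_retries):
            status, body = fetch_page(url)
            if status == 429:              # rate limited: back off and retry
                time.sleep(2 ** attempt)
                continue
            break
        else:
            raise RuntimeError(f"still rate-limited after {max_retries} tries: {url}")
        events.extend(body["data"])
        url = body["links"].get("next")    # None when the cursor runs out
    return events
```

Injecting fetch_page also makes backfill throttling a one-line change: wrap the real HTTP call in a token bucket and the pagination loop never knows the difference.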
The dbt staging model for Klaviyo events
Regardless of ingestion path, the staging model pattern is similar. Flatten the properties blob per metric.
-- models/staging/klaviyo/stg_klaviyo__events.sql
{{
    config(
        materialized='incremental',
        unique_key='event_id',
        partition_by={'field': 'occurred_at', 'data_type': 'timestamp'},
        cluster_by=['profile_id', 'metric_name']
    )
}}

WITH base AS (
    SELECT
        CAST(id AS STRING) AS event_id,
        CAST(profile_id AS STRING) AS profile_id,
        metric_id,
        datetime AS occurred_at,
        properties
    FROM {{ source('klaviyo_raw', 'events') }}
    {% if is_incremental() %}
    WHERE datetime > (SELECT MAX(occurred_at) FROM {{ this }})
    {% endif %}
),

with_metric AS (
    SELECT
        b.*,
        m.name AS metric_name
    FROM base b
    LEFT JOIN {{ source('klaviyo_raw', 'metrics') }} m
        ON b.metric_id = m.id
),

extracted AS (
    SELECT
        *,
        -- Extract the fields that matter per metric
        CASE metric_name
            WHEN 'Placed Order' THEN JSON_EXTRACT_SCALAR(properties, '$.OrderId')
            WHEN 'Placed Order Refunded' THEN JSON_EXTRACT_SCALAR(properties, '$.OrderId')
            ELSE NULL
        END AS shopify_order_id,
        CASE metric_name
            WHEN 'Clicked Email' THEN JSON_EXTRACT_SCALAR(properties, '$.CampaignId')
            WHEN 'Opened Email' THEN JSON_EXTRACT_SCALAR(properties, '$.CampaignId')
            ELSE NULL
        END AS campaign_id,
        CAST(JSON_EXTRACT_SCALAR(properties, '$.Total') AS NUMERIC) AS order_total
    FROM with_metric
)

SELECT * FROM extracted
The extracted columns become join keys for intermediate models. stg_klaviyo__events joined to stg_shopify__orders on shopify_order_id gives you attribution of orders to email flows.
Reconciliation check
Klaviyo and Shopify should mostly agree on order count for brands that use the Klaviyo + Shopify integration. "Mostly" is doing real work there. The reconciliation query I run weekly:
WITH klaviyo_orders AS (
    SELECT
        DATE(occurred_at) AS order_date,
        COUNT(DISTINCT shopify_order_id) AS klaviyo_order_count
    FROM {{ ref('stg_klaviyo__events') }}
    WHERE metric_name = 'Placed Order'
        AND shopify_order_id IS NOT NULL
        AND DATE(occurred_at) >= CURRENT_DATE() - 30
    GROUP BY 1
),

shopify_orders AS (
    SELECT
        DATE(created_at) AS order_date,
        COUNT(DISTINCT order_id) AS shopify_order_count
    FROM {{ ref('stg_shopify__orders') }}
    WHERE DATE(created_at) >= CURRENT_DATE() - 30
    GROUP BY 1
)

SELECT
    order_date,  -- unqualified USING column, so full-outer rows keep their date
    k.klaviyo_order_count,
    s.shopify_order_count,
    ROUND(100.0 * SAFE_DIVIDE(
        k.klaviyo_order_count - s.shopify_order_count,
        s.shopify_order_count
    ), 2) AS variance_pct
FROM klaviyo_orders k
FULL OUTER JOIN shopify_orders s USING (order_date)
ORDER BY order_date DESC;
Healthy variance is Klaviyo 2 to 10 percent below Shopify (Klaviyo misses orders from non-subscribers and from profiles that unsubscribed after order). Anything outside that band is either a broken Klaviyo-Shopify integration or an ingestion gap in your warehouse. Catch the drift in a weekly automation, not on the quarterly review.
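Wired into a weekly job, the band check itself is a few lines. A sketch, assuming the query output arrives as (order_date, klaviyo_count, shopify_count) tuples and using the 2-to-10-percent-below band as configurable defaults:

```python
def variance_pct(klaviyo_count, shopify_count):
    """Signed variance of Klaviyo vs Shopify; negative means Klaviyo is below."""
    if not shopify_count:
        return None  # no Shopify orders that day: nothing to compare against
    return round(100.0 * (klaviyo_count - shopify_count) / shopify_count, 2)

def flag_drift(rows, lower=-10.0, upper=-2.0):
    """Return (date, variance) pairs whose variance falls outside the healthy band.

    Note that a variance of exactly 0 is flagged on purpose: Klaviyo matching
    Shopify perfectly is itself suspicious for a subscriber-only event source.
    """
    flagged = []
    for order_date, k, s in rows:
        v = variance_pct(k, s)
        if v is None or not (lower <= v <= upper):
            flagged.append((order_date, v))
    return flagged
```

Whatever fires the alert (Slack webhook, email, a dbt test failure) matters less than that it fires weekly without a human remembering to run the query.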
“Email data is a first-class warehouse source. Locking it behind Klaviyo's dashboards is the same mistake as locking Shopify behind Shopify admin.”
FAQ
Which path should I start with?
Native Klaviyo connector if you are under 100K profiles and can live with 6-hour latency. Fivetran or Airbyte Cloud if you want fewer moving parts and have budget. DIY if you are already running an ingestion service for Shopify and adding Klaviyo is one more adapter.
How far back can I backfill Klaviyo history?
Native connector: 90 days from the time you connect. Fivetran: as far back as your Klaviyo account goes, typically 2 to 3 years of full history. DIY: full history via the API, but the API is rate-limited so a multi-year backfill takes a day or two of careful pagination.
What about Klaviyo CDP (profiles-only) data?
The profiles table has the per-subscriber attributes (custom properties, tags, list memberships at sync time). It is cheaper to query than events and is where most lifecycle-segmentation logic lives in dbt. Sync it at the same cadence as events.
Can I write back to Klaviyo from the warehouse?
Yes, via Reverse ETL (Hightouch or Census) or via Klaviyo's own Catalog and Profile APIs. Useful for pushing computed segments back into Klaviyo for campaign targeting. Expect an extra $200 to $500 per month if you use Hightouch or Census.
How do I handle GDPR-deleted profiles?
Klaviyo soft-deletes on GDPR request by clearing PII fields but leaving the profile_id. Your dbt staging model should filter out profiles where email and phone are both NULL. For full right-to-erasure, re-sync the profiles table; the deleted profile comes through with blanked PII automatically.
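The same filter rule, sketched outside dbt for the ingestion-service side (the email and phone_number field names are assumed to match your profiles table):

```python
def is_gdpr_blanked(profile: dict) -> bool:
    """A profile whose email and phone are both cleared is treated as erased."""
    return not profile.get("email") and not profile.get("phone_number")

def active_profiles(profiles):
    """Drop GDPR-blanked profiles before lifecycle segmentation."""
    return [p for p in profiles if not is_gdpr_blanked(p)]
```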
What to try this week
Connect Klaviyo's native BigQuery connector to a fresh dataset. Let it sync for 24 hours. Run this query:
SELECT
    m.name AS metric_name,
    COUNT(*) AS events,
    COUNT(DISTINCT e.profile_id) AS profiles
FROM klaviyo_raw.events e
LEFT JOIN klaviyo_raw.metrics m
    ON e.metric_id = m.id
GROUP BY 1
ORDER BY 2 DESC
LIMIT 20;
You will see the top metric names and counts. If "Placed Order" is much lower than what Shopify shows, the Klaviyo-Shopify integration is misfiring. If "Opened Email" is missing entirely, you do not have open-tracking enabled.
If you are scoping a full warehouse rebuild, a DTC Stack Audit runs this check against your live Klaviyo and Shopify and tells you whether the ETL path you pick should be thin or thick.
Sources and specifics
- Klaviyo native BigQuery connector docs: help.klaviyo.com (search "BigQuery integration").
- Klaviyo API docs: developers.klaviyo.com, rate limits published there.
- Fivetran Klaviyo connector: fivetran.com/docs/connectors/applications/klaviyo.
- Airbyte Klaviyo source: docs.airbyte.com/integrations/sources/klaviyo.
- The DIY pattern and reconciliation query above are from the Q1 2026 analytics engine rebuild documented as the analytics engine case study.
