Skip to main content

Loading Strategies

Append Strategy (Default)

Inserts new rows into the target table. Suitable when:

  • Your target handles duplicates (e.g., external tables)
  • You're building an append-only data lake
  • You dedup at query time
--incremental_table "events.pageviews:event_time:timestamp!append"

Upsert Strategy (MERGE)

Uses MERGE to update existing rows and insert new ones. Requires:

  • Primary keys defined on the table (stored in the LakeXpress DB)
  • A target platform supporting MERGE (Snowflake, Databricks, Fabric)
  • Managed/internal tables (not external tables)
--incremental_table "sales.orders:updated_at:datetime!upsert"

How it works:

  1. New data loads into a staging table
  2. MERGE runs: matching rows are updated, new rows are inserted
  3. Staging table is dropped

Benefits:

  • No duplicate rows in target
  • Handles both inserts and updates
  • Target always reflects current state

Requirements:

  • Tables must have primary key columns defined in the source
  • Use --publish_method internal or --publish_method managed
  • Primary keys are automatically detected from the source database schema

Example with Snowflake:

./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name ecommerce \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "public.orders:updated_at:datetime!upsert" \
--incremental_table "public.customers:updated_at:datetime!upsert" \
--publish_target snowflake_prod \
--publish_method internal \
--n_jobs 4

Example with Databricks:

./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_sqlite \
--source_db_auth_id source_mssql \
--source_db_name sales \
--source_schema_name dbo \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "dbo.products:modified_date:datetime!upsert" \
--publish_target databricks_unity \
--databricks_table_type managed \
--n_jobs 4

Choosing a Strategy

ScenarioRecommended Strategy
Append-only event logsappend
CDC (Change Data Capture)upsert
Transaction historyappend
Customer/product master dataupsert
Time-series metricsappend
Order status updatesupsert

See Also

Copyright © 2026 Architecture & Performance.