Loading Strategies
Append Strategy (Default)
Inserts new rows into the target table. Suitable when:
- Your target handles duplicates (e.g., external tables)
- You're building an append-only data lake
- You dedup at query time
--incremental_table "events.pageviews:event_time:timestamp!append"
Upsert Strategy (MERGE)
Uses MERGE to update existing rows and insert new ones. Requires:
- Primary keys defined on the table (stored in the LakeXpress DB)
- A target platform supporting MERGE (Snowflake, Databricks, Fabric)
- Managed/internal tables (not external tables)
--incremental_table "sales.orders:updated_at:datetime!upsert"
How it works:
- New data loads into a staging table
- MERGE runs: matching rows are updated, new rows are inserted
- Staging table is dropped
Benefits:
- No duplicate rows in target
- Handles both inserts and updates
- Target always reflects current state
Requirements:
- Tables must have primary key columns defined in the source
- Use
--publish_method internalor--publish_method managed - Primary keys are automatically detected from the source database schema
Example with Snowflake:
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name ecommerce \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "public.orders:updated_at:datetime!upsert" \
--incremental_table "public.customers:updated_at:datetime!upsert" \
--publish_target snowflake_prod \
--publish_method internal \
--n_jobs 4
Example with Databricks:
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_sqlite \
--source_db_auth_id source_mssql \
--source_db_name sales \
--source_schema_name dbo \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "dbo.products:modified_date:datetime!upsert" \
--publish_target databricks_unity \
--databricks_table_type managed \
--n_jobs 4
Choosing a Strategy
| Scenario | Recommended Strategy |
|---|---|
| Append-only event logs | append |
| CDC (Change Data Capture) | upsert |
| Transaction history | append |
| Customer/product master data | upsert |
| Time-series metrics | append |
| Order status updates | upsert |
See Also
- Incremental Sync Overview - Configuration syntax and supported column types
- Examples - Complete step-by-step examples and real-world scenarios
- Troubleshooting - Handling duplicates and other common issues