Examples
Complete Example
Step 1: Create Configuration with Incremental Tables
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_ms \
--source_db_auth_id ds_04_pg \
--source_db_name tpch \
--source_schema_name tpch_1_incremental \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id aws_s3_01 \
--incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
--incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
--incremental_safety_lag 3600 \
--generate_metadata \
--n_jobs 4 \
--fastbcp_p 2
| Parameter | Meaning |
|---|---|
| --incremental_table "tpch_1_incremental.orders:o_orderdate:date" | Track orders by o_orderdate column |
| --incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" | Track lineitem by l_shipdate column |
| --incremental_safety_lag 3600 | Wait 1 hour after the watermark (handles late data) |
| --generate_metadata | Generate CDM metadata for exported tables |
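Judging from the examples on this page, each --incremental_table value appears to have three colon-separated parts: schema.table, column, and column type. The helper below is a minimal sketch that splits such a value so it can be checked before it goes into a config. The function name parse_incremental_spec is hypothetical and not part of LakeXpress.

```shell
# parse_incremental_spec "schema.table:column:type"
# Hypothetical helper: splits a spec string into its three parts and
# rejects values that do not look like schema.table:column:type.
parse_incremental_spec() {
  local spec="$1"
  case "$spec" in
    *.*:*:*) ;;                              # needs a dot and two colons
    *) echo "invalid spec: $spec" >&2; return 1 ;;
  esac
  local table="${spec%%:*}"                  # text before the first colon
  local rest="${spec#*:}"                    # text after the first colon
  local column="${rest%%:*}"
  local type="${rest#*:}"
  printf 'table=%s column=%s type=%s\n' "$table" "$column" "$type"
}

parse_incremental_spec "tpch_1_incremental.orders:o_orderdate:date"
```

A check like this only validates the shape of the string. Whether the table and column exist is still decided by LakeXpress and the source database.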
Step 2: Run First Sync
./LakeXpress sync
First sync behavior:
- Exports all rows from orders and lineitem
- Records the highest o_orderdate and l_shipdate as watermarks
- Non-incremental tables are fully exported
- Stores watermarks in the LakeXpress DB
Step 3: Subsequent Syncs
./LakeXpress sync
Subsequent sync behavior:
- Loads previous watermarks from the LakeXpress DB
- Exports only rows where o_orderdate > previous_watermark - safety_lag
- Exports only rows where l_shipdate > previous_watermark - safety_lag
- Updates watermarks with current highest values
- Non-incremental tables are fully exported again
Real-World Scenarios
Scenario 1: Daily Order Processing
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name ecommerce \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "public.orders:created_at:datetime" \
--incremental_table "public.order_items:created_at:datetime" \
--publish_target snowflake_prod \
--n_jobs 4
# Run daily via cron
./LakeXpress sync
- Day 1: Exports 1,000,000 orders (full load)
- Day 2: Exports ~5,000 new orders (incremental)
- Day 3: Exports ~4,800 new orders (incremental)
- Other tables (customers, products) fully exported daily
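For the cron-driven run above, a common precaution is to skip a run while the previous one is still active. The wrapper below is a minimal sketch of that idea using a lock directory. The function name run_exclusive, the paths, and the crontab entry are all hypothetical, and nothing here is required by LakeXpress.

```shell
# run_exclusive LOCK_DIR COMMAND [ARGS...]
# Runs COMMAND only if LOCK_DIR can be created (mkdir is atomic),
# removes the lock afterwards, and returns the command's exit status.
# Returns 75 without running anything if the lock already exists.
run_exclusive() {
  local lock_dir="$1"
  shift
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "lock $lock_dir exists; previous run still active, skipping" >&2
    return 75
  fi
  local status=0
  "$@" || status=$?
  rmdir "$lock_dir"
  return "$status"
}

# Hypothetical usage inside a script such as /opt/lakexpress/daily_sync.sh:
#   cd /opt/lakexpress && run_exclusive /tmp/lakexpress_sync.lock ./LakeXpress sync
#
# Hypothetical crontab entry for a daily run at 02:00:
#   0 2 * * * /opt/lakexpress/daily_sync.sh >> /var/log/lakexpress_sync.log 2>&1
```

If the process is killed before it removes the lock directory, the directory has to be deleted by hand before the next run can start.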
Scenario 2: Event Log Aggregation
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_ms \
--source_db_auth_id source_postgres \
--source_db_name analytics \
--source_schema_name events \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id aws_s3_01 \
--incremental_table "events.pageviews:event_time:timestamp" \
--incremental_table "events.clicks:event_time:timestamp" \
--incremental_table "events.conversions:event_time:timestamp" \
--incremental_safety_lag 600 \
--sub_path production/events \
--n_jobs 8 \
--fastbcp_p 4
# Run every 10 minutes
./LakeXpress sync
- Ingests events from multiple tables continuously
- 10-minute safety lag handles processing delays
Scenario 3: Time-Series Metrics
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_sqlite \
--source_db_auth_id source_postgres \
--source_db_name monitoring \
--source_schema_name metrics \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id azure_01 \
--incremental_table "metrics.cpu_usage:recorded_at:timestamp" \
--incremental_table "metrics.memory_usage:recorded_at:timestamp" \
--incremental_table "metrics.disk_io:recorded_at:timestamp" \
--incremental_safety_lag 300 \
--n_jobs 4 \
--generate_metadata
# Run every 5 minutes
./LakeXpress sync
- High-frequency metric collection to Azure storage
- Each sync captures the last 5+ minutes of data
Combining with Snowflake Publishing
./LakeXpress config create \
... \
--incremental_table "sales.orders:order_date:date" \
--incremental_table "sales.returns:return_date:date" \
--target_storage_id s3_01 \
--publish_target snowflake_prod \
--publish_method internal \
--publish_schema_pattern "{schema}_incremental" \
--n_jobs 4
./LakeXpress sync
This creates Snowflake tables that are updated with new data on each sync. Non-incremental tables are fully exported and published on each sync.
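The --publish_schema_pattern value above contains a {schema} placeholder. Assuming the placeholder is replaced by the source schema name (an assumption based on the pattern's form, not confirmed on this page), the resulting target schema name can be previewed with a small helper. The function name expand_schema_pattern is hypothetical.

```shell
# expand_schema_pattern PATTERN SCHEMA
# Hypothetical helper: replaces every literal {schema} in PATTERN with SCHEMA.
# Assumes SCHEMA is a plain identifier (no "/" or "&", which sed would treat specially).
expand_schema_pattern() {
  local pattern="$1" schema="$2"
  printf '%s\n' "$pattern" | sed "s/{schema}/$schema/g"
}

expand_schema_pattern "{schema}_incremental" "sales"
```

Under that assumption, the sales source schema would be published to a schema named sales_incremental.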
See Also
- Incremental Sync Overview - Configuration syntax and supported column types
- Loading Strategies - Append vs upsert strategies
- Watermarks - Watermark tracking and safety lag