Watermarks
Watermark Tracking
LakeXpress maintains watermarks in the LakeXpress DB:
First Sync (Full Load)
Table: orders
Column: o_orderdate
Watermark: NULL -> 2025-12-31 (highest date in table)
Records exported: 1,500,000
Second Sync (Incremental)
Query: SELECT * FROM orders WHERE o_orderdate > 2025-12-31 - 1 hour
Expected records: 50,000 (new orders since the first sync)
Watermark updated: 2025-12-31 -> 2026-01-05
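The two syncs above can be sketched as a small simulation. This is only an illustration of the watermark idea, not LakeXpress's actual implementation: the `export_incremental` helper, the in-memory SQLite table, the `o_orderkey` column, and the sample rows are all hypothetical, and the sketch applies no safety lag.

```python
import sqlite3


def export_incremental(conn, watermark):
    """Hypothetical helper: return (rows, new_watermark).

    A watermark of None means no sync has run yet, so all rows are read.
    """
    if watermark is None:
        rows = conn.execute(
            "SELECT o_orderkey, o_orderdate FROM orders ORDER BY o_orderdate"
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT o_orderkey, o_orderdate FROM orders "
            "WHERE o_orderdate > ? ORDER BY o_orderdate",
            (watermark,),
        ).fetchall()
    # Advance the watermark to the highest value exported; keep the old
    # value when nothing new was found.
    new_watermark = max((row[1] for row in rows), default=watermark)
    return rows, new_watermark


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (o_orderkey INTEGER, o_orderdate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2025-12-29"), (2, "2025-12-30"), (3, "2025-12-31")],
)

# First sync: no watermark yet, so every row is exported.
first_rows, watermark = export_incremental(conn, None)

# New orders arrive after the first sync.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(4, "2026-01-03"), (5, "2026-01-05")],
)

# Second sync: only rows past the stored watermark are exported.
second_rows, watermark_after = export_incremental(conn, watermark)

print(len(first_rows), watermark)         # 3 2025-12-31
print(len(second_rows), watermark_after)  # 2 2026-01-05
```

ISO-formatted date strings sort in chronological order, which is why the plain string comparison works in this sketch.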
Safety Lag
The --incremental_safety_lag parameter handles late-arriving data by starting each incremental query a fixed number of seconds before the stored watermark:
./LakeXpress config create \
... \
--incremental_table "events.raw_events:event_timestamp:datetime" \
--incremental_safety_lag 3600 \
...
--incremental_safety_lag INT - Lag in seconds (default: 0)
Example with 1-hour lag:
Current time: 2025-01-08 14:00:00
Watermark: 2025-01-08 10:00:00
Query includes: WHERE event_timestamp > 2025-01-08 09:00:00
(1 hour before watermark)
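The arithmetic in this example can be checked with a few lines of Python. The `lower_bound` function is a hypothetical name used for illustration; it only reproduces the subtraction described above and makes no claim about how LakeXpress builds its query internally.

```python
from datetime import datetime, timedelta


def lower_bound(watermark, safety_lag_seconds=0):
    """Return the timestamp the incremental filter compares against."""
    return watermark - timedelta(seconds=safety_lag_seconds)


watermark = datetime(2025, 1, 8, 10, 0, 0)
bound = lower_bound(watermark, 3600)  # --incremental_safety_lag 3600

print(f"WHERE event_timestamp > '{bound:%Y-%m-%d %H:%M:%S}'")
# WHERE event_timestamp > '2025-01-08 09:00:00'
```

With the default lag of 0, the bound is the watermark itself.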
When to use:
- Asynchronous systems with delayed writes
- Multi-region databases with replication lag
- Event streams with out-of-order processing
- Financial transactions with settlement delays
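To see why these cases benefit from a lag, consider a row whose timestamp is older than the watermark but which only becomes visible after the previous sync. The sketch below is a hypothetical simulation (the `select_new` function and the sample events are invented for illustration) showing that such a row is skipped with a lag of 0 and picked up with a 1-hour lag.

```python
from datetime import datetime, timedelta


def select_new(events, watermark, safety_lag_seconds=0):
    """Return events whose timestamp is later than (watermark - lag)."""
    bound = watermark - timedelta(seconds=safety_lag_seconds)
    return [e for e in events if e["event_timestamp"] > bound]


watermark = datetime(2025, 1, 8, 10, 0, 0)

events = [
    # Written late: timestamped before the watermark, visible only now.
    {"id": "late", "event_timestamp": datetime(2025, 1, 8, 9, 30, 0)},
    # Ordinary new row, timestamped after the watermark.
    {"id": "new", "event_timestamp": datetime(2025, 1, 8, 11, 0, 0)},
]

without_lag = [e["id"] for e in select_new(events, watermark)]
with_lag = [e["id"] for e in select_new(events, watermark, 3600)]

print(without_lag)  # ['new']
print(with_lag)     # ['late', 'new']
```

The trade-off is that every row inside the lag window is selected again on each sync, including rows that were already exported, so a larger lag means more re-read data.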
Querying Watermarks
Inspect tracked watermarks by querying the LakeXpress DB:
-- View all incremental configurations
SELECT
sync_id,
config_name,
source_table,
incremental_column,
last_watermark,
updated_at
FROM sync_configurations
WHERE is_incremental = true
ORDER BY updated_at DESC;
-- View recent watermark updates
SELECT
run_id,
sync_id,
source_table,
previous_watermark,
new_watermark,
rows_exported,
started_at,
completed_at
FROM incremental_watermarks
ORDER BY completed_at DESC
LIMIT 10;
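The history returned by the second query can also be checked programmatically. The sketch below is an assumption-heavy illustration: it uses an in-memory SQLite table as a stand-in for the LakeXpress DB, the column types, `sync_id` value, and sample rows are invented, and it assumes that each run's previous_watermark should equal the new_watermark of the run before it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table; column names follow the query above, types are assumed.
conn.execute(
    """CREATE TABLE incremental_watermarks (
           run_id INTEGER, sync_id TEXT, source_table TEXT,
           previous_watermark TEXT, new_watermark TEXT,
           rows_exported INTEGER, started_at TEXT, completed_at TEXT)"""
)
conn.executemany(
    "INSERT INTO incremental_watermarks VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "sync-a", "orders", None, "2025-12-31", 1500000,
         "2026-01-01 00:00:00", "2026-01-01 00:10:00"),
        (2, "sync-a", "orders", "2025-12-31", "2026-01-05", 50000,
         "2026-01-06 00:00:00", "2026-01-06 00:01:00"),
    ],
)

rows = conn.execute(
    "SELECT run_id, previous_watermark, new_watermark "
    "FROM incremental_watermarks "
    "WHERE source_table = ? ORDER BY completed_at",
    ("orders",),
).fetchall()

# Pair each run with its successor and report pairs where the successor
# did not start from the watermark the earlier run ended on.
gaps = [
    (prev[0], cur[0])
    for prev, cur in zip(rows, rows[1:])
    if cur[1] != prev[2]
]
print(gaps)  # []
```

A non-empty list would point at run pairs worth inspecting, for example after a manual watermark reset.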
Resetting Watermarks
To force a full reload, reset the watermark in one of two ways:
# Option 1: Delete and recreate the configuration
./LakeXpress config delete \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--sync_id 20251208-xxxxx
# Then create a new one
./LakeXpress config create ...
# Option 2: Override watermark on next sync
./LakeXpress sync --reset-watermarks
See Also
- Incremental Sync Overview - Configuration syntax and supported column types
- Examples - Complete step-by-step examples and real-world scenarios
- Troubleshooting - Fixing watermark issues and missing data