Export Performance
LakeXpress auto-detects the best export method for each table. To override it, use CLI flags or the partition_columns table in the LakeXpress DB.
Export Methods
Without Key Column
| Method | Database | Description |
|---|---|---|
| None | All | Sequential export (no parallelism) |
| Ctid | PostgreSQL | Tuple ID-based parallel export |
| Rowid | Oracle | Row ID-based parallel export |
| Physloc | SQL Server | Physical location-based parallel export |
With Key Column
| Method | Description | Key Column |
|---|---|---|
| Random | Distributes using column values | Integer or bigint |
| DataDriven | Splits on distinct values | Column with distinct values (e.g., year, region) |
| RangeId | Range partitioning via min/max | Numeric or date column |
| Ntile | Even distribution | Numeric column |
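To make "range partitioning via min/max" concrete, here is a minimal Python sketch of how a key range could be cut into contiguous slices, one per parallel thread. It is a conceptual illustration only: `range_bounds` is a hypothetical helper, and nothing here is taken from how LakeXpress or FastBCP computes its ranges.

```python
from typing import List, Tuple


def range_bounds(min_id: int, max_id: int, parts: int) -> List[Tuple[int, int]]:
    """Split the inclusive key range [min_id, max_id] into contiguous slices.

    Hypothetical helper for illustration; not LakeXpress code.
    """
    total = max_id - min_id + 1
    base, extra = divmod(total, parts)
    bounds = []
    start = min_id
    for i in range(parts):
        # The first `extra` slices get one additional key each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more slices requested than keys available
        bounds.append((start, start + size - 1))
        start += size
    return bounds


print(range_bounds(1, 100, 4))  # [(1, 25), (26, 50), (51, 75), (76, 100)]
```

Slices like these are equal in key width, not in row count. If the key has gaps or is skewed, some slices hold far more rows than others, which is a general property of min/max splitting and a reason to consider a method that balances rows instead.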
Tuning Parameters
| Parameter | Default | Description |
|---|---|---|
| --n_jobs | 4 | Tables exported simultaneously |
| --fastbcp_p | -4 | Parallel threads per large table |
| --large_table_threshold | 100000 | Row count below which tables export sequentially |
Parallel degree guidelines (--fastbcp_p):
- < 100K rows: -4 (the default, auto-scaled)
- 100K-10M rows: 2-4
- 10M-100M rows: 4-8
- > 100M rows: 8-16
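The guidelines above can be read as a simple lookup from row count to a suggested range for --fastbcp_p. A minimal Python sketch follows. `suggested_parallel_degree` is a hypothetical helper, not part of LakeXpress, and assigning the shared boundaries (exactly 10M and 100M rows) to the lower bracket is an assumption.

```python
from typing import Tuple


def suggested_parallel_degree(row_count: int) -> Tuple[int, int]:
    """Return the (low, high) range of --fastbcp_p values from the guidelines.

    Hypothetical helper; boundary handling at 10M and 100M is an assumption.
    """
    if row_count < 100_000:
        return (-4, -4)  # the default, auto-scaled
    if row_count <= 10_000_000:
        return (2, 4)
    if row_count <= 100_000_000:
        return (4, 8)
    return (8, 16)


print(suggested_parallel_degree(50_000_000))  # (4, 8)
```

Treat the result as a starting point and adjust it against measured export durations.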
Per-Table CLI Override
--fastbcp_table_config "table:method:key_column:parallel_degree"
Separate multiple tables with semicolons. Empty fields fall back to the defaults.
--fastbcp_table_config "lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4"
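When many tables need overrides, it may be easier to generate the value than to type it by hand. The Python sketch below assembles the string in the "table:method:key_column:parallel_degree" format described above. `table_config_entry` and `table_config` are hypothetical helper names, and the sketch assumes that no field contains a colon or semicolon, since the source does not say how such characters would be escaped.

```python
from typing import Iterable, Optional


def table_config_entry(
    table: str,
    method: str = "",
    key_column: str = "",
    parallel_degree: Optional[int] = None,
) -> str:
    """Build one "table:method:key_column:parallel_degree" entry.

    Fields left empty are emitted as empty strings so the defaults apply.
    """
    degree = "" if parallel_degree is None else str(parallel_degree)
    return ":".join([table, method, key_column, degree])


def table_config(entries: Iterable[str]) -> str:
    """Join entries with semicolons, one entry per table."""
    return ";".join(entries)


value = table_config([
    table_config_entry("lineitem", "DataDriven", "YEAR(l_shipdate)", 8),
    table_config_entry("orders", "Ctid", parallel_degree=4),
])
print(f'--fastbcp_table_config "{value}"')
```

This reproduces the example value shown above, including the empty key column field for `orders`.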
Per-Table Database Override
Insert into the partition_columns table for persistent configuration:
INSERT INTO partition_columns (
    source_db_type, source_database, source_schema, source_table,
    fastbcp_method, fastbcp_parallel_degree, fastbcp_distribution_key
) VALUES (
    'postgres', 'analytics', 'public', 'events',
    'Ctid', 4, NULL
);
Database configuration takes priority over CLI configuration, which takes priority over auto-detection.
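The priority order can be pictured as a first-match lookup. The Python sketch below is an illustration under assumptions: `resolve_setting` is a hypothetical helper, `None` stands for "not configured at this level", and the source does not state whether LakeXpress applies the priority per field or per table, so the sketch resolves a single setting at a time.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_setting(db_value: Optional[T], cli_value: Optional[T], auto_value: T) -> T:
    """Return the first configured value: database, then CLI, then auto-detection.

    Hypothetical helper; None means the level does not configure the setting.
    """
    for value in (db_value, cli_value):
        if value is not None:
            return value
    return auto_value


# The database row wins over the CLI flag and over auto-detection.
print(resolve_setting("Ctid", "RangeId", "None"))  # Ctid
# With no database row, the CLI flag applies.
print(resolve_setting(None, "RangeId", "Ctid"))    # RangeId
```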
Monitoring
To find the slowest tables in a run, query the jobs table and sort by duration:
SELECT source_table, status, finished_at - started_at AS duration, row_count
FROM jobs
WHERE run_id = 'your-run-id'
ORDER BY duration DESC;