Export Performance

LakeXpress auto-detects the best export method for each table. To override that choice, use CLI flags or the partition_columns table in the LakeXpress DB.

Export Methods

Without Key Column

Method  | Database   | Description
None    | All        | Sequential export (no parallelism)
Ctid    | PostgreSQL | Tuple ID-based parallel export
Rowid   | Oracle     | Row ID-based parallel export
Physloc | SQL Server | Physical location-based parallel export

With Key Column

Method     | Description                    | Key Column
Random     | Distributes using column values | Integer or bigint
DataDriven | Splits on distinct values       | Column with distinct values (e.g., year, region)
RangeId    | Range partitioning via min/max  | Numeric or date column
Ntile      | Even distribution               | Numeric column
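
To make "range partitioning via min/max" concrete, here is a minimal Python sketch of the general idea: take the minimum and maximum of an integer key and cut that span into contiguous sub-ranges, one per parallel worker. The function name split_range is hypothetical and this is not LakeXpress's actual implementation, only an illustration of the technique.

```python
def split_range(lo: int, hi: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive range [lo, hi] into contiguous, inclusive sub-ranges.

    Hypothetical helper for illustration; not part of LakeXpress.
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if hi < lo:
        raise ValueError("hi must be >= lo")

    total = hi - lo + 1
    parts = min(parts, total)           # avoid creating empty ranges
    base, extra = divmod(total, parts)  # the first `extra` ranges get one more value

    ranges = []
    start = lo
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Each tuple could back a predicate such as: key BETWEEN start AND end
print(split_range(1, 10, 4))
```

Ranges of equal width hold equal row counts only when the key values are spread uniformly between min and max, which is worth keeping in mind when choosing between RangeId and Ntile.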

Tuning Parameters

Parameter               | Default | Description
--n_jobs                | 4       | Tables exported simultaneously
--fastbcp_p             | -4      | Parallel threads per large table
--large_table_threshold | 100000  | Row count below which tables export sequentially

Parallel degree guidelines (--fastbcp_p):

  • < 100K rows: -4 (default auto-scaled)
  • 100K-10M rows: 2-4
  • 10M-100M rows: 4-8
  • > 100M rows: 8-16
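
The guidelines above can be encoded as a small lookup, which may be handy when scripting overrides for many tables. This is a sketch only: suggest_parallel_degree is a hypothetical helper, and how the boundary values (exactly 10M or 100M rows) are assigned is an assumption, since the ranges in the list overlap at their edges.

```python
def suggest_parallel_degree(row_count: int) -> tuple[int, int]:
    """Return the (low, high) range suggested for --fastbcp_p.

    Hypothetical helper mirroring the guideline list; boundary handling
    (10M and 100M fall into the lower bucket) is an assumption.
    """
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    if row_count < 100_000:
        return (-4, -4)   # keep the default value
    if row_count <= 10_000_000:
        return (2, 4)
    if row_count <= 100_000_000:
        return (4, 8)
    return (8, 16)


for rows in (50_000, 5_000_000, 50_000_000, 500_000_000):
    print(rows, suggest_parallel_degree(rows))
```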

Per-Table CLI Override

--fastbcp_table_config "table:method:key_column:parallel_degree"

Separate multiple tables with semicolons. Empty fields fall back to their defaults.

--fastbcp_table_config "lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4"
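
When many tables need overrides, building the string by hand gets error-prone. The following Python sketch assembles a value in the table:method:key_column:parallel_degree format shown above. TableOverride and build_table_config are hypothetical names, not part of LakeXpress, and the sketch always emits all four fields because the source does not say whether trailing empty fields may be dropped.

```python
from typing import Iterable, NamedTuple, Optional


class TableOverride(NamedTuple):
    """One per-table entry; empty or None fields are left blank."""
    table: str
    method: str = ""
    key_column: str = ""
    parallel_degree: Optional[int] = None


def build_table_config(overrides: Iterable[TableOverride]) -> str:
    """Join overrides into a 'table:method:key:degree;...' string."""
    entries = []
    for o in overrides:
        degree = "" if o.parallel_degree is None else str(o.parallel_degree)
        fields = [o.table, o.method, o.key_column, degree]
        for field in fields:
            # The separators cannot appear inside a field in this simple format.
            if ":" in field or ";" in field:
                raise ValueError(f"field contains a separator: {field!r}")
        entries.append(":".join(fields))
    return ";".join(entries)


config = build_table_config([
    TableOverride("lineitem", "DataDriven", "YEAR(l_shipdate)", 8),
    TableOverride("orders", "Ctid", parallel_degree=4),
])
print(config)  # lineitem:DataDriven:YEAR(l_shipdate):8;orders:Ctid::4
```

The printed value can then be passed, quoted, to --fastbcp_table_config.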

Per-Table Database Override

Insert into the partition_columns table for persistent configuration:

INSERT INTO partition_columns (
    source_db_type, source_database, source_schema, source_table,
    fastbcp_method, fastbcp_parallel_degree, fastbcp_distribution_key
) VALUES (
    'postgres', 'analytics', 'public', 'events',
    'Ctid', 4, NULL
);

Database configuration takes priority over CLI configuration, which takes priority over auto-detection.
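
The precedence rule can be pictured as a simple fallback chain. The Python sketch below is illustrative only: resolve_setting is a hypothetical function, and applying the rule to each setting separately (rather than to a table's whole configuration at once) is an assumption the source does not confirm.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_setting(db_value: Optional[T], cli_value: Optional[T], auto_value: T) -> T:
    """Pick the effective value: database first, then CLI, then auto-detection.

    None means "not configured at this level". Hypothetical helper.
    """
    for value in (db_value, cli_value):
        if value is not None:
            return value
    return auto_value


# Database entry wins over the CLI flag and the auto-detected method.
print(resolve_setting("Ctid", "RangeId", "Random"))
# No database entry: the CLI flag wins.
print(resolve_setting(None, "RangeId", "Random"))
# Nothing configured: fall back to auto-detection.
print(resolve_setting(None, None, "Random"))
```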

Monitoring

Query the jobs table to list per-table durations and row counts for a run, slowest first:

SELECT source_table, status, finished_at - started_at AS duration, row_count
FROM jobs
WHERE run_id = 'your-run-id'
ORDER BY duration DESC;
Copyright © 2026 Architecture & Performance.