Quick start guide
Two steps: create a config, then run a sync.
Preparation
1. Install LakeXpress
Download the LakeXpress archive for your platform and unzip it. FastBCP is bundled in the archive. See the Installation guide for details.
2. Prepare the source database
LakeXpress needs read-only access to export tables from your source database. Create a dedicated user with SELECT privileges on the schemas you want to export.
PostgreSQL example:
CREATE USER lakexpress_reader WITH PASSWORD 'your_password';
GRANT USAGE ON SCHEMA public TO lakexpress_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO lakexpress_reader;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO lakexpress_reader;
For other databases, see the Database Configuration page (Oracle, SQL Server, MySQL, MariaDB, SAP HANA).
3. Prepare the LakeXpress DB
LakeXpress needs its own database to store configuration, track syncs, and log runs. It creates the schema automatically on first run -- you just need an empty database with a user that has full privileges.
PostgreSQL example:
CREATE DATABASE lakexpress_log;
CREATE USER lakexpress_admin WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE lakexpress_log TO lakexpress_admin;
For quick testing, SQLite and DuckDB need no server -- just point to a file path in your credentials file.
For SQL Server or MySQL alternatives, see Database Configuration.
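As an illustration, a file-backed SQLite entry in the credentials file might look like the fragment below. The ds_type value and info keys shown here are assumptions for illustration; check the Database Configuration page for the exact schema.

```json
{
  "lxdb_sqlite": {
    "ds_type": "sqlite",
    "auth_mode": "classic",
    "info": {
      "database": "./lakexpress_log.db"
    }
  }
}
```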
4. Set up storage
For the simplest setup, create a local directory to receive the exported Parquet files:
mkdir ./exports
For cloud storage (S3, GCS, Azure Blob), see Storage Overview.
Create a credentials file
The credentials file declares each connection under an ID of your choosing -- here the LakeXpress DB (lxdb_postgres), the source database (source_postgres), and the target storage (s3_01):
{
"lxdb_postgres": {
"ds_type": "postgres",
"auth_mode": "classic",
"info": {
"username": "$env{LX_LXDB_USER}",
"password": "$env{LX_LXDB_PASSWORD}",
"server": "localhost",
"port": 5432,
"database": "lakexpress_log"
}
},
"source_postgres": {
"ds_type": "postgres",
"auth_mode": "classic",
"info": {
"username": "$env{LX_PG_USER}",
"password": "$env{LX_PG_PASSWORD}",
"server": "localhost",
"port": 5432,
"database": "production_db"
}
},
"s3_01": {
"ds_type": "s3",
"auth_mode": "profile",
"info": {
"directory": "s3://my-data-lake/exports",
"profile": "your-aws-profile"
}
}
}
Save as credentials.json in a secure location.
Use $env{VAR_NAME} in any string value. An error is raised if the variable is not set. Plain-text values also work.
# Linux
export LX_PG_USER="postgres"
export LX_PG_PASSWORD="your_password"
# Windows (cmd)
set LX_PG_USER=postgres
set LX_PG_PASSWORD=your_password
# Windows (PowerShell)
$env:LX_PG_USER = "postgres"
$env:LX_PG_PASSWORD = "your_password"
For other databases (Oracle, SQL Server, MySQL) and storage backends (GCS, Azure), see Database Configuration and Intermediate Storage.
Initialize the LakeXpress DB (optional)
The LakeXpress DB tracks syncs, runs, and table exports. LakeXpress creates the schema automatically on first sync, so this step is optional.
Use lxdb init to verify connectivity or pre-create the schema for audit purposes.
Windows (PowerShell)
.\LakeXpress.exe lxdb init `
-a credentials.json `
--lxdb_auth_id lxdb_postgres
Linux
./LakeXpress lxdb init \
-a credentials.json \
--lxdb_auth_id lxdb_postgres
Create a sync configuration
The configuration is stored in the LakeXpress DB and reused for every sync.
Export to local filesystem
Windows (PowerShell)
.\LakeXpress.exe config create `
-a credentials.json `
--lxdb_auth_id lxdb_postgres `
--source_db_auth_id source_postgres `
--source_db_name public `
--source_schema_name public `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--output_dir .\exports `
--n_jobs 4 `
--fastbcp_p 2
Linux
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4 \
--fastbcp_p 2
Exports all tables from the public schema to ./exports/public/table_name/, processing 4 tables in parallel with 2-way partitioning per table.
Run the sync
Windows (PowerShell)
.\LakeXpress.exe sync
Linux
./LakeXpress sync
Loads the config from the LakeXpress DB, exports tables, and shows real-time progress.
More examples
Export to cloud storage
Export to AWS S3 with CDM metadata:
Windows (PowerShell)
.\LakeXpress.exe config create `
-a credentials.json `
--lxdb_auth_id lxdb_postgres `
--source_db_auth_id source_postgres `
--source_db_name tpch `
--source_schema_name public `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--target_storage_id s3_01 `
--n_jobs 4 `
--fastbcp_p 2 `
--generate_metadata
.\LakeXpress.exe sync
Linux
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name tpch \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--n_jobs 4 \
--fastbcp_p 2 \
--generate_metadata
./LakeXpress sync
Exports to S3 and generates CDM metadata files.
Filter tables with patterns
Use include/exclude patterns to select specific tables:
Windows (PowerShell)
.\LakeXpress.exe config create `
-a credentials.json `
--lxdb_auth_id lxdb_postgres `
--source_db_auth_id source_postgres `
--source_db_name public `
--source_schema_name public `
--include "orders%, customer%, product%" `
--exclude "temp%, test%" `
--fastbcp_dir_path .\FastBCP_win-x64\latest\ `
--output_dir .\exports `
--n_jobs 4
.\LakeXpress.exe sync
Linux
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--include "orders%, customer%, product%" \
--exclude "temp%, test%" \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--output_dir ./exports \
--n_jobs 4
./LakeXpress sync
Includes tables matching orders%, customer%, or product%; excludes those matching temp% or test%.
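The patterns use SQL LIKE-style % wildcards. The selection logic can be sketched like this (illustrative Python, assuming LIKE semantics; the actual matching is done by LakeXpress):

```python
import re

def like_match(pattern: str, name: str) -> bool:
    """Match a SQL LIKE-style pattern, where % means 'any sequence'."""
    regex = "^" + ".*".join(re.escape(p) for p in pattern.strip().split("%")) + "$"
    return re.match(regex, name) is not None

def select_tables(tables, include, exclude):
    """Keep tables matching any include pattern and no exclude pattern."""
    kept = [t for t in tables if any(like_match(p, t) for p in include)]
    return [t for t in kept if not any(like_match(p, t) for p in exclude)]

tables = ["orders", "orders_archive", "customer", "temp_orders", "test_data"]
print(select_tables(tables, ["orders%", "customer%"], ["temp%", "test%"]))
# ['orders', 'orders_archive', 'customer']
```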
Incremental sync
Use a watermark column so subsequent syncs only export new rows:
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_ms \
--source_db_auth_id source_pg \
--source_db_name tpch \
--source_schema_name tpch_1_incremental \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--incremental_table "tpch_1_incremental.orders:o_orderdate:date" \
--incremental_table "tpch_1_incremental.lineitem:l_shipdate:date" \
--generate_metadata
# First sync: exports everything and records high watermarks
./LakeXpress sync
# Later syncs: only exports rows past the watermark
./LakeXpress sync
Tables not configured as incremental are fully exported each sync. See the Incremental Sync guide for details.
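Conceptually, watermark tracking works like the sketch below (illustrative only; LakeXpress persists watermarks in its DB and pushes the filter into the source query):

```python
def incremental_export(rows, column, watermark=None):
    """Return the rows past the stored watermark and the new high watermark.

    rows: list of dicts; column: the configured watermark column.
    On the first sync (watermark is None) every row is exported.
    """
    if watermark is None:
        exported = list(rows)
    else:
        exported = [r for r in rows if r[column] > watermark]
    new_watermark = max((r[column] for r in exported), default=watermark)
    return exported, new_watermark

orders = [{"o_orderdate": d} for d in ("2025-01-01", "2025-01-02")]
first, wm = incremental_export(orders, "o_orderdate")       # full export
later, wm2 = incremental_export(orders, "o_orderdate", wm)  # nothing new
print(len(first), len(later))  # 2 0
```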
Resume failed syncs
./LakeXpress sync --run_id 20251208-f7g8h9i0-j1k2-l3m4 --resume
Skips completed tables and retries only the failed ones.
Snowflake publishing
Export to S3 and create Snowflake external tables in one step:
./LakeXpress config create \
-a credentials.json \
--lxdb_auth_id lxdb_postgres \
--source_db_auth_id source_postgres \
--source_db_name public \
--source_schema_name public \
--fastbcp_dir_path ./FastBCP_linux-x64/latest/ \
--target_storage_id s3_01 \
--publish_target snowflake_prod \
--n_jobs 4
./LakeXpress sync
Query the data in Snowflake:
SELECT * FROM PUBLIC.V_CUSTOMER LIMIT 10;
For internal tables with primary key constraints, add --publish_method internal --snowflake_pk_constraints. See the Snowflake Publishing Guide.
Reference
List configurations
./LakeXpress config list \
-a credentials.json \
--lxdb_auth_id lxdb_postgres
Check sync status
./LakeXpress status -a credentials.json --lxdb_auth_id lxdb_postgres --sync_id <your-sync-id>
Manage the LakeXpress DB
# Initialize the schema
./LakeXpress lxdb init -a credentials.json --lxdb_auth_id lxdb_postgres
# Clear run history (keeps schema)
./LakeXpress lxdb truncate -a credentials.json --lxdb_auth_id lxdb_postgres
# Drop the schema
./LakeXpress lxdb drop -a credentials.json --lxdb_auth_id lxdb_postgres --confirm
Next steps
- CLI reference - all available options
- Incremental sync - continuous updates
- Intermediate Storage - S3, GCS, Azure, local
- Database configuration - all supported databases
- Examples & Recipes - real-world scenarios