LakeXpress

What is LakeXpress?

LakeXpress is a CLI tool that exports database tables to partitioned Parquet files and publishes them to cloud data platforms. It uses FastBCP to stream data in parallel without exhausting memory.

Architecture

LakeXpress connects four components in a data pipeline:

Source Database

The database you want to export data from. LakeXpress reads tables and views from the source and converts them to Parquet files.

Supported sources: Oracle, PostgreSQL, SQL Server, MySQL, MariaDB, SAP HANA, Teradata.

Intermediate Storage

Parquet files are written to a storage backend before being registered in a target platform. This can be local disk or cloud object storage.

Supported backends: local filesystem, AWS S3, S3-compatible (MinIO, etc.), Google Cloud Storage, Azure Blob Storage.

Target Platform

Once Parquet files are in storage, LakeXpress registers them as external tables in a cloud data platform or catalog.

Supported platforms: Snowflake, Databricks, Microsoft Fabric, Amazon Redshift, BigQuery, MotherDuck, AWS Glue, DuckLake.

LakeXpress DB

A dedicated database where LakeXpress logs run history, job metadata, and exported file details. This enables tracking, auditing, and resuming failed exports.

Supported databases: PostgreSQL, SQL Server, MySQL, SQLite, DuckDB.
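
To make the resume idea concrete, here is a minimal sketch of how a run log can drive a restart. It is not LakeXpress's actual schema or logic: the export_log table, its columns, the status values, and the pending_tables helper are all hypothetical. SQLite is used only because it needs no server.

```python
import sqlite3


def pending_tables(conn, run_id, all_tables):
    """Return the tables of a run that have not been exported successfully."""
    done = {
        row[0]
        for row in conn.execute(
            "SELECT table_name FROM export_log "
            "WHERE run_id = ? AND status = 'done'",
            (run_id,),
        )
    }
    # Preserve the original order so a resumed run is predictable.
    return [t for t in all_tables if t not in done]


# Hypothetical log layout: one row per table per run.
log = sqlite3.connect(":memory:")
log.execute(
    "CREATE TABLE export_log (run_id TEXT, table_name TEXT, status TEXT)"
)
log.executemany(
    "INSERT INTO export_log VALUES (?, ?, ?)",
    [("run-1", "customers", "done"), ("run-1", "orders", "failed")],
)

remaining = pending_tables(log, "run-1", ["customers", "orders", "invoices"])
print(remaining)
```

In this sketch a resumed run retries the failed table and continues with any table that was never started, while skipping tables already marked as done.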

What you'll need

Component | What to do | Details
LakeXpress binary | Download and unzip | Installation guide
Source database user | Create a read-only user with SELECT on target schemas | Database setup
LakeXpress DB | A new or existing database with a user that has full privileges | Database setup
Storage destination | A local directory or cloud storage credentials | Storage config
Publishing target (optional) | Credentials for Snowflake, Databricks, etc. | Snowflake · Databricks · more...

Key features

  • Cross-platform: Native binaries for Windows and Linux
  • Parallel exports: Multiple tables at once, with per-table partitioning
  • Incremental sync: Watermark-based delta exports
  • Schema filtering: Include/exclude schemas and tables via SQL patterns
  • Resume on failure: Pick up where a failed export left off
  • CDM metadata: Generate Common Data Model files
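
As an illustration of the watermark-based approach named in the list above, the following sketch shows the general technique. It is an assumption about how such a sync can work, not LakeXpress's implementation: the export_delta function, the orders table, and the updated_at column are hypothetical.

```python
import sqlite3


def export_delta(conn, table, watermark_column, last_watermark):
    """Fetch rows newer than last_watermark and return them with the new watermark.

    Table and column names are interpolated directly, so this sketch assumes
    they come from trusted configuration, never from user input.
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_column} > ? "
        f"ORDER BY {watermark_column}"
    )
    cur = conn.execute(query, (last_watermark,))
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    if not rows:
        # Nothing new: keep the previous watermark.
        return rows, last_watermark
    idx = columns.index(watermark_column)
    # Rows are ordered by the watermark column, so the last row holds the maximum.
    return rows, rows[-1][idx]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-02"), (3, "2024-01-03")],
)

# First run: an empty watermark selects every row.
first_rows, watermark = export_delta(conn, "orders", "updated_at", "")

# A new row arrives; the second run picks up only that row.
conn.execute("INSERT INTO orders VALUES (4, '2024-01-04')")
second_rows, new_watermark = export_delta(conn, "orders", "updated_at", watermark)
```

The watermark would normally be persisted between runs; here it is simply passed from one call to the next.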

How it works

  1. Configure -- LakeXpress config create defines your source database, storage destination, and optional publishing target
  2. Sync -- LakeXpress sync exports tables to Parquet and publishes to your catalog

See the Quick Start Guide for a full walkthrough with real commands.
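
For scripted runs, the two steps above could be chained from Python as in this sketch. It uses only the two subcommands named on this page and passes no arguments, because the required options are not listed here; take them from the Quick Start Guide. The build_steps and run_steps helpers are hypothetical, and the binary name on your system may differ.

```python
import subprocess


def build_steps(binary="LakeXpress"):
    """Return the two commands in execution order: configure, then sync."""
    # Arguments are intentionally omitted; add the ones your setup requires.
    return [
        [binary, "config", "create"],
        [binary, "sync"],
    ]


def run_steps(steps):
    """Run each command in order and stop at the first failure."""
    for argv in steps:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(argv, check=True)


steps = build_steps()
for argv in steps:
    print(" ".join(argv))
```

Calling run_steps(steps) would execute the commands; the sketch only prints them so it can run without the binary installed.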

Copyright © 2026 Architecture & Performance.