
BigQuery Reference

Reference material for BigQuery Publishing: data type mappings, CLI arguments, querying, and troubleshooting.

Data Type Mapping

Source database types are mapped to BigQuery types automatically.

PostgreSQL to BigQuery

PostgreSQL Type          BigQuery Type
INTEGER, INT4            INT64
BIGINT, INT8             INT64
SMALLINT, INT2           INT64
NUMERIC(p,s)             NUMERIC
REAL, FLOAT4             FLOAT64
DOUBLE PRECISION         FLOAT64
VARCHAR(n), TEXT         STRING
DATE                     DATE
TIMESTAMP                TIMESTAMP
TIMESTAMPTZ              TIMESTAMP
TIME                     TIME
BOOLEAN                  BOOL
BYTEA                    BYTES
JSON, JSONB              JSON
UUID                     STRING
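The PostgreSQL mapping above can be sketched as a small lookup table. This is an illustrative snippet, not part of the tool itself; the `to_bigquery_type` helper and its modifier-stripping behavior are assumptions for the example.

```python
# PostgreSQL -> BigQuery type mapping from the table above,
# keyed by the base type name (length/precision modifiers stripped).
PG_TO_BQ = {
    "INTEGER": "INT64", "INT4": "INT64",
    "BIGINT": "INT64", "INT8": "INT64",
    "SMALLINT": "INT64", "INT2": "INT64",
    "NUMERIC": "NUMERIC",
    "REAL": "FLOAT64", "FLOAT4": "FLOAT64",
    "DOUBLE PRECISION": "FLOAT64",
    "VARCHAR": "STRING", "TEXT": "STRING",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP", "TIMESTAMPTZ": "TIMESTAMP",
    "TIME": "TIME",
    "BOOLEAN": "BOOL",
    "BYTEA": "BYTES",
    "JSON": "JSON", "JSONB": "JSON",
    "UUID": "STRING",
}

def to_bigquery_type(pg_type: str) -> str:
    """Map a PostgreSQL type name to its BigQuery equivalent,
    dropping any (n) or (p,s) modifier first."""
    base = pg_type.upper().split("(")[0].strip()
    return PG_TO_BQ[base]

print(to_bigquery_type("varchar(255)"))   # STRING
print(to_bigquery_type("numeric(10,2)"))  # NUMERIC
```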

SQL Server to BigQuery

SQL Server Type            BigQuery Type
INT                        INT64
BIGINT                     INT64
SMALLINT                   INT64
TINYINT                    INT64
DECIMAL(p,s)               NUMERIC
MONEY                      NUMERIC
FLOAT                      FLOAT64
REAL                       FLOAT64
VARCHAR(n), NVARCHAR(n)    STRING
DATE                       DATE
DATETIME, DATETIME2        DATETIME
DATETIMEOFFSET             TIMESTAMP
TIME                       TIME
BIT                        BOOL
VARBINARY                  BYTES
UNIQUEIDENTIFIER           STRING

Oracle to BigQuery

Oracle Type            BigQuery Type
NUMBER                 NUMERIC
NUMBER(p,s)            NUMERIC
FLOAT                  FLOAT64
VARCHAR2, NVARCHAR2    STRING
CLOB, NCLOB            STRING
DATE                   TIMESTAMP
TIMESTAMP              TIMESTAMP
RAW, BLOB              BYTES

CLI Reference

BigQuery Publishing Arguments

Option                              Type     Description
--publish_target ID                 String   Credential ID for BigQuery publishing (required)
--publish_schema_pattern PATTERN    String   Dynamic dataset naming pattern (default: {schema})
--publish_table_pattern PATTERN     String   Dynamic table naming pattern (default: {table})
--publish_method TYPE               String   Table type: external (default) or internal
--n_jobs N                          Integer  Parallel workers for table creation (default: 1)
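The naming patterns above can be illustrated with a short sketch. This assumes the {schema} and {table} placeholders expand by simple substitution (as in Python's str.format); the `expand_pattern` helper is hypothetical, not part of the CLI.

```python
# Sketch of how --publish_schema_pattern / --publish_table_pattern
# placeholders might expand (assumption: plain placeholder substitution).
def expand_pattern(pattern: str, schema: str, table: str) -> str:
    return pattern.format(schema=schema, table=table)

# e.g. --publish_schema_pattern 'lx_{schema}' applied to schema "tpch_1"
print(expand_pattern("lx_{schema}", schema="tpch_1", table="customer"))  # lx_tpch_1
print(expand_pattern("{table}", schema="tpch_1", table="customer"))      # customer
```

A dataset name like `lx_tpch_1` in the query examples below is consistent with a pattern such as `lx_{schema}`.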

Querying BigQuery Tables

BigQuery Console:

SELECT * FROM `my-project.lx_tpch_1.customer` LIMIT 10;

bq CLI:

bq query --use_legacy_sql=false \
  'SELECT * FROM `my-project.lx_tpch_1.customer` LIMIT 10'

Python (google-cloud-bigquery):

from google.cloud import bigquery

client = bigquery.Client()
query = "SELECT * FROM `my-project.lx_tpch_1.customer` LIMIT 10"
df = client.query(query).to_dataframe()
print(df)

pandas-gbq:

import pandas_gbq

query = "SELECT * FROM `my-project.lx_tpch_1.customer` LIMIT 10"
df = pandas_gbq.read_gbq(query, project_id="my-project")
print(df)

Troubleshooting

Common Issues

"Permission denied" errors:

  • Verify the service account has BigQuery Data Editor role
  • Check GCS bucket access from the service account
  • Ensure project ID in credentials matches the target project

"Dataset not found" errors:

  • Check that location matches your GCS bucket region
  • Cross-region access between GCS and BigQuery may cause issues

"Invalid table" errors for external tables:

  • Verify GCS path contains valid Parquet files
  • Check schema mapping for your data types

Verifying Setup

Test BigQuery connectivity:

lakexpress -a credentials.json \
  --lxdb_auth_id lxdb \
  --source_db_auth_id postgres_prod \
  --source_schema_name public \
  --target_storage_id gcs_datalake \
  --fastbcp_dir_path /path/to/FastBCP \
  --publish_target bigquery_prod \
  --tables customer \
  --dry_run

Validate credentials:

from google.cloud import bigquery
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/path/to/service-account.json'
)
client = bigquery.Client(credentials=credentials, project='my-project')
print(list(client.list_datasets()))


Copyright © 2026 Architecture & Performance.