AWS Glue Reference
Reference material for AWS Glue Publishing: data type mappings, CLI arguments, and query examples.
Data Type Mapping
Source types map automatically to Glue-compatible types.
PostgreSQL to Glue
| PostgreSQL Type | Glue Type |
|---|---|
| INTEGER, INT4 | int |
| BIGINT, INT8 | bigint |
| SMALLINT, INT2 | smallint |
| NUMERIC(p,s) | decimal(p,s) |
| REAL, FLOAT4 | float |
| DOUBLE PRECISION | double |
| VARCHAR(n), TEXT | string |
| DATE | date |
| TIMESTAMP | timestamp |
| BOOLEAN | boolean |
| BYTEA | binary |
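
The table above can be read as a lookup keyed on the base type name, with the precision and scale of `NUMERIC(p,s)` carried over to `decimal(p,s)`. The following is a minimal sketch of that reading, not the tool's actual implementation; `PG_TO_GLUE` and `pg_type_to_glue` are hypothetical names, and types not listed in the table are rejected rather than guessed.

```python
import re

# Lookup transcribed from the PostgreSQL-to-Glue table above.
PG_TO_GLUE = {
    "INTEGER": "int", "INT4": "int",
    "BIGINT": "bigint", "INT8": "bigint",
    "SMALLINT": "smallint", "INT2": "smallint",
    "REAL": "float", "FLOAT4": "float",
    "DOUBLE PRECISION": "double",
    "VARCHAR": "string", "TEXT": "string",
    "DATE": "date",
    "TIMESTAMP": "timestamp",
    "BOOLEAN": "boolean",
    "BYTEA": "binary",
}

# A type name with optional parenthesized arguments, e.g. NUMERIC(10,2).
_TYPE_RE = re.compile(r"^\s*([A-Za-z0-9 ]+?)\s*(?:\(\s*([0-9,\s]+)\s*\))?\s*$")


def pg_type_to_glue(pg_type: str) -> str:
    """Hypothetical helper: map a PostgreSQL type string to a Glue type."""
    match = _TYPE_RE.match(pg_type)
    if match is None:
        raise ValueError(f"unrecognized type: {pg_type!r}")
    # Normalize case and internal whitespace ("double  precision" -> "DOUBLE PRECISION").
    base = " ".join(match.group(1).upper().split())
    args = match.group(2)
    if base == "NUMERIC":
        parts = [part.strip() for part in (args or "").split(",")]
        if len(parts) != 2 or not all(parts):
            raise ValueError("only NUMERIC(p,s) is covered by the table")
        return f"decimal({parts[0]},{parts[1]})"
    if base not in PG_TO_GLUE:
        raise ValueError(f"no mapping listed for {pg_type!r}")
    # Length arguments such as VARCHAR(n) are dropped: the Glue type is plain string.
    return PG_TO_GLUE[base]
```

For example, `pg_type_to_glue("VARCHAR(255)")` gives `string` and `pg_type_to_glue("NUMERIC(10, 2)")` gives `decimal(10,2)`.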
SQL Server to Glue
| SQL Server Type | Glue Type |
|---|---|
| INT | int |
| BIGINT | bigint |
| SMALLINT | smallint |
| TINYINT | tinyint |
| DECIMAL(p,s) | decimal(p,s) |
| FLOAT | double |
| REAL | float |
| VARCHAR(n), NVARCHAR(n) | string |
| DATE | date |
| DATETIME, DATETIME2 | timestamp |
| BIT | boolean |
| VARBINARY | binary |
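
To confirm that a published table ended up with the types listed in these mapping tables, the column types can be read back from the Glue Data Catalog. The sketch below assumes a boto3 Glue client (for example `boto3.client("glue")`) is passed in; `glue_column_types` and `type_mismatches` are hypothetical helper names, and partition keys, which Glue reports separately, are not included.

```python
def glue_column_types(glue_client, database: str, table: str) -> dict:
    """Return {column_name: glue_type} for one table in the Glue Data Catalog."""
    response = glue_client.get_table(DatabaseName=database, Name=table)
    columns = response["Table"]["StorageDescriptor"]["Columns"]
    return {col["Name"]: col.get("Type") for col in columns}


def type_mismatches(actual: dict, expected: dict) -> dict:
    """Return {column: (expected_type, actual_type)} for columns that differ."""
    return {
        name: (expected[name], actual.get(name))
        for name in expected
        if actual.get(name) != expected[name]
    }
```

With real credentials, a call such as `glue_column_types(boto3.client("glue"), "lx_tpch_1", "customer")` would return the catalog's view of the table; the database and table names here are taken from the query examples in this document.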
CLI Reference
Glue Publishing Arguments
| Option | Type | Description |
|---|---|---|
| --publish_target ID | String | Credential ID for Glue publishing (required) |
| --publish_schema_pattern PATTERN | String | Database naming pattern (default: {schema}) |
| --publish_table_pattern PATTERN | String | Table naming pattern (default: {table}) |
| --glue_skip_existing | Flag | Skip existing tables instead of dropping and recreating them |
| --n_jobs N | Integer | Parallel workers for table creation (default: 1) |
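
The naming patterns are easiest to understand as simple placeholder substitution. The sketch below is an illustration only: the source states the defaults `{schema}` and `{table}` but not how the tool expands them, so the function name `expand_pattern`, the prefix pattern `lx_{schema}`, and the idea that both placeholders are usable in either pattern are all assumptions.

```python
def expand_pattern(pattern: str, schema: str, table: str = "") -> str:
    """Hypothetical helper: substitute {schema} and {table} in a naming pattern."""
    # Plain replacement is used instead of str.format so that any other
    # braces in the pattern are left untouched rather than raising KeyError.
    return pattern.replace("{schema}", schema).replace("{table}", table)


# Default patterns leave names unchanged.
database_name = expand_pattern("{schema}", schema="tpch_1")
table_name = expand_pattern("{table}", schema="tpch_1", table="customer")

# An assumed prefix pattern would yield a database name like the one
# used in the query examples in this document.
prefixed_database = expand_pattern("lx_{schema}", schema="tpch_1")
```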
Querying Glue Tables
Amazon Athena:
SELECT * FROM lx_tpch_1.customer LIMIT 10;
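
The Athena query above can also be run programmatically. The following sketch assumes a boto3 Athena client (for example `boto3.client("athena")`) and an S3 output location that you supply; `run_athena_query` is a hypothetical helper, and it reads only the first page of results, which is enough for a `LIMIT 10` query.

```python
import time


def run_athena_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Run a query in Athena and return the first page of rows as dicts."""
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"query {query_id} ended as {status['State']}: {reason}")

    result = athena_client.get_query_results(QueryExecutionId=query_id)
    rows = [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
    if not rows:
        return []
    # For SELECT queries the first row holds the column names.
    header, data = rows[0], rows[1:]
    return [dict(zip(header, values)) for values in data]
```

A call might look like `run_athena_query(client, "SELECT * FROM customer LIMIT 10", "lx_tpch_1", "s3://your-bucket/athena-results/")`, where the bucket path is a placeholder. All values come back as strings (or `None` for nulls), so numeric columns need converting by the caller.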
Amazon Redshift Spectrum:
SELECT * FROM spectrum_schema.customer LIMIT 10;
Amazon EMR (Spark):
df = spark.table("lx_tpch_1.customer")
df.show(10)
See Also
- AWS Glue Publishing - Setup and usage guide
- CLI Reference - All command-line options