AWS Glue - Null values in partition column #3165

Open
@jmklix

Original discussion: #2803

It seems that AWS Glue does not handle the GetPartitions API correctly when a partition column has a null value.
Example below. I am using the AWS CLI for simplicity, which gives the same output as the SDK.

My table data is structured as follows in S3:

s3://example-bucket/example_table/
├── int_partition_col=null/
│   ├── string_partition_col=null/
│   │   └── data-part-00001.csv
├── int_partition_col=1/
│   ├── string_partition_col=A/
│   │   └── data-part-00002.csv
└── int_partition_col=2/
    ├── string_partition_col=B/
    │   └── data-part-00003.csv

> aws glue get-partitions --database-name example_db --table-name example_table --expression "(int_partition_col >= 0)" ->
An error occurred (InvalidStateException) when calling the GetPartitions operation: For input string: "null" is not an integer.

> aws glue get-partitions --database-name example_db --table-name example_table --expression "(string_partition_col is null)" -> Returns empty

> aws glue get-partitions --database-name example_db --table-name example_table --expression "(string_partition_col = 'null')"-> works correctly

So it seems the null value is being treated as the string literal 'null'? But according to the documentation here, IS NULL and similar operators appear to be supported.
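
One possible client-side workaround, until the service behavior is clarified, is to fetch the partitions without the failing expression and filter them locally. The sketch below is only an illustration under stated assumptions: the helper names `parse_int_partition` and `filter_int_partitions` are hypothetical, each partition is assumed to carry a `Values` list of strings as in a GetPartitions response, `int_partition_col` is assumed to be the first partition key, and a null partition value is assumed to be stored as the literal string "null" (which is what the commands above suggest). The sample data is hard-coded to mirror the S3 layout above rather than fetched from Glue.

```python
from typing import Dict, List, Optional

# Assumption: a null partition value is stored as the literal string "null",
# as suggested by the (string_partition_col = 'null') query above.
NULL_SENTINEL = "null"


def parse_int_partition(value: str) -> Optional[int]:
    """Return the integer value, or None if the value is the "null" sentinel."""
    return None if value == NULL_SENTINEL else int(value)


def filter_int_partitions(
    partitions: List[Dict[str, List[str]]],
    column_index: int,
    min_value: int,
) -> List[Dict[str, List[str]]]:
    """Keep partitions whose integer column is non-null and >= min_value."""
    kept = []
    for partition in partitions:
        parsed = parse_int_partition(partition["Values"][column_index])
        if parsed is not None and parsed >= min_value:
            kept.append(partition)
    return kept


# Hard-coded sample mirroring the S3 layout above (not fetched from Glue).
sample = [
    {"Values": ["null", "null"]},
    {"Values": ["1", "A"]},
    {"Values": ["2", "B"]},
]

# Local equivalent of the expression "(int_partition_col >= 0)".
result = filter_int_partitions(sample, column_index=0, min_value=0)
print([p["Values"] for p in result])
```

In real use, the `sample` list would be replaced by the partitions returned from a GetPartitions call made without the integer comparison in the expression. This trades server-side filtering for extra data transfer, so it is only reasonable for tables with a modest number of partitions.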

Labels

bug: This issue is a bug.
glue
p3: This is a minor priority issue.
service-api: This issue is due to a problem in a service API, not the SDK implementation.
