Description
Original discussion: #2803
It seems that AWS Glue's GetPartitions API does not handle partition columns with a null value correctly.
In the example below I am using the AWS CLI for simplicity; it gives the same output as the SDK.
My table data is structured in S3 as follows:
s3://example-bucket/example_table/
├── int_partition_col=null/
│   └── string_partition_col=null/
│       └── data-part-00001.csv
├── int_partition_col=1/
│   └── string_partition_col=A/
│       └── data-part-00002.csv
└── int_partition_col=2/
    └── string_partition_col=B/
        └── data-part-00003.csv
> aws glue get-partitions --database-name example_db --table-name example_table --expression "(int_partition_col >= 0)" ->
An error occurred (InvalidStateException) when calling the GetPartitions operation: For input string: "null" is not an integer.
> aws glue get-partitions --database-name example_db --table-name example_table --expression "(string_partition_col is null)" -> Returns empty
> aws glue get-partitions --database-name example_db --table-name example_table --expression "(string_partition_col = 'null')" -> works correctly
So it seems the null partition value is being treated as the string literal 'null'? But according to the documentation here, IS NULL and similar operators are supported?
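As a possible stopgap while this is investigated, the filtering could be done on the client instead of through `--expression`. The sketch below is only an illustration, not a confirmed fix: it assumes that an unfiltered GetPartitions call succeeds for this table (the report above does not confirm this), and that null partitions come back with the literal string `"null"` in `Values`, as the third command suggests. The helper names `iter_partitions`, `int_partition_at_least`, and `null_partitions` are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Assumption: null partition values are stored as this literal string,
# matching the "int_partition_col=null" directories in the S3 layout.
NULL_MARKER = "null"


def iter_partitions(glue_client: Any, database: str, table: str) -> Iterator[Dict[str, Any]]:
    """Yield every partition of the table without a server-side expression."""
    paginator = glue_client.get_paginator("get_partitions")
    for page in paginator.paginate(DatabaseName=database, TableName=table):
        yield from page.get("Partitions", [])


def int_partition_at_least(
    partitions: Iterable[Dict[str, Any]], key_index: int, minimum: int
) -> List[Dict[str, Any]]:
    """Keep partitions whose integer key at key_index is >= minimum.

    Partitions whose key is the null marker are skipped instead of being
    parsed as integers.
    """
    kept = []
    for partition in partitions:
        raw = partition["Values"][key_index]
        if raw == NULL_MARKER:
            continue
        if int(raw) >= minimum:
            kept.append(partition)
    return kept


def null_partitions(
    partitions: Iterable[Dict[str, Any]], key_index: int
) -> List[Dict[str, Any]]:
    """Keep partitions whose key at key_index is the null marker."""
    return [p for p in partitions if p["Values"][key_index] == NULL_MARKER]
```

With boto3 this could be used as `int_partition_at_least(iter_partitions(boto3.client("glue"), "example_db", "example_table"), 0, 0)`, where index 0 is `int_partition_col` and index 1 is `string_partition_col`. Client-side filtering fetches every partition, so it is only reasonable for tables with a modest partition count.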