Exasol: IMPORT/EXPORT #2256

Open: ssteinhauser wants to merge 26 commits into master from feature/exasol-import-export

ssteinhauser
Contributor

This PR adds support for Exasol's IMPORT and EXPORT statements:
https://docs.exasol.com/db/latest/sql/import.htm
https://docs.exasol.com/db/latest/sql/export.htm

The JMH benchmark looks good on my machine:

Benchmark                               (version)  Mode  Cnt    Score    Error  Units
JSQLParserBenchmark.parseSQLStatements     latest  avgt   15  119.730 ± 35.438  ms/op
JSQLParserBenchmark.parseSQLStatements        5.3  avgt   15  244.662 ± 99.050  ms/op
JSQLParserBenchmark.parseSQLStatements        5.1  avgt   15  185.566 ± 76.161  ms/op

@manticore-projects
Contributor

@ssteinhauser:

Thank you for your contribution, I will look into it over the weekend.
Though I have a question regarding your JMH: can you please share the specs of your machine?

a) your tests look super slow in general (I am recording only 80 ms on a cheap laptop)
b) your tests record a huge improvement over 5.3 (which I could not explain)
c) your tests have very high variances

Not a real concern or show stopper, I am just curious!

@ssteinhauser force-pushed the feature/exasol-import-export branch from 4db2a71 to 40f916d on June 5, 2025 11:33
@ssteinhauser
Contributor Author

Yes, the benchmark results are very high, probably because I had quite a few other tools and tasks running that consumed a lot of CPU and memory. My machine has 32 GB of RAM and an Intel Core i7-1185G7 (3.00 GHz) CPU.

I've just re-run the benchmark (inside WSL) under lower load. The results are still a bit high, although the variances are lower:

Benchmark                               (version)  Mode  Cnt    Score    Error  Units
JSQLParserBenchmark.parseSQLStatements     latest  avgt   15  140.343 ± 26.198  ms/op
JSQLParserBenchmark.parseSQLStatements        5.3  avgt   15  103.779 ± 20.458  ms/op
JSQLParserBenchmark.parseSQLStatements        5.1  avgt   15  119.490 ± 27.174  ms/op

@manticore-projects
Contributor

I have benchmarked your branch on my laptop and get more reasonable results:

Benchmark                               (version)  Mode  Cnt   Score   Error  Units
JSQLParserBenchmark.parseSQLStatements     latest  avgt   15  85.965 ± 2.753  ms/op
JSQLParserBenchmark.parseSQLStatements        5.3  avgt   15  81.911 ± 3.433  ms/op
JSQLParserBenchmark.parseSQLStatements        5.1  avgt   15  82.380 ± 2.856  ms/op

Although there is a deterioration in performance, I think we can accept that.
Question: Could we introduce a feature switch to avoid parsing any of your statements unless explicitly requested?
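
To put the deterioration above into numbers, here is a small Java sketch that computes the relative slowdown of `latest` against 5.3 and checks whether the two `Score ± Error` intervals overlap. The class and method names are made up for illustration, and the sketch assumes that the Error column is the half-width of a confidence interval around the score (JMH's default reporting).

```java
import java.util.Locale;

// Hypothetical helper, not part of JSqlParser or JMH.
final class BenchmarkCompare {

    /** Relative change of candidate vs. baseline; 0.05 means 5% slower. */
    static double relativeChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline;
    }

    /** True if [scoreA - errA, scoreA + errA] and [scoreB - errB, scoreB + errB] intersect. */
    static boolean intervalsOverlap(double scoreA, double errA, double scoreB, double errB) {
        return scoreA - errA <= scoreB + errB && scoreB - errB <= scoreA + errA;
    }

    public static void main(String[] args) {
        // Numbers copied from the laptop run above (ms/op).
        double latest = 85.965, latestErr = 2.753;
        double v53 = 81.911, v53Err = 3.433;

        System.out.printf(Locale.ROOT, "slowdown vs 5.3: %.1f%%%n",
                100 * relativeChange(v53, latest));
        System.out.println("intervals overlap: "
                + intervalsOverlap(latest, latestErr, v53, v53Err));
    }
}
```

With these figures the slowdown is roughly 4.9%, and the two intervals overlap, so the difference sits close to the measurement noise. The same check on the very first run (119.730 ± 35.438 vs. 244.662 ± 99.050) also reports overlapping intervals, which fits the suspicion that the apparent improvement over 5.3 was an artifact of a loaded machine.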

@ssteinhauser
Contributor Author

Hi @manticore-projects,
thanks for your tests and feedback as well.
Sure, we can add a feature flag. Which kind of flag are you thinking of? One flag for IMPORT/EXPORT, or something else? Should it default to false or true?

@manticore-projects
Contributor

Good Morning Stefan,

the decision is all yours because you understand your implementation best.
My take is:

  • you added lots of Exasol-specific grammar (kudos!)
  • this grammar slows down the parser somewhat (expected and justified)
  • this grammar does not serve most of the users (unless they want Exasol)

The best way would be to have RDBMS-specific parsers, and I will look for ways to achieve that soon.
In the meantime, I wonder if we can introduce features like 'DIALECT_EXASOL' or 'RDBMS_EXASOL' and then wrap all or most of your extensions in semantic lookaheads like:

LOOKAHEAD( { "DIALECT_EXASOL".equals(Features.get("dialect")) } ) production = ExasolProduction()

(This is an abstract illustration only, of course.)

My hope and goal is to reduce the performance penalty for common RDBMS and use cases to a minimum,
and to have a template for similarly exotic dialects like Apache Hive etc.
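
The gating idea can be sketched in plain Java, outside of JavaCC. Everything below (`DialectGateSketch`, `Dialect`, `FeatureKey`, `parseFromItem`) is a hypothetical stand-in and not JSqlParser's actual API. It only mimics what a semantic lookahead does at a choice point: the Exasol branch is considered only when the dialect feature is set, otherwise the parser goes straight to the common alternatives.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a generated parser with a feature map.
final class DialectGateSketch {
    enum Dialect { ANSI, EXASOL }   // assumed names, for illustration only
    enum FeatureKey { DIALECT }     // assumed names, for illustration only

    private final Map<FeatureKey, Object> features = new EnumMap<>(FeatureKey.class);

    DialectGateSketch withDialect(Dialect dialect) {
        features.put(FeatureKey.DIALECT, dialect);
        return this;
    }

    /** Returns the name of the production that would handle the given FROM item. */
    String parseFromItem(String input) {
        String upper = input.trim().toUpperCase(Locale.ROOT);
        // Equivalent of the semantic lookahead: a cheap flag check comes first,
        // so the Exasol alternative costs almost nothing when it is disabled.
        boolean exasol = features.get(FeatureKey.DIALECT) == Dialect.EXASOL;
        if (exasol && upper.startsWith("(IMPORT ")) {
            return "SubImport";
        }
        if (upper.startsWith("(SELECT ")) {
            return "ParenthesedSelect";
        }
        if (upper.startsWith("(")) {
            throw new IllegalArgumentException("Unexpected token after '(' in: " + input);
        }
        return "Table";
    }

    public static void main(String[] args) {
        // Illustrative statement fragment only, not checked against Exasol's grammar.
        String sql = "(IMPORT FROM CSV AT 'http://host' FILE 'data.csv')";
        System.out.println(new DialectGateSketch().withDialect(Dialect.EXASOL).parseFromItem(sql));
        try {
            new DialectGateSketch().parseFromItem(sql);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected without the Exasol dialect");
        }
    }
}
```

The design choice here is that the default (no dialect set) keeps today's behaviour for common SQL, and dialect-specific syntax is opt-in.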

@manticore-projects
Contributor

manticore-projects commented Jun 6, 2025

Example: line 4663/4664

 LOOKAHEAD(2) fromItem=SubImport() { fromItem = new ParenthesedFromItem(fromItem); }     

could be changed into:

 LOOKAHEAD(2,  { "DIALECT_EXASOL".equals(Features.get("dialect")) } ) fromItem=SubImport() { fromItem = new ParenthesedFromItem(fromItem); }     

@manticore-projects
Contributor

You have introduced the new tokens RTRIM and LTRIM; please double-check that they won't collide with parsing the normal functions of the same name.

Besides that, we are good to merge!
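
To illustrate the kind of collision meant above, here is a minimal Java sketch. It is not JSqlParser code, and all names in it are hypothetical. It models a lexer that turns certain words into dedicated keyword tokens, and a function-call rule that accepts a name only if it is a plain identifier or a keyword explicitly allowed as an identifier.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical model of a keyword/identifier collision, for illustration only.
final class KeywordCollisionSketch {

    // Words the (imaginary) lexer emits as dedicated keyword tokens.
    static final Set<String> KEYWORD_TOKENS = Set.of("IMPORT", "EXPORT", "LTRIM", "RTRIM");

    /**
     * Accepts "NAME(...)" when NAME is an ordinary identifier, or a keyword
     * that is explicitly listed as still usable as an identifier.
     */
    static boolean acceptsFunctionCall(String expr, Set<String> allowedAsIdentifier) {
        String trimmed = expr.trim();
        int paren = trimmed.indexOf('(');
        if (paren <= 0 || !trimmed.endsWith(")")) {
            return false;
        }
        String name = trimmed.substring(0, paren).trim().toUpperCase(Locale.ROOT);
        if (name.isEmpty()) {
            return false;
        }
        return !KEYWORD_TOKENS.contains(name) || allowedAsIdentifier.contains(name);
    }

    public static void main(String[] args) {
        // Without an allow-list, the new token shadows the ordinary function name.
        System.out.println(acceptsFunctionCall("RTRIM(col)", Set.of()));
        // Allowing the keyword as an identifier restores the function call.
        System.out.println(acceptsFunctionCall("RTRIM(col)", Set.of("LTRIM", "RTRIM")));
    }
}
```

In a real grammar, the corresponding check would be a parser test on statements such as `SELECT RTRIM(col) FROM t`, run with and without the Exasol extensions enabled.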

@manticore-projects
Contributor

Greetings!

I have already added a new Feature for the dialect, which you can set to Dialect.EXASOL.
This feature can now be used in your lookaheads.
