feat: overhaul SQL string reformatting and Producer/Consumer interfaces (#128)

This PR makes two somewhat independent changes that make writing test cases simpler and more robust: an overhaul of the `Producer` and `Consumer` interfaces and a change in how the format arguments of named tables and local files in SQL strings are specified. The two changes have been tied together into one PR because changing the arguments alone proved to be difficult due to the previous brittleness of the two interfaces.

The change related to the format arguments consists of the following. Previously, each test case specified a list of `local_files`, each of which was either loaded into a table whose name was derived from the corresponding file name *or* processed as is if the SQL string contained the magic words `read_parquet`. Now, local files and named tables are specified independently of each other, and both are specified as a dict: the key of each entry corresponds to the placeholder used in the format string (such as `{customer}`) and the value consists of the local file path (such as `customer_small.parquet`). For named tables, the idea is that the corresponding system loads the local file into a table with the given name; local files are processed directly.

Since the definition of test cases is used to create parametrized test fixtures, this change involves all test functions that use these parametrized fixtures. As another consequence of this change, some plan snapshots change: some table names no longer have the `_small` suffix because the table name is specified explicitly rather than being derived from the file name, and in one case the order of the input tables in the `FROM` clause has changed (the new order corresponds to the one in the official TPC-H query whereas the previous order didn't).

The change related to the `Producer` and `Consumer` interfaces simplifies how consumers are created and used.
First, both interfaces now have a `setup` method, implemented by the interface itself, which takes care of expanding the relative file paths into absolute ones. This removes the need to do that expansion in various other places. Similarly, `Producer.format_sql` takes care of replacing format arguments so that derived classes don't have to. In the same spirit, `Producer.produce_substrait` also takes care of formatting the SQL query, so call sites can call that function directly instead of having to remember to reformat the SQL string beforehand.

The PR also replaces some direct usages of the DuckDB connection with higher-level usages of `DuckDBConsumer`, so that the encapsulated functionality described above can be used. To that end, that class also gets a new method `run_sql_query`. Finally, the PR removes some duplicate or unused code related to loading local files and formatting queries.

I have manually checked and there are no tests that change their fail/pass status compared to the current `main`.

Signed-off-by: Ingo Müller <[email protected]>
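
The two changes described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only the names `Producer`, `setup`, `format_sql`, and `produce_substrait` and the dict-based `named_tables`/`local_files` arguments come from the commit message. The constructor, the `data_dir` attribute, the `_expand` and `_produce` helpers, the `EchoProducer` class, and the exact substitution rules (table name for named tables, quoted absolute path for local files) are assumptions made for the example.

```python
from pathlib import Path


class Producer:
    """Hypothetical base class; method names follow the commit message, bodies are assumed."""

    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)  # assumed location of the test data files
        self.named_tables = {}
        self.local_files = {}

    def _expand(self, files):
        # Turn each relative file path into an absolute one (hypothetical helper).
        return {key: str((self.data_dir / path).resolve()) for key, path in files.items()}

    def setup(self, named_tables, local_files):
        # Done once in the base class so derived classes and call sites need not repeat it.
        self.named_tables = self._expand(named_tables)
        self.local_files = self._expand(local_files)

    def format_sql(self, sql_query):
        # Assumption: a named-table placeholder becomes the table name itself,
        # a local-file placeholder becomes the quoted absolute file path.
        args = {name: name for name in self.named_tables}
        args.update({key: f"'{path}'" for key, path in self.local_files.items()})
        return sql_query.format(**args)

    def produce_substrait(self, sql_query):
        # Call sites pass the raw query; formatting happens here.
        return self._produce(self.format_sql(sql_query))

    def _produce(self, formatted_sql):
        raise NotImplementedError  # system-specific plan generation would go here


class EchoProducer(Producer):
    """Stand-in that returns the formatted SQL instead of a Substrait plan."""

    def _produce(self, formatted_sql):
        return formatted_sql


producer = EchoProducer(data_dir=Path.cwd())
producer.setup(
    named_tables={"customer": "customer_small.parquet"},
    local_files={"orders_file": "orders_small.parquet"},
)
plan = producer.produce_substrait(
    "SELECT count(*) FROM {customer}, read_parquet({orders_file})"
)
print(plan)
```

With this shape, the table name in the query (`customer`) comes from the dict key rather than from the file name, which is consistent with the snapshot changes mentioned above where the `_small` suffix disappears.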
1 parent fda68c9
commit d4f2aa7
Showing 186 changed files with 1,622 additions and 1,176 deletions.