feat: Add TPC-H queries #140

Merged 3 commits on Apr 21, 2025

@clflushopt (Owner) commented Apr 21, 2025

Add support for the TPC-H queries.

@kevinjqliu (Collaborator) left a comment

These differ slightly from the DuckDB TPC-H queries.

For example:
Q1 filters on a different date.
Q2 is parameterized here but not in DuckDB. DuckDB also has a LIMIT clause.
Q3 is also parameterized. The DuckDB query has a different date and a LIMIT clause.

I see datafusion-dft added the DuckDB queries verbatim:
https://github.com/datafusion-contrib/datafusion-dft/pull/322/files#diff-5ce4e8b341f384f97ab3a8b517c11ce98ff4377d006913c28cfb1976256c97a7R18
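
To illustrate the parameterized-versus-literal difference raised in this comment, here is a minimal sketch of how a query template becomes a concrete query. It assumes the `:1`-style placeholders used by dbgen query templates and the Q1 validation value of 90 days from the TPC-H specification; `Q1_FILTER_TEMPLATE` and `substitute` are hypothetical names for illustration, not part of this PR.

```python
from datetime import date, timedelta

# Filter line of Q1 in the style of the dbgen templates
# (the ':1' placeholder syntax is an assumption).
Q1_FILTER_TEMPLATE = "l_shipdate <= date '1998-12-01' - interval ':1' day"


def substitute(template: str, params: list[str]) -> str:
    """Replace :1, :2, ... with the given values (hypothetical helper)."""
    query = template
    # Go from the highest index down so ':1' never clobbers part of ':10'.
    for index in range(len(params), 0, -1):
        query = query.replace(f":{index}", params[index - 1])
    return query


# Assumed validation parameter for Q1: DELTA = 90 days.
concrete_filter = substitute(Q1_FILTER_TEMPLATE, ["90"])

# The same cutoff written as a literal date, the form a query with the
# parameter already inlined would carry.
cutoff = date(1998, 12, 1) - timedelta(days=90)

print(concrete_filter)
print(cutoff.isoformat())
```

A template keeps the parameter visible so callers can vary it, while a verbatim query fixes one value. Which one a crate should expose depends on whether users need to reproduce a specific engine's results or generate spec-style query streams.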

@clflushopt (Owner, Author) commented

I used the queries from this repository https://github.com/electrum/tpch-dbgen/tree/master/queries

@kevinjqliu (Collaborator) commented

> I used the queries from this repository https://github.com/electrum/tpch-dbgen/tree/master/queries

Yep, looks like those are the correct queries. I cross-checked them against section 2.4, Query Definitions, of the spec: https://www.tpc.org/TPC_Documents_Current_Versions/pdf/TPC-H_v3.0.1.pdf

@kevinjqliu (Collaborator) left a comment

lgtm!

@clflushopt (Owner, Author) commented

In the future, if the need arises, we can expose the DuckDB-compatible ones, maybe in a nested module.

@clflushopt clflushopt merged commit 744030a into main Apr 21, 2025
7 checks passed
@alamb alamb deleted the cl/feat/add-tpch-queries branch April 28, 2025 20:30
@alamb (Collaborator) commented Apr 28, 2025

This is sweet. Maybe it is time to start preparing for a 1.1 release 🤔
