Skip to content

add python CLI bindings #121

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Apr 19, 2025
Merged

Conversation

kevinjqliu
Copy link
Collaborator

@kevinjqliu kevinjqliu commented Apr 14, 2025

Closes #113

This PR creates the tpchgen-cli pypi package using maturin and exposes the rust binary as python executable.
It also adds the .github/workflows/tpchgen-cli-publish-pypi.yml Github Action to publish to pypi on new release.

This PR allows tpchgen-cli to be installed via pip and used like so:

pip install tpchgen-cli
tpchgen-cli --help

.github/workflows/tpchgen-cli-publish-pypi.yml is generated using maturin generate-ci github and then edited.
The github action is tested using my fork and published the package to testpypi.
Try it out:

pip install -i https://test.pypi.org/simple/ tpchgen-cli
tpchgen-cli --help

I added both kevinjqliu/tpchgen-rs and clflushopt/tpchgen-rs as trusted publishers in testpypi
Screenshot 2025-04-17 at 6 27 51 PM

I think we should merge this PR as is, pointing to testpypi. Run the github action on main, verify that it can publish to testpypi, and then edit the github action to publish to pypi.

Dev Env

Setup

cd tpchgen-cli
python -m venv .venv                                   
source .venv/bin/activate
pip install -U pip maturin

Develop

maturin develop --release

Run

tpchgen-cli --help
tpchgen-cli

@kevinjqliu kevinjqliu changed the title [wip] add python CLI bindings add python CLI bindings Apr 14, 2025
@kevinjqliu kevinjqliu marked this pull request as ready for review April 14, 2025 20:57
@kevinjqliu kevinjqliu force-pushed the kevinjqliu/use-python-bin branch from 9d78d54 to f7f3703 Compare April 15, 2025 15:51
@alamb
Copy link
Collaborator

alamb commented Apr 17, 2025

That is pretty sweet.

Maybe we can get a homebrew recipe too ?

@kevinjqliu
Copy link
Collaborator Author

Thanks @alamb I created #126 to track homebrew installation.

Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank yoU @kevinjqliu -- do you have any idea how to test this? I feel if we merged this PR the CI would likely not work on our next release but I don't know that for sure

@kevinjqliu kevinjqliu force-pushed the kevinjqliu/use-python-bin branch from 1242c80 to 2d395ca Compare April 18, 2025 00:45
@clflushopt
Copy link
Owner

clflushopt commented Apr 18, 2025

This is great @kevinjqliu - thank you ! Btw, do you expect a need for having bindings from the core crate so it's usable as Python library ?

Edit: I'll review and test after the long weekend, I will also setup the publishing workflow via PyPi.

@kevinjqliu kevinjqliu force-pushed the kevinjqliu/use-python-bin branch from e980ba1 to e57865e Compare April 18, 2025 00:53
@kevinjqliu
Copy link
Collaborator Author

Btw, do you expect a need for having bindings from the core crate so it's usable as Python library ?

I dont see a need right now. Personally, I just wanted to pip install it. In the future, we can create python bindings for the core crate and release it under the tpchgen pypi package.

I'll review and test after the long weekend, I will also setup the publishing workflow via PyPi.

Sounds good! No rush.

Do you have a testpypi account? I can add you to the project in testpypi

Copy link
Collaborator Author

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the github action run using my fork and published the package to testpypi.

I added both kevinjqliu/tpchgen-rs and clflushopt/tpchgen-rs as trusted publishers in testpypi

I think we should merge this PR as is, pointing to testpypi. Run the github action on main, verify that it can publish to testpypi, and then edit the github action to publish to pypi.

@kevinjqliu kevinjqliu force-pushed the kevinjqliu/use-python-bin branch 2 times, most recently from 7495b5c to 9643947 Compare April 18, 2025 02:49
Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinjqliu I think this is a great idea and will expand the reach of tpchgen a lot.

Unfortuantely I don't have time to help manage / test this, but I think we should just give you the permissions to do it yourself:

@alamb alamb merged commit 6ee1df1 into clflushopt:main Apr 19, 2025
14 checks passed
@kevinjqliu kevinjqliu deleted the kevinjqliu/use-python-bin branch April 19, 2025 15:47
messense pushed a commit to PyO3/maturin that referenced this pull request Apr 29, 2025
[`tpchgen`](https://github.com/clflushopt/tpchgen-rs) is a TPC-H data
generator written in rust. Here's more background on its development,
https://datafusion.apache.org/blog/2025/04/10/fastest-tpch-generator/

`maturin` was used to distribute the rust binary as python script, using
[its `bin` bindings feature](https://www.maturin.rs/bindings#bin), with
minimal code change (See
clflushopt/tpchgen-rs#121)

The rust binary cli can now be pip installed,
```
pip install tpchgen-cli
tpchgen-cli -h
```

Took me a while to figure out that `maturin` had first-class support for
something like this. My first attempt was to export the cli functions
manually (See clflushopt/tpchgen-rs#119).
Because of this, I think this use case is a great example to add.

(Following #2529 as example PR)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] add to python ecosystem
3 participants