GH-45522: [Parquet][C++] Parquet GEOMETRY and GEOGRAPHY logical type implementations #45459
Co-authored-by: Gang Wu <[email protected]>
One separate thing that would be nice to do as a follow-up is adding a fuzzing test for the new parsers.
@emkornfield @pitrou @wgtmac Thank you for your reviews! I think I managed to get the spirit of all of your comments throughout the diff, although I am sure I missed something and obviously feel free to let me know what it was! I think the outstanding issues are:
(Or feel free to add something to this list!)
Rationale for this change
The GEOMETRY and GEOGRAPHY logical types are being proposed as an addition to the Parquet format.
What changes are included in this PR?
This is a continuation of @Kontinuation's initial PR (#43977) implementing apache/parquet-format#240, which included:
Changes after this were:
In order to write test files, I also:
Those last two are probably a bit much for this particular PR, and I'm happy to move them.
Some things that aren't in this PR (but should be in this one or a future PR):
max > min
(and generally make sure the stats for geography are written for trivial cases)
Are these changes tested?
Yes!
Are there any user-facing changes?
Yes!
Example from the included Python bindings: