How to materialise injected partition projections? #454
I don't fully understand the last part of your issue. In order to use partition projection we need to set up table properties, e.g.:
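For context, a minimal sketch of the kind of table properties partition projection relies on (the database, table, column, and bucket names here are hypothetical, not taken from the thread):

```sql
-- Hypothetical Athena table configured for injected partition projection.
-- Names are made up for illustration only.
ALTER TABLE my_db.my_source_table SET TBLPROPERTIES (
  'projection.enabled'          = 'true',
  'projection.customer_id.type' = 'injected',
  'storage.location.template'   = 's3://my-bucket/my-prefix/${customer_id}/'
);
```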
this means that we could extend the adapter to set up those properties, e.g. as we do for Iceberg. I think that we support something like that -> https://github.com/dbt-athena/dbt-athena/blob/main/dbt/include/athena/macros/materializations/models/table/create_table_as.sql#L13C48-L13C64 at least for full loads. I believe that we need to add something for incremental loads, as it seems totally missing: https://github.com/dbt-athena/dbt-athena/blob/main/dbt/include/athena/macros/materializations/models/incremental/incremental.sql
@nicor88, the problem is with the source table having an injected partition column. The range of values for an injected column lives in the S3 prefixes for that column. Similarly to the "100 partitions fix" or "insert_by_period", where materialisation happens in batches, it could be done for partition projection: S3 prefixes are enumerated and injected into the query, replacing some INJECTED_FILTER placeholder with another batch of values. Is there any better way you could think of to materialize from an injected column without implementing the above or giving up on injected partitions?
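A rough sketch of what that batched approach could look like in a model (not from the thread; the var name, source, and column are assumptions, and the partition values are supplied externally since the projected table itself cannot be scanned without a filter):

```sql
-- Sketch of batching over externally supplied injected-partition values.
-- 'injected_partition_values', the source, and 'customer_id' are hypothetical.
{%- set all_values = var('injected_partition_values', []) -%}

{% for value_batch in all_values | batch(100) %}
select *
from {{ source('raw', 'projected_table') }}
-- the INJECTED_FILTER placeholder, resolved for this batch of values
where customer_id in (
    {%- for v in value_batch %}'{{ v }}'{{ "," if not loop.last }}{% endfor -%}
)
{{ "union all" if not loop.last }}
{% endfor %}
```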
AH! I think I got it now, but correct me if I'm wrong: you are not able to query the source table in its entirety because you use partition projection, therefore you need specific where statements. Did you consider using a custom macro that first spits out the partition values and then assembles a query based on those values?
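Such a macro could look roughly like the sketch below (not an existing dbt-athena macro; it assumes the partition values can be listed from somewhere queryable, here a hypothetical lookup model, since the projected table itself needs a filter):

```sql
-- Hypothetical macro that collects injected-partition values and renders a filter.
-- ref('partition_values_lookup') and the column name are assumptions.
{% macro get_injected_partition_filter(column_name) %}
    {% set values_query %}
        select distinct {{ column_name }} from {{ ref('partition_values_lookup') }}
    {% endset %}
    {% if execute %}
        {% set values = run_query(values_query).columns[0].values() %}
        {{ column_name }} in ({% for v in values %}'{{ v }}'{{ "," if not loop.last }}{% endfor %})
    {% else %}
        1 = 1
    {% endif %}
{% endmacro %}
```

A model could then filter with something like `where {{ get_injected_partition_filter('customer_id') }}`.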
Hi, I am trying to use dbt over Hive Athena tables with various types of partition projections, particularly for injected partitions. The query requires at least one exact value for that partition in the WHERE clause. I understand there could be some incremental materialisation that splits the available values into chunks and generates multiple insert statements, similar to how the 100-partition problem was solved. Does anything like that exist? Is there any other way?