Lake Formation Table Creation misnames table causing EntityNotFoundException during wr.s3.to_parquet() #579

Closed
@ghost

Description

Describe the bug

import awswrangler as wr
import pandas as pd


def write_table(name):
    wr.s3.to_parquet(
        df=pd.DataFrame({
            'col': [1, 2, 3],
            'col2': ['A', 'A', 'B'],
        }),
        path="s3://lf-hubble-preview/gov_legislators/" + name,
        dataset=True,
        mode='overwrite',
        database='gov_legislators',
        table=name,
        table_type='GOVERNED'
    )

write_table("table_with_isolated_2_number") # Succeeds
write_table("table_with_combined_2NUM_number") # Fails

To Reproduce
Run the code above. The table whose name contains a numeral immediately followed by uppercase letters (2NUM) fails with the following stack trace:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/fliver/Library/Application Support/JetBrains/IdeaIC2020.3/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/Users/fliver/Library/Application Support/JetBrains/IdeaIC2020.3/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/fliver/workplace/PandasLakeformation/perf/main.py", line 86, in <module>
    write_table("table_with_combined_2NUM_number") # Fails
  File "/Users/fliver/workplace/PandasLakeformation/perf/main.py", line 72, in write_table
    wr.s3.to_parquet(
  File "/Volumes/workplace/PandasLakeformation/.env/lib/python3.9/site-packages/awswrangler/_config.py", line 418, in wrapper
    return function(**args)
  File "/Volumes/workplace/PandasLakeformation/.env/lib/python3.9/site-packages/awswrangler/s3/_write_parquet.py", line 611, in to_parquet
    paths, partitions_values = _to_dataset(
  File "/Volumes/workplace/PandasLakeformation/.env/lib/python3.9/site-packages/awswrangler/s3/_write_dataset.py", line 202, in _to_dataset
    del_objects: List[Dict[str, Any]] = _get_table_objects(
  File "/Volumes/workplace/PandasLakeformation/.env/lib/python3.9/site-packages/awswrangler/lakeformation/_utils.py", line 83, in _get_table_objects
    response = client_lakeformation.get_table_objects(**scan_kwargs)
  File "/Volumes/workplace/PandasLakeformation/.env/lib/python3.9/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Volumes/workplace/PandasLakeformation/.env/lib/python3.9/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.EntityNotFoundException: An error occurred (EntityNotFoundException) when calling the GetTableObjects operation: Table table_with_combined_2num_number not found.

Running wr.catalog.tables(database="gov_legislators") shows that the second table was created with an extra underscore, as 'table_with_combined_2_num_number'. The write then fails partway through creation: GetTableObjects is called with a differently normalized name, 'table_with_combined_2num_number', and no table with that name exists.
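
The two names seen above are consistent with the table name being normalized in two different ways. The sketch below is only an illustration of that hypothesis, not awswrangler's actual code: `camel_to_snake` and `lower_only` are hypothetical helpers, assuming the catalog step applies a camelCase-to-snake_case conversion while the GetTableObjects step only lowercases the name.

```python
import re


def camel_to_snake(name: str) -> str:
    """Hypothetical sanitizer: insert '_' at lower/digit-to-upper boundaries, then lowercase."""
    name = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.lower()


def lower_only(name: str) -> str:
    """Hypothetical alternative normalization: lowercase only."""
    return name.lower()


for name in ("table_with_isolated_2_number", "table_with_combined_2NUM_number"):
    created = camel_to_snake(name)   # name the table would be created under
    looked_up = lower_only(name)     # name the later lookup would use
    status = "match" if created == looked_up else "MISMATCH"
    print(f"{name}: created={created} looked_up={looked_up} -> {status}")
```

Under these assumptions the first name is unchanged by both helpers, while the second diverges at the `2N` boundary, reproducing the created name and the name in the error message.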

Labels

bug (Something isn't working), minor release (Will be addressed in the next minor release)
