Ees 6048 hotfix revert duckdb dependency version #5800

Merged

duncan-at-hiveit
Collaborator

This PR:

  • fixes the import of CSVs with quote-escaped cells (e.g. "Kingston upon Hull, City of") by reverting to the previously-used version of DuckDB.

Without further explicit configuration, the upgrade to the new version of DuckDB results in the following error when attempting to read from such a CSV:

Message: Invalid Input Error: CSV Error on Line: 34226
Original Line: 2024,Week 37,Monday,Local...authority,E92000001,England,E12000003,Yorkshire and The Humber,E06000010,"Kingston upon Hull, City of",810,2024-09-09,Primary,Attendance,Present,All present,37020,x,x
Expected Number of Columns:...

This is likely due to a change in DuckDB's default CSV behaviour in version 1.2.0 (we previously used 1.0.2, and the upgrade moved us to 1.2.1). From the DuckDB v1.2.0 README:

By default, DuckDB now parses CSVs in so-called strict mode (strict_mode = true)
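
As a rough illustration of the failure mode, the sketch below uses Python's standard csv module (not DuckDB) on a shortened version of the failing row from the error above. It only shows how the column count changes depending on whether the double quote is treated as a quote character; it is an assumption that this mirrors what DuckDB's parser is doing here, and the helper name split_row is hypothetical.

```python
import csv
import io

# Shortened version of the failing row quoted in the error above.
SAMPLE_ROW = 'E06000010,"Kingston upon Hull, City of",810\n'


def split_row(line: str, honour_quotes: bool) -> list:
    """Split one CSV line, optionally treating '"' as a quote character."""
    quoting = csv.QUOTE_MINIMAL if honour_quotes else csv.QUOTE_NONE
    reader = csv.reader(io.StringIO(line), quotechar='"', quoting=quoting)
    return next(reader)


# With quotes honoured, the comma inside the cell does not split the field.
quoted = split_row(SAMPLE_ROW, honour_quotes=True)
# With quotes ignored, the same comma produces an extra column.
unquoted = split_row(SAMPLE_ROW, honour_quotes=False)

print(len(quoted), quoted)
print(len(unquoted), unquoted)
```

A reader expecting the first column count but seeing the second would report a mismatch in the number of columns for that line.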

This hotfix will likely be a temporary stopgap. The long-term fix will be to:

  • Upgrade to the latest DuckDB version.
  • Ensure that any reading of CSVs is done with either QUOTE set to " or strict_mode set to false.
  • Add tests to handle CSVs with quote-escaped cells in both the initial import of a data set and a "next" import as well.
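
A minimal sketch of what the second bullet above could look like as a query. It only builds the SQL text, using DuckDB's read_csv function with its quote and strict_mode options; the helper name build_read_csv_sql and the file name data.csv are hypothetical, and this is not the project's actual import code.

```python
def build_read_csv_sql(path: str, relax_strict_mode: bool = False) -> str:
    """Build a DuckDB read_csv query that sets the quote character explicitly.

    Hypothetical helper for illustration only.
    """
    # Escape single quotes so the path is a valid SQL string literal.
    escaped_path = path.replace("'", "''")
    # Always state the quote character rather than relying on the default.
    options = ["quote = '\"'"]
    if relax_strict_mode:
        # Optionally opt out of the strict parsing that is the default since v1.2.0.
        options.append("strict_mode = false")
    return f"SELECT * FROM read_csv('{escaped_path}', {', '.join(options)})"


print(build_read_csv_sql("data.csv"))
print(build_read_csv_sql("data.csv", relax_strict_mode=True))
```

The resulting string would be passed to whichever DuckDB client the import code uses. Setting the options explicitly at every CSV read keeps the behaviour independent of the defaults of the installed DuckDB version.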

…porarily fix the import of CSVs with quote-escaped cells
@duncan-at-hiveit duncan-at-hiveit changed the base branch from dev to master April 16, 2025 08:08
@duncan-at-hiveit duncan-at-hiveit merged commit dbc9e58 into master Apr 16, 2025
11 checks passed
@duncan-at-hiveit duncan-at-hiveit deleted the EES-6048-hotfix-revert-duckdb-dependency-version branch April 16, 2025 08:30
2 participants