This is a DuckDB extension that adds support for reading files from within zip archives.
Load from the community extensions repository:
INSTALL zipfs FROM community;
LOAD zipfs;
To read a file:
SELECT * FROM 'zip://examples/a.zip/a.csv';
To read a file from Azure Blob Storage (or another file system):
SELECT * FROM 'zip://az://yourstorageaccount.blob.core.windows.net/yourcontainer/examples/a.zip/a.csv';
File names passed into the zip:// URL scheme are expected to end with .zip, which indicates the end of the zip file name. The path after that is taken to be the file path within the zip archive.
Globbing within the zip archive is supported, but see below for performance limitations. A glob query looks like:
SELECT * FROM 'zip://examples/a.zip/*.csv';
Globbing for multiple zip files:
SELECT * FROM 'zip://examples/*.zip/*.csv';
You may set the zipfs_split option to turn off splitting on the .zip extension and instead choose a string to split on:
SET zipfs_split = "!!";
SELECT * FROM 'zip://examples/a.zip!!b.csv';
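The two splitting rules described above can be illustrated with a small Python sketch. This is not the extension's implementation: the helper name split_zip_url is hypothetical, and splitting at the first occurrence of .zip (or of the separator) is an assumption made for illustration.

```python
from typing import Optional, Tuple


def split_zip_url(url: str, separator: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: split a zip:// URL into (archive path, inner path)."""
    prefix = "zip://"
    if not url.startswith(prefix):
        raise ValueError("expected a zip:// URL")
    rest = url[len(prefix):]

    if separator is None:
        # Default rule: the archive name ends with ".zip".
        # Assumption: the first ".zip" in the path marks the end of the archive name.
        marker = ".zip"
        idx = rest.find(marker)
        if idx < 0:
            raise ValueError("archive name must end with .zip")
        archive = rest[: idx + len(marker)]
        inner = rest[idx + len(marker):].lstrip("/")
    else:
        # Custom rule: split on a user-chosen string (compare zipfs_split).
        idx = rest.find(separator)
        if idx < 0:
            raise ValueError("separator not found in URL")
        archive = rest[:idx]
        inner = rest[idx + len(separator):]
    return archive, inner


print(split_zip_url("zip://examples/a.zip/a.csv"))
print(split_zip_url("zip://examples/a.zip!!b.csv", separator="!!"))
```

A custom separator is mainly useful when the archive file does not carry a .zip extension, since the default rule has no other way to tell where the archive name ends.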
This extension is intended more for convenience than high performance. Unlike tarfs (on which this extension is based), it does not implement a file metadata cache. As a result, operations that require the central directory (index) of the zip file, such as globbing, must reread the central directory multiple times: once for the glob and once for each file opened.
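To make the cost concrete, here is a rough Python sketch of the same access pattern using the standard zipfile module, which parses the central directory each time a ZipFile is opened for reading. It only models the behavior described above and is not how the extension is written; build_archive and read_glob_uncached are hypothetical names.

```python
import fnmatch
import io
import zipfile


def build_archive() -> bytes:
    """Build a small in-memory zip archive to experiment with."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.csv", "x\n1\n")
        zf.writestr("b.csv", "x\n2\n")
        zf.writestr("notes.txt", "ignored\n")
    return buf.getvalue()


def read_glob_uncached(data: bytes, pattern: str):
    """Glob inside an archive without caching its central directory."""
    directory_reads = 0

    # Pass 1: open the archive (parsing the central directory) to expand the glob.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        directory_reads += 1
        matches = [n for n in zf.namelist() if fnmatch.fnmatchcase(n, pattern)]

    # Pass 2: reopen the archive for every matched file, parsing the
    # central directory again each time because nothing was cached.
    contents = {}
    for name in matches:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            directory_reads += 1
            contents[name] = zf.read(name).decode()
    return contents, directory_reads


contents, reads = read_glob_uncached(build_archive(), "*.csv")
print(sorted(contents), reads)
```

With two matching files the directory is parsed three times, so the overhead grows with the number of files a glob matches. For archives with many entries, extracting the archive first may be the faster route.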
First, install vcpkg into a directory named vcpkg:
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
export VCPKG_TOOLCHAIN_PATH=`pwd`/vcpkg/scripts/buildsystems/vcpkg.cmake
Then:
make -j 4 release
make test_release
duckdb-zipfs Copyright 2025 Isaac Brodsky. Licensed under the MIT License.
DuckDB Copyright 2018-2022 Stichting DuckDB Foundation (MIT License)
miniz Copyright 2013-2014 RAD Game Tools and Valve Software Copyright 2010-2014 Rich Geldreich and Tenacious Software LLC (MIT License)
DuckDB extension-template Copyright 2018-2022 DuckDB Labs BV (MIT License)
duckdb_tarfs (MIT license)