Allow parallel encoding of multiple columns of single parquet file. #507

Answered by adamreeve
avakaluk asked this question in Q&A

Hi, that discussion is for the Rust Parquet library, but ParquetSharp uses the C++ library internally. The Parquet C++ library, and therefore ParquetSharp, does support multi-threaded writing, although it's not something I've experimented with much. If you're writing Arrow data, you can enable multi-threaded writing in the ArrowWriterProperties.

Otherwise you'll need to write in buffered mode, using AppendBufferedRowGroup. You can then get column writers with the Column(int) method and write to them concurrently.
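
As a rough illustration of the buffered approach, here is a minimal, untested sketch. The file name, column names, and data are made up for the example, and it assumes that two column writers obtained from the same buffered row group can safely be written to from different threads, which is what the answer describes but is worth verifying against your ParquetSharp version.

```csharp
using System.Linq;
using System.Threading.Tasks;
using ParquetSharp;

class Program
{
    static void Main()
    {
        // Hypothetical two-column schema, used only for illustration.
        var columns = new Column[]
        {
            new Column<int>("id"),
            new Column<double>("value"),
        };

        const int numRows = 1_000;
        int[] ids = Enumerable.Range(0, numRows).ToArray();
        double[] values = ids.Select(i => i * 0.5).ToArray();

        using var file = new ParquetFileWriter("example.parquet", columns);

        // Buffered mode lets columns be accessed by index in any order,
        // instead of strictly one after another with NextColumn().
        using (var rowGroup = file.AppendBufferedRowGroup())
        {
            // Obtain both writers on the calling thread before any
            // parallel work starts.
            using var idWriter = rowGroup.Column(0).LogicalWriter<int>();
            using var valueWriter = rowGroup.Column(1).LogicalWriter<double>();

            // Assumption: each writer is used by exactly one thread, and
            // distinct writers may be used concurrently.
            Parallel.Invoke(
                () => idWriter.WriteBatch(ids),
                () => valueWriter.WriteBatch(values));
        }

        file.Close();
    }
}
```

Each column must still receive the same number of rows before the row group is closed. Whether this is actually faster than sequential writing depends on the column types and encodings, so it is worth benchmarking on real data.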

Replies: 1 comment 4 replies

Answer selected by avakaluk