Releases: koralium/flowtide

Version 0.14.0 alpha 4

12 May 12:53
58f08e1
Pre-release

What's Changed

  • Add IFileCacheFactory to allow different implementations of the cache by @Ulimo in #807
  • Add support for timestamp_add by @Ulimo in #813
  • Sqlstorage retry by @bpfz in #812
  • Add possibility to use custom destination table for bulk copy by @Ulimo in #814
  • Bug fix: Fix issue where statements were not downcast to IDataValue for case when by @Ulimo in #816

Full Changelog: v0.14.0-alpha3...v0.14.0-alpha4

Version 0.14.0 alpha 3

29 Apr 14:03
038cb83
Pre-release

What's Changed

  • Fix so csv data source registers delta load after restart by @Ulimo in #801
  • Fix bug in generic read operator, wrong column count by @Ulimo in #802
  • Fix so generic data source returns typed information by @Ulimo in #803
  • Fix xml file column count when using extra columns by @Ulimo in #804
  • Move multiply to column based execution by @Ulimo in #805
  • Fix so window operator save pages on updates by @Ulimo in #806
  • Improve substring and trim performance by @Ulimo in #808
  • Add exchange operator with column support by @Ulimo in #653
  • Add better exceptions for dotnet type converter by @Ulimo in #809

Full Changelog: v0.14.0-alpha2...v0.14.0-alpha3

Version 0.14.0 alpha 2

23 Apr 14:46
1bffcc1
Pre-release

What's Changed

  • Add support for window functions, including sum and row_number by @Ulimo in #746
  • Bump image-size from 1.2.0 to 1.2.1 in /docs by @dependabot in #744
  • Bump next from 14.2.21 to 14.2.26 in /src/FlowtideDotNet.AspNetCore/ClientApp by @dependabot in #747
  • Fix bug when copying timestamp tz value to value container by @Ulimo in #748
  • Bug fix, fix native buffer writer ensure size by @Ulimo in #749
  • Add support for lead window function by @Ulimo in #757
  • Fix so window functions are run before any where statement by @Ulimo in #758
  • Add support for "greatest" scalar function by @Ulimo in #759
  • Add support for agg function surrogate_key_int64 by @Ulimo in #754
  • Add floor_timestamp_day to round down timestamps to day by @Ulimo in #760
  • Add support for check functions to check data quality by @Ulimo in #762
  • Add so rate outputs the initial value to give correct rate at the start of the execution by @Ulimo in #763
  • Bump estree-util-value-to-estree from 3.3.2 to 3.3.3 in /docs by @dependabot in #761
  • Add struct data type support by @Ulimo in #765
  • Add support for empty struct columns by @Ulimo in #766
  • Fix emit in window operator when deleting an entire row by @Ulimo in #767
  • Add support for list sort ascending null last by @Ulimo in #770
  • Add LAG window function by @Ulimo in #772
  • Add function list_first_difference by @Ulimo in #773
  • Fix rate calculation when missing historic data by @Ulimo in #774
  • Add functionality for reading from SQL views and tables without change tracking by @bpfz in #764
  • Add support for string_join function by @Ulimo in #777
  • Added new documentation for the sql server source by @bpfz in #776
  • Change window operator to allow multiple window functions by @Ulimo in #778
  • Fix struct column with null add and insert by @Ulimo in #781
  • Add support for last_value window function by @Ulimo in #780
  • Fix bug in emit optimizer for window function, add all fields as used if emit is not set by @Ulimo in #775
  • Add text files connector by @Ulimo in #782
  • Add timestamp_parse function by @Ulimo in #787
  • Add support for FILTER(WHERE ...) on aggregates by @Ulimo in #788
  • Add support for min_by and max_by aggregate functions by @Ulimo in #790
  • Bug fixes to validity list initialization and min max by type by @Ulimo in #791
  • Fix so window sum work on decimal data type by @Ulimo in #792
  • Fix sql return types and solve some failing tests by @Ulimo in #794
  • Add support for min_by and max_by as window functions by @Ulimo in #795
  • Fix so column project cast to IDataValue if its not returned by @Ulimo in #796
  • Add timestamp extract function by @Ulimo in #798
  • Add timestamp format function and move some functions to column based functions by @Ulimo in #799
  • Add sql partition source by @bpfz in #800

Full Changelog: v0.14.0-alpha1...v0.14.0-alpha2

Version 0.14.0 alpha 1

03 Apr 07:47
582c75c
Pre-release

What's Changed

  • Refactor connection string handling for flexibility by @bpfz in #712
  • Bump prismjs from 1.29.0 to 1.30.0 in /docs by @dependabot in #710
  • [Snyk] Fix for 1 vulnerabilities by @Ulimo in #717
  • Bump axios from 1.7.7 to 1.8.3 in /src/FlowtideDotNet.AspNetCore/ClientApp by @dependabot in #713
  • Bump @babel/runtime-corejs3 from 7.26.0 to 7.26.10 in /docs by @dependabot in #714
  • Bump @babel/runtime from 7.25.6 to 7.26.10 in /docs by @dependabot in #715
  • Bump @babel/helpers from 7.25.6 to 7.26.10 in /docs by @dependabot in #716
  • [Snyk] Upgrade next from 14.2.21 to 14.2.24 by @Ulimo in #697
  • Refactor compute tests to use xunit test framework instead of source generator by @Ulimo in #718
  • [Docs] Upgrade docusaurus and recreate lock file by @Ulimo in #720
  • [Bugfix] Fix insert range from basic column for union column by @Ulimo in #721
  • [Snyk] Upgrade docusaurus-lunr-search from 3.6.0 to 3.6.1 by @Ulimo in #722
  • Refactor code cleanup by @bpfz in #725
  • Add a new test framework to enable easier integration testing by @Ulimo in #724
  • Add string_split function for substring manipulation by @bpfz in #727
  • Add support for list_union_distinct_agg function by @Ulimo in #726
  • Add regexp_string_split function for regex-based splitting by @bpfz in #730
  • Refactor assertion property name in StringFunctionTests by @bpfz in #731
  • Add to_json function to convert an object to a json string by @Ulimo in #732
  • Add from_json function to convert json into flowtide objects by @Ulimo in #733
  • Add support for options data source from configuration by @Ulimo in #734
  • Add versioning support to DataflowStreamBuilder by @bpfz in #735
  • Add sql storage versioning by @bpfz in #737
  • Add lookup row method to generic data source by @Ulimo in #738
  • Add optional backward iteration support on the B+ tree by @Ulimo in #739
  • Add element methods to list column to more easily allow modifying list elements by @Ulimo in #741
  • Add possibility to serialize and deserialize data columns only by @Ulimo in #742
  • Add queue structure to state management for operators by @Ulimo in #743

Full Changelog: v0.13.1...v0.14.0-alpha1

Version 0.13.1

11 Mar 21:40
5873845

What's Changed

  • Bug fix: Fix case where a post join condition could cause an early exit of the loop by @Ulimo in #711

Full Changelog: v0.13.0...v0.13.1

Version 0.13.0

10 Mar 14:55
2fd22b5

Major changes

New serializer to improve serialization speed

A new custom serializer has been implemented that follows the Apache Arrow serialization format while minimizing extra allocations and memory copies.

The default compression method was also changed from ZLib to Zstd, again to improve serialization performance.

Support for pause & resume

A new feature has been added to allow pausing and resuming data streams, making it easier to conduct maintenance or temporarily halt processing without losing state.

For more information, visit https://koralium.github.io/flowtide/docs/deployment/pauseresume.

Integer column changed from 64 bits to dynamic size

The integer column now selects its bit size based on the data it contains, instead of always using 64 bits.
This change reduces memory usage for columns with smaller integer values. Bit size is determined on a per-page basis, so only pages that contain larger values use the larger bit sizes.
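
The per-page selection can be pictured with a small sketch. This is not Flowtide's implementation: the class name `PageBitWidth`, the method `Select`, and the candidate widths of 8, 16, 32 and 64 bits are assumptions made only for illustration.

```csharp
using System;

// Hypothetical helper, not part of Flowtide: returns the narrowest signed
// integer width (in bits) that can hold every value stored on one page.
// The candidate widths 8/16/32/64 are an assumption for this sketch.
public static class PageBitWidth
{
    public static int Select(ReadOnlySpan<long> pageValues)
    {
        int bits = 8; // an empty page needs only the smallest width
        foreach (long v in pageValues)
        {
            int needed =
                v >= sbyte.MinValue && v <= sbyte.MaxValue ? 8 :
                v >= short.MinValue && v <= short.MaxValue ? 16 :
                v >= int.MinValue && v <= int.MaxValue ? 32 :
                64;
            // The page as a whole must use the widest width any value needs.
            bits = Math.Max(bits, needed);
        }
        return bits;
    }
}
```

Under these assumptions, a page holding only small identifiers would be stored with 8 bits per value, while a single value outside the 32-bit range on another page would raise only that page to 64 bits.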

Delta Lake Support

This version adds support for both reading from and writing to the Delta Lake format, which allows easy integration
with data lake storage. To learn more about Delta Lake support, visit: https://koralium.github.io/flowtide/docs/connectors/deltalake

Custom data source & sink changed to use column based events

Both the custom data source and sink have now been changed to use column based events.
This improves connector performance by eliminating the need to convert data between column-based and row-based formats during streaming.

Minor changes

Elasticsearch connector change from Nest to Elastic.Clients.Elasticsearch

The Elasticsearch connector has been updated from the deprecated Nest package to Elastic.Clients.Elasticsearch. This change requires stream configurations to be adjusted for the new connection settings.

Additionally, connection settings are now provided via a function, enabling dynamic credential management, such as rolling passwords.

Add support for custom stream listeners

Applications can now listen to stream events like checkpoints, state changes, and failures, allowing for custom exit strategies or monitoring logic.

Example:

.AddCustomOptions(s =>
{
    s.WithExitProcessOnFailure();
});

Cache lookup table for state clients

An internal optimization adds a small lookup table for state client page access, reducing contention on the global LRU cache. This change has shown a 10–12% performance improvement in benchmarks.
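
The idea can be sketched as a small direct-mapped table that is checked before the shared cache. This is only an illustration under stated assumptions: `LocalPageLookup`, its slot layout, and the `fetchFromSharedCache` delegate are hypothetical names, not Flowtide APIs, and the sketch leaves out eviction notifications and thread safety.

```csharp
using System;

// Hypothetical sketch, not Flowtide code: a small per-client table that is
// consulted first, so repeated reads of the same page skip the shared cache.
public sealed class LocalPageLookup<TPage> where TPage : class
{
    private readonly long[] _ids;
    private readonly TPage[] _pages;
    private readonly Func<long, TPage> _fetchFromSharedCache;

    public LocalPageLookup(int slotCount, Func<long, TPage> fetchFromSharedCache)
    {
        _ids = new long[slotCount];
        _pages = new TPage[slotCount];
        _fetchFromSharedCache = fetchFromSharedCache;
    }

    public TPage Get(long pageId)
    {
        // Direct mapping: each page id maps to exactly one slot.
        int slot = (int)((ulong)pageId % (ulong)_ids.Length);
        TPage cached = _pages[slot];
        if (cached != null && _ids[slot] == pageId)
        {
            return cached; // local hit, the shared cache is not touched
        }

        // Miss: fall back to the shared (global) cache and remember the result,
        // overwriting whatever page previously occupied this slot.
        TPage page = _fetchFromSharedCache(pageId);
        _ids[slot] = pageId;
        _pages[slot] = page;
        return page;
    }
}
```

In this sketch, only a miss in the local table reaches the shared cache, which is where the reduction in contention would come from. A real implementation would also need to drop local entries when the shared cache evicts or replaces a page.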

What's Changed

  • Add custom arrow serializer to help improve serialization speeds by @Ulimo in #670
  • Add support to pause and resume a stream by @Ulimo in #674
  • Add new event listener abstractions and error listeners for killing application by @bpfz in #680
  • Add support to create object state from state manager client by @Ulimo in #679
  • Change serializers and storage interfaces to use IBufferWriter and ReadOnlyMemory by @Ulimo in #672
  • Remove storing state in the checkpoint event by @Ulimo in #681
  • Add logo and diagram to readme by @Ulimo in #684
  • Add logo and diagram by @Ulimo in #685
  • Change int64 column to be dynamic integer column by @Ulimo in #687
  • Add batch converter to and from dotnet objects by @Ulimo in #688
  • Change test mock source and sink to use column based format, also fix small bugs that occurred from it by @Ulimo in #689
  • Update generic data source and sink to use column based data by @Ulimo in #690
  • Add types for stream notifications by @bpfz in #683
  • Add support for Delta Lake Source by @Ulimo in #691
  • Add initial version of the delta lake sink by @Ulimo in #692
  • [DeltaLake] Improve performance with deletion vectors by @Ulimo in #693
  • Upgrade packages in cosmosdb and delta lake by @Ulimo in #694
  • [Elasticsearch] Change from nest to new nuget by @Ulimo in #695
  • [DeltaLake] Fix so new columns are read as null when schema is evolved for old data files by @Ulimo in #698
  • [DeltaLake] Ignore compacted entries in the delta log by @Ulimo in #700
  • [SQL] Upgrade sql parser nuget to 0.6.3 by @Ulimo in #701
  • [MongoDB] Upgrade to 3.2.1 driver version by @Ulimo in #696
  • [Bugfix] Add that the tree is committed in grouped write operator by @Ulimo in #702
  • Add possibility to set max page count on storage by @Ulimo in #704
  • Add missing license headers to files by @Ulimo in #705
  • Add a cache table for pages in state client to help improve performance by @Ulimo in #707
  • Change so compression memory allocation shows up in metrics by @Ulimo in #708
  • Preparation of release 0.13.0 by @Ulimo in #709

Full Changelog: v0.12.0...v0.13.0

Version 0.13.0 alpha 3

06 Mar 09:53
a9820f6
Pre-release

What's Changed

  • [SQL] Upgrade sql parser nuget to 0.6.3 by @Ulimo in #701
  • [MongoDB] Upgrade to 3.2.1 driver version by @Ulimo in #696
  • [Bugfix] Add that the tree is committed in grouped write operator by @Ulimo in #702

Full Changelog: v0.13.0-alpha2...v0.13.0-alpha3

Version 0.13.0 alpha 2

05 Mar 13:00
f3b1c92
Pre-release

What's Changed

  • [DeltaLake] Improve performance with deletion vectors by @Ulimo in #693
  • Upgrade packages in cosmosdb and delta lake by @Ulimo in #694
  • [Elasticsearch] Change from nest to new nuget by @Ulimo in #695
  • [DeltaLake] Fix so new columns are read as null when schema is evolved for old data files by @Ulimo in #698
  • [DeltaLake] Ignore compacted entries in the delta log by @Ulimo in #700

Full Changelog: v0.13.0-alpha1...v0.13.0-alpha2

Version 0.13.0 alpha 1

27 Feb 10:13
ad9caeb
Pre-release

What's Changed

  • Add custom arrow serializer to help improve serialization speeds by @Ulimo in #670
  • Add support to pause and resume a stream by @Ulimo in #674
  • Add new event listener abstractions and error listeners for killing application by @bpfz in #680
  • Add support to create object state from state manager client by @Ulimo in #679
  • Change serializers and storage interfaces to use IBufferWriter and ReadOnlyMemory by @Ulimo in #672
  • Remove storing state in the checkpoint event by @Ulimo in #681
  • Add logo and diagram to readme by @Ulimo in #684
  • Add logo and diagram by @Ulimo in #685
  • Change int64 column to be dynamic integer column by @Ulimo in #687
  • Add batch converter to and from dotnet objects by @Ulimo in #688
  • Change test mock source and sink to use column based format, also fix small bugs that occurred from it by @Ulimo in #689
  • Update generic data source and sink to use column based data by @Ulimo in #690
  • Add types for stream notifications by @bpfz in #683
  • Add support for Delta Lake Source by @Ulimo in #691
  • Add initial version of the delta lake sink by @Ulimo in #692

Full Changelog: v0.12.0...v0.13.0-alpha1

Version 0.12.0

24 Jan 15:43
5c836d1

Major changes

All Processing Operators Updated to Column-Based Events

All processing operators now use the column-based event format, leading to better performance.
However, some sources and sinks for connectors still use the row-based event format.
Additionally, a few functions continue to rely on the row-based event format.

MongoDB Source Support

This release adds support for reading data from MongoDB, including using
MongoDB's change streams to react directly to data changes.

SQL Server Support for Stream State Persistence

You can now store the stream state in SQL Server. For setup instructions, refer to the documentation:
https://koralium.github.io/flowtide/docs/statepersistence#sql-server-storage

Timestamp with Time Zone Data Type

A new data type for timestamps has been added.
This ensures that connectors can correctly use the appropriate data type, especially when writing.
For example, writing to MongoDB now uses the BSON Date type.

Minor Changes

Virtual Table Support

Static data selection is now supported. Example usage:

INSERT INTO output 
SELECT * FROM 
(
  VALUES 
  (1, 'a'),
  (2, 'b'),
  (3, 'c')
)

What's Changed

  • Add better exception on persistent storage read failure by @Ulimo in #586
  • Bump cross-spawn from 7.0.3 to 7.0.6 in /src/FlowtideDotNet.AspNetCore/ClientApp by @dependabot in #589
  • Add mongoDB source by @Ulimo in #588
  • Add new set operator based on column events by @Ulimo in #593
  • Refactor buffer operator to use column data structure instead by @Ulimo in #590
  • Fix bug in spicedb and fix tests by @Ulimo in #594
  • Fix failing test in substrait by @Ulimo in #595
  • Refactor timestamp provider to use column based events by @Ulimo in #603
  • Refactor table function operator to use column based events by @Ulimo in #608
  • Remove unwrap operator and relation by @Ulimo in #610
  • Fix bugs in aggregate operator and null column by @Ulimo in #615
  • Fix bug where a variable was reused and could contain the wrong information by @Ulimo in #617
  • Add support to run substrait test files to check compliance with substrait functions by @Ulimo in #618
  • Fix bug where emit with brackets caused an exception by @Ulimo in #619
  • Add substrait test for substring and fix compliance with specification by @Ulimo in #620
  • Add support for aggregate substrait tests with count, min and max to start by @Ulimo in #621
  • Add substrait tests for concat trims and upper by @Ulimo in #622
  • Add final substrait tests for implemented string functions by @Ulimo in #623
  • Fix bug where a union containing map or list failed deserialization by @Ulimo in #624
  • push emit list through buffer by @Ulimo in #626
  • Add support for strpos by @Ulimo in #627
  • Remove seed script from sql server to sql server sample by @Ulimo in #628
  • Change to use concurrent dictionary in frontend cache by @Ulimo in #625
  • Refactor TopN to use column based events by @Ulimo in #611
  • Add substrait tests for comparison and boolean functions by @Ulimo in #629
  • Add column based rounding functions by @Ulimo in #630
  • Nested Block Join refactor and right and full outer support for joins by @Ulimo in #616
  • Fix acceptance tests for set operator and add debug logging for block join by @Ulimo in #634
  • add virtual table operator by @Ulimo in #632
  • Validate that stop types exist in the zanzibar schema when converting by @Ulimo in #635
  • Add elasticsearch column based sink by @Ulimo in #567
  • Add support to use concat(..) in addition to || by @Ulimo in #638
  • Add storage sqlserver by @bpfz in #591
  • Add Sql Server Storage solution to nuget release by @Ulimo in #645
  • Refactor mongodb sink to use column based format by @Ulimo in #640
  • Add support for serializers to write and read metadata pages by @Ulimo in #643
  • Change spicedb connector to use column based events by @Ulimo in #647
  • Add test case for mongodb where primary key is not in position 0 by @Ulimo in #648
  • Add new data type column for timestamp with offset by @Ulimo in #646
  • Change sql server connector to use column based events in the sink by @Ulimo in #649
  • Update SQL Server storage documentation by @bpfz in #651
  • [Snyk] Security upgrade next from 14.2.13 to 14.2.21 by @Ulimo in #652
  • Bump nanoid from 3.3.7 to 3.3.8 in /src/FlowtideDotNet.AspNetCore/ClientApp by @dependabot in #641
  • Bump next from 14.2.13 to 14.2.21 in /src/FlowtideDotNet.AspNetCore/ClientApp by @dependabot in #650
  • Bump nanoid from 3.3.7 to 3.3.8 in /docs by @dependabot in #654
  • Bump FluentAssertions from 6.12.0 to 7.0.0 by @dependabot in #633
  • Add better logging for fasterkv storage where logger is injected by @Ulimo in #655
  • Fix timestamp size to include alignment, also fix timestamp in union by @Ulimo in #656
  • Bug fix: fix so save page saves page correctly if resize is required by @Ulimo in #657
  • Add that memory allocated by data columns are reduced if the usage is 2.5 less than allocated by @Ulimo in #660
  • Reenable the append tree and change metrics to use the append tree by @Ulimo in #661
  • Fix InsertAt in map with null values by @Ulimo in #663
  • change sharepoint source to retry on task cancelled by @Ulimo in #662
  • Add search to docs, fixes #659 by @Ulimo in #664
  • Fix failing tests for append tree and mongodb by @Ulimo in #665
  • Remove dependency on FluentAssertions by @Ulimo in #666
  • Bug fix: fix union deserialize special case with only null values by @Ulimo in #667
  • Add support for binary literals by @Ulimo in #668
  • Update for version 0.12.0 by @Ulimo in #669

Full Changelog: v0.11.2...v0.12.0