Description
The camel-azure-eventhubs-source-kafka-connector currently requires Azure Blob Storage configuration, which is used for storing the checkpoints.
I think there should be an option to store checkpoints in the Kafka Connect offsets topic. Source connector state is usually stored there by including the source offset in the `SourceRecord`, with the given source partition used as the key for lookup. The checkpoints can then be loaded on task start using the `OffsetStorageReader`, without the need for external storage.
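As a rough sketch of the lookup side, the snippet below shows how a task could resolve the last committed position on start. `OffsetReader` is a simplified stand-in for Kafka Connect's `OffsetStorageReader#offset(Map)`, and the partition/offset key names (`eventHubName`, `partitionId`, `sequenceNumber`) are illustrative, not an existing convention:

```java
import java.util.HashMap;
import java.util.Map;

public class CheckpointLookupSketch {

    // Simplified stand-in for Kafka Connect's OffsetStorageReader
    // (real API: org.apache.kafka.connect.storage.OffsetStorageReader#offset(Map)).
    interface OffsetReader {
        Map<String, Object> offset(Map<String, Object> sourcePartition);
    }

    // Build the source-partition key used for lookup; the key names are illustrative.
    static Map<String, Object> sourcePartition(String eventHubName, String partitionId) {
        Map<String, Object> partition = new HashMap<>();
        partition.put("eventHubName", eventHubName);
        partition.put("partitionId", partitionId);
        return partition;
    }

    // On task start, resolve the last committed sequence number, or -1 if none exists.
    static long lastSequenceNumber(OffsetReader reader, String eventHubName, String partitionId) {
        Map<String, Object> offset = reader.offset(sourcePartition(eventHubName, partitionId));
        if (offset == null || !offset.containsKey("sequenceNumber")) {
            return -1L;
        }
        return ((Number) offset.get("sequenceNumber")).longValue();
    }

    public static void main(String[] args) {
        // In-memory reader simulating offsets previously committed to the offsets topic.
        Map<Map<String, Object>, Map<String, Object>> store = new HashMap<>();
        store.put(sourcePartition("my-hub", "0"), Map.of("sequenceNumber", 42L));
        OffsetReader reader = store::get;

        System.out.println(lastSequenceNumber(reader, "my-hub", "0")); // 42
        System.out.println(lastSequenceNumber(reader, "my-hub", "1")); // -1
    }
}
```

In a real task, the same lookup would run once in `start()` before the Azure SDK client is created.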
As far as I understand, the related Camel component allows setting a custom `CheckpointStore` bean instance to be used by the Azure SDK client. So it should be possible to provide a `CheckpointStore` implementation based on the Kafka Connect API and use it instead of the `BlobCheckpointStore`.
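To illustrate the shape of such an adapter: the real `com.azure.messaging.eventhubs.CheckpointStore` interface is reactive (its methods return `Mono`/`Flux`), and a real implementation would read and write through the Connect offsets topic rather than a map. The synchronous, map-backed sketch below only shows the key mapping idea; all names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, synchronous sketch. The real Azure CheckpointStore is reactive,
// and the backing storage here would be the Kafka Connect offsets topic.
public class ConnectBackedCheckpointSketch {

    // Keyed by "namespace/eventHub/consumerGroup/partitionId" -> sequence number.
    private final Map<String, Long> offsets = new HashMap<>();

    static String key(String namespace, String eventHub, String consumerGroup, String partitionId) {
        return namespace + "/" + eventHub + "/" + consumerGroup + "/" + partitionId;
    }

    // Counterpart of CheckpointStore#updateCheckpoint: persist the position.
    public void updateCheckpoint(String namespace, String eventHub, String consumerGroup,
                                 String partitionId, long sequenceNumber) {
        offsets.put(key(namespace, eventHub, consumerGroup, partitionId), sequenceNumber);
    }

    // Counterpart of listing checkpoints for a single partition:
    // returns the stored sequence number, or null when no checkpoint exists yet.
    public Long checkpoint(String namespace, String eventHub, String consumerGroup, String partitionId) {
        return offsets.get(key(namespace, eventHub, consumerGroup, partitionId));
    }
}
```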
I was thinking of doing so with a PR, but I'm not sure how to deal with the Kamelet, which currently requires that all parameters related to Blob Storage are set.
In general, I see two possible options:
Option A. Make Azure Blob parameters in the Kamelet optional
In this option, the `CamelAzureeventhubssourceSourceTask` can be adapted to initialize a `CheckpointStore` based on the offsets if Blob storage is not configured.
The `CamelSourceTask` could be extended to provide an option to customize the created `Endpoint`, so the `CheckpointStore` initialized by the task on start can be set via the `EventHubsEndpoint`, e.g.:
`eventHubsEndpoint.getConfiguration().setCheckpointStore(checkpointStore)`
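A possible shape for that customization hook is sketched below. Everything here is hypothetical: `customizeEndpoint` is not an existing camel-kafka-connector method, and the `CheckpointStore`/`EventHubsEndpoint` classes are minimal stand-ins for the Azure SDK and Camel types:

```java
// Hypothetical sketch of an endpoint-customization hook on the base task;
// method names and the nested stand-in classes are illustrative only.
public class EndpointCustomizationSketch {

    interface CheckpointStore {}                      // stand-in for the Azure SDK interface

    static class EventHubsConfiguration {
        CheckpointStore checkpointStore;
        void setCheckpointStore(CheckpointStore store) { this.checkpointStore = store; }
    }

    static class EventHubsEndpoint {
        final EventHubsConfiguration configuration = new EventHubsConfiguration();
        EventHubsConfiguration getConfiguration() { return configuration; }
    }

    // Base task exposes a hook invoked after the endpoint is created (hypothetical).
    static class CamelSourceTask {
        protected void customizeEndpoint(EventHubsEndpoint endpoint) { /* no-op by default */ }

        EventHubsEndpoint start() {
            EventHubsEndpoint endpoint = new EventHubsEndpoint();
            customizeEndpoint(endpoint);              // subclass hook
            return endpoint;
        }
    }

    // The EventHubs task overrides the hook to install the Connect-backed store.
    static class AzureEventHubsSourceTask extends CamelSourceTask {
        final CheckpointStore checkpointStore = new CheckpointStore() {};

        @Override
        protected void customizeEndpoint(EventHubsEndpoint endpoint) {
            endpoint.getConfiguration().setCheckpointStore(checkpointStore);
        }
    }
}
```

The design point is that the base class stays generic while only the EventHubs-specific subclass knows about `CheckpointStore`.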
Option B. Add new Kamelet variant for Azure EventHubs source without parameters for Blob storage
This option works similarly to Option A, just with a dedicated connector and a dedicated Kamelet descriptor.
What do you think about the proposed solutions?