[Feature Request] HA Tracker support for multiple prometheus replicas in the same batch

**Is your feature request related to a problem? Please describe.**
The HA Tracker mechanism that Cortex provides assumes Prometheus-only remote write, where each batch comes from a single replica. In our case, datapoints from multiple producers are mixed in a single remote-write batch. Some datapoints can come from a Prometheus HA pair, while others may come from other systems.
The [Distributor HA Tracker implementation](https://github.com/cortexproject/cortex/blob/master/pkg/distributor/distributor.go#L655) looks for the Prometheus replica label only in the first datapoint of the batch and assumes that all other datapoints in that batch come from the same Prometheus replica.
Thus, there are 3 scenarios, depending on the [FIRST](https://github.com/cortexproject/cortex/blob/master/pkg/distributor/distributor.go#L656) datapoint in the batch:

1. [Doesn’t have Prometheus replica or cluster labels](https://github.com/cortexproject/cortex/blob/master/pkg/distributor/distributor.go#L522) - the batch will be pushed
2. [Has Prometheus replica label](https://github.com/cortexproject/cortex/blob/master/pkg/ha/ha_tracker.go#L389):
2.1. If it is the same as the elected leader replica stored in the KV store, the batch will be pushed
2.2. If it is NOT the same as the elected leader replica stored in the KV store, the batch will **not** be pushed
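
To make the failure mode concrete, here is a minimal Go sketch of the three scenarios above. It is a simplified model, not the Cortex code: `Series`, `acceptBatch` and the `elected` map are hypothetical names, the label names `cluster` and `__replica__` are assumed to be the configured HA labels, and leader election and failover are left out entirely (the map is assumed to already hold the elected replica per cluster).

```go
package main

import "fmt"

// Series is a hypothetical, simplified stand-in for one time series in a
// remote-write batch: only its label set matters here.
type Series struct {
	Labels map[string]string
}

// Assumed HA label names; in a real deployment these are configurable.
const (
	clusterLabel = "cluster"
	replicaLabel = "__replica__"
)

// acceptBatch models the decision described above: only the FIRST series is
// inspected, and the verdict is applied to the whole batch.
// elected maps a cluster to its elected replica (assumed already populated).
func acceptBatch(batch []Series, elected map[string]string) bool {
	if len(batch) == 0 {
		return true
	}
	first := batch[0].Labels
	cluster, hasCluster := first[clusterLabel]
	replica, hasReplica := first[replicaLabel]
	if !hasCluster || !hasReplica {
		return true // scenario 1: no HA labels on the first series
	}
	// scenario 2.1 (match) or 2.2 (mismatch)
	return elected[cluster] == replica
}

func main() {
	elected := map[string]string{"T1": "a"}

	// A mixed batch: the first series comes from the non-elected replica b,
	// the second comes from a non-HA producer.
	mixed := []Series{
		{Labels: map[string]string{clusterLabel: "T1", replicaLabel: "b"}},
		{Labels: map[string]string{"job": "other-system"}},
	}

	// The non-HA series is dropped together with the replica-b series.
	fmt.Println(acceptBatch(mixed, elected)) // false
}
```

In this model, reordering the same batch so that the non-HA series comes first would make the whole batch pass, including the series from replica b, which is the other side of the same problem.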

**Describe the solution you'd like**
One option is to apply the same HA Tracker mechanism after the batch has been split into smaller batches, one per (cluster, replica) pair. Instead of calling the findHALabels method, if the HA Tracker is enabled, add a method that separates these batches, then iterate through them and discard only the smaller batches whose replica is not the one stored in the KV store.
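
A rough Go sketch of what this could look like, under stated assumptions: `Series`, `haKey`, `splitByHAPair` and `filterBatch` are hypothetical names that do not exist in Cortex, the label names `cluster` and `__replica__` are assumed, and the `elected` map stands in for the lookup against the KV store (election and failover are not modelled).

```go
package main

import "fmt"

// Series is a hypothetical, simplified series: a name plus its label set.
type Series struct {
	Name   string
	Labels map[string]string
}

// Assumed HA label names; in a real deployment these are configurable.
const (
	clusterLabel = "cluster"
	replicaLabel = "__replica__"
)

// haKey identifies one (cluster, replica) pair.
type haKey struct{ Cluster, Replica string }

// splitByHAPair separates a mixed batch into series without HA labels and
// one smaller batch per (cluster, replica) pair.
func splitByHAPair(batch []Series) (nonHA []Series, groups map[haKey][]Series) {
	groups = make(map[haKey][]Series)
	for _, s := range batch {
		c, okC := s.Labels[clusterLabel]
		r, okR := s.Labels[replicaLabel]
		if !okC || !okR {
			nonHA = append(nonHA, s)
			continue
		}
		k := haKey{c, r}
		groups[k] = append(groups[k], s)
	}
	return nonHA, groups
}

// filterBatch keeps every non-HA series and every sub-batch whose replica is
// the elected one for its cluster; only the other sub-batches are discarded.
// The relative order of sub-batches in the result is not preserved.
func filterBatch(batch []Series, elected map[string]string) []Series {
	nonHA, groups := splitByHAPair(batch)
	kept := append([]Series(nil), nonHA...)
	for k, g := range groups {
		if elected[k.Cluster] == k.Replica {
			kept = append(kept, g...)
		}
	}
	return kept
}

func main() {
	elected := map[string]string{"T1": "a", "T2": "b"}
	batch := []Series{
		{"s1", map[string]string{clusterLabel: "T1", replicaLabel: "a"}},
		{"s1", map[string]string{clusterLabel: "T1", replicaLabel: "b"}},
		{"s2", map[string]string{clusterLabel: "T2", replicaLabel: "a"}},
		{"s2", map[string]string{clusterLabel: "T2", replicaLabel: "b"}},
		{"other", map[string]string{"job": "other-system"}},
	}
	// Kept: the non-HA series, T1/a and T2/b.
	fmt.Println(len(filterBatch(batch, elected))) // 3
}
```

The point of the sketch is only the shape of the change: the accept/reject decision moves from the batch level to the (cluster, replica) level, so a mixed batch loses only the series from non-elected replicas.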

**Describe alternatives you've considered**
On our side, it is possible to segregate datapoints in our services' implementation and create batches specific to each HA pair. We would have dedicated batches for datapoints not sourced from HA pairs, and one for each HA pair.
The following diagram represents this solution on our side, where **T1** and **T2** represent cluster labels, **a** and **b** represent replica labels (as in [the official HA tracker docs](https://cortexmetrics.io/docs/guides/ha-pair-handling/#context)), and s1 and s2 are different samples for each of these series.
![cortex_ha_tracker_segregation drawio](https://github.com/user-attachments/assets/e5798a35-e3bf-4142-b67e-7ee93be63959)

[Feature Request] HA Tracker support for multiple prometheus replicas in the same batch #6256
