Performance Degradation in Python SDK (v3.18.2) When Reading High-Load YDB Topic #670

Open
@krasnovdm

Bug Report

YDB Python SDK version:
3.18.2

Environment:

  • 24 parallel reader processes.
  • Topic with 2000 partitions.
  • High write load (spikes of 40k+ messages/sec).
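
For a sense of scale, the sketch below shows how 2000 partitions could be divided among 24 readers. The issue does not show how `self._partition_groups` is built, so `split_partitions` is a hypothetical helper, and it assumes partition ids are numbered contiguously from 0.

```python
import typing


def split_partitions(partition_count: int, reader_count: int) -> typing.List[typing.List[int]]:
    """Split ids 0..partition_count-1 into contiguous, near-equal groups.

    Hypothetical helper: the issue does not show how its partition groups are built.
    """
    if reader_count <= 0:
        raise ValueError("reader_count must be positive")
    base, extra = divmod(partition_count, reader_count)
    groups = []
    start = 0
    for index in range(reader_count):
        # The first `extra` groups take one additional partition each.
        size = base + (1 if index < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


groups = split_partitions(2000, 24)
sizes = [len(group) for group in groups]
print(min(sizes), max(sizes))  # 83 84
```

Under these assumptions each reader process owns 83 or 84 partitions, so a write spike spread across the whole topic reaches every process at once.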

Current behavior:

  • Read throughput drops by ~33% (from ~30k to ~20k messages/sec) during high write loads.
  • Performance degradation persists until the write load stabilizes.
  • No issues observed with the C++ SDK under identical conditions.

Expected behavior:

  • Python SDK should maintain stable read throughput (~30k messages/sec) regardless of write load, matching the C++ SDK’s performance.

Steps to reproduce:

  1. Configure a YDB topic with 2000+ partitions.
  2. Use the Python SDK (v3.18.2) to read the topic with 24 parallel processes.
  3. Generate a sustained high write load (e.g., 40k+ messages/sec).
  4. Observe read throughput degradation in the Python SDK while the C++ SDK remains stable.
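
To make step 4 measurable, a sliding-window counter along these lines could be fed the size of every batch read. This is a minimal sketch, not the measurement setup used for the numbers above; `ThroughputMeter` and `relative_drop` are hypothetical names, and timestamps are passed in explicitly so the example is deterministic.

```python
import collections
import typing


class ThroughputMeter:
    """Sliding-window message rate in messages per second (hypothetical helper)."""

    def __init__(self, window_seconds: float = 10.0) -> None:
        self._window = window_seconds
        self._samples: typing.Deque[typing.Tuple[float, int]] = collections.deque()
        self._total = 0

    def record(self, message_count: int, now: float) -> None:
        """Register a batch of `message_count` messages read at time `now`."""
        self._samples.append((now, message_count))
        self._total += message_count
        self._evict(now)

    def rate(self, now: float) -> float:
        """Messages per second over the last `window_seconds`."""
        self._evict(now)
        return self._total / self._window

    def _evict(self, now: float) -> None:
        # Drop samples that have fallen out of the window.
        while self._samples and self._samples[0][0] <= now - self._window:
            _, count = self._samples.popleft()
            self._total -= count


def relative_drop(before: float, after: float) -> float:
    """Fractional throughput loss, e.g. 30k -> 20k messages/sec."""
    return (before - after) / before


meter = ThroughputMeter(window_seconds=10.0)
for second in range(10):
    meter.record(30_000, now=float(second))
print(meter.rate(now=9.0))                     # 30000.0
print(round(relative_drop(30_000, 20_000), 3)) # 0.333
```

In a real reader process, `now` would typically come from `time.monotonic()`. The rate under-reports until one full window of samples has been recorded.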

Related code:

# Reader initialization
async def init_client(self):
    reader = self._ydb_driver.topic_client.reader(
        topic=ydb.TopicReaderSelector(path=self._topic, partitions=self._partition_groups),
        consumer=self._consumer_id,
    )
    self._reader = reader

# Batch processing
async def read_topic_partition(self) -> typing.List[typing.Tuple[datatypes.PublicMessage, int]]:
    ...
    events_batch = await asyncio.wait_for(
        self._reader.receive_batch(max_messages=self._max_messages),
        timeout=_CLIENT_TIMEOUT,
    )
    ...
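
One way to narrow down where the time goes is to time the await on the reader separately from the processing of each batch. The sketch below is an assumption-laden illustration, not code from the affected service: `FakeReader` is a hypothetical stand-in so the example runs without a YDB cluster, and `timed_read_loop` and `handle` are illustrative names.

```python
import asyncio
import time
import typing


class FakeReader:
    """Hypothetical stand-in for the SDK reader; not part of the YDB API."""

    def __init__(self, batches: typing.List[typing.List[str]]) -> None:
        self._batches = list(batches)

    async def receive_batch(self) -> typing.List[str]:
        await asyncio.sleep(0.01)  # simulate waiting for data
        return self._batches.pop(0)


async def timed_read_loop(reader, handle, iterations: int, timeout: float) -> dict:
    """Read `iterations` batches, splitting elapsed time into wait and handling."""
    stats = {"wait_s": 0.0, "handle_s": 0.0, "messages": 0}
    for _ in range(iterations):
        started = time.perf_counter()
        batch = await asyncio.wait_for(reader.receive_batch(), timeout=timeout)
        received = time.perf_counter()
        handle(batch)
        finished = time.perf_counter()
        stats["wait_s"] += received - started    # time spent awaiting the reader
        stats["handle_s"] += finished - received # time spent in our own processing
        stats["messages"] += len(batch)
    return stats


stats = asyncio.run(
    timed_read_loop(
        FakeReader([["a", "b"], ["c"]]),
        handle=lambda batch: None,
        iterations=2,
        timeout=1.0,
    )
)
print(stats["messages"])  # 3
```

If `wait_s` grows during write spikes while `handle_s` stays flat, the slowdown is more likely inside the read path (network plus whatever the SDK does internally) than in the application's own batch handling. The opposite pattern would point at the application code.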

Labels: bug (Something isn't working)
