[Backport 2.x] Refactor multipart download to a more async model #10373
…#10349)

* Refactor read context streams to async streams

Signed-off-by: Kunal Kotwani <[email protected]>

* Refactor multipart download to a more async model

The previous approach of kicking off the stream requests for all parts of a file did not work well for very large files. For example, a 20GiB file uploaded in 16MiB parts will consist of 1200+ parts. When we attempted to initiate streaming for all parts concurrently, some parts would hit a client timeout after 2 minutes without being able to get a connection due to the other parts not having been completed in that time frame.

This refactoring adds yet another layer of indirection in order to allow the code that is actually writing the destination file to control the rate at which streams are started. This should allow for downloading files consisting of arbitrarily many parts at any connection speed.

This commit also wires in the download rate limiter so that the `indices.recovery.max_bytes_per_sec` is properly honored.

Signed-off-by: Andrew Ross <[email protected]>

---------

Signed-off-by: Kunal Kotwani <[email protected]>
Signed-off-by: Andrew Ross <[email protected]>
Co-authored-by: Kunal Kotwani <[email protected]>
(cherry picked from commit 28f185b)
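The idea described in the commit message, where the code writing the destination file decides when each part's stream is started, can be illustrated with a minimal sketch. This is not the code from the PR: the class `LazyPartDownloadSketch`, the methods `partCount` and `downloadAll`, and the `maxInFlight` parameter are hypothetical names chosen for illustration. The sketch blocks on each part for simplicity, whereas the PR describes an async model, and it omits the rate limiter entirely.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical illustration only; names and structure are assumptions, not the PR's code.
public class LazyPartDownloadSketch {

    /** Number of parts needed for a file of the given size (ceiling division). */
    public static long partCount(long fileSizeBytes, long partSizeBytes) {
        return (fileSizeBytes + partSizeBytes - 1) / partSizeBytes;
    }

    /**
     * Each part is represented by a Supplier, so no request is started until the
     * writer asks for it. The writer keeps at most maxInFlight parts started and
     * appends them to the destination strictly in part order.
     */
    public static byte[] downloadAll(List<Supplier<CompletableFuture<byte[]>>> parts, int maxInFlight) {
        ByteArrayOutputStream destination = new ByteArrayOutputStream();
        Queue<CompletableFuture<byte[]>> started = new ArrayDeque<>();
        int next = 0;
        while (next < parts.size() || !started.isEmpty()) {
            // Top up the window: start new part requests only when there is room.
            while (started.size() < maxInFlight && next < parts.size()) {
                started.add(parts.get(next++).get());
            }
            // Wait for the oldest started part and write it out (blocking for simplicity).
            byte[] bytes = started.remove().join();
            destination.write(bytes, 0, bytes.length);
        }
        return destination.toByteArray();
    }

    public static void main(String[] args) {
        // The example from the commit message: a 20GiB file in 16MiB parts.
        System.out.println("parts: " + partCount(20L << 30, 16L << 20));

        // Five tiny in-memory "parts" standing in for remote part streams.
        List<Supplier<CompletableFuture<byte[]>>> parts = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final byte value = (byte) i;
            parts.add(() -> CompletableFuture.supplyAsync(() -> new byte[] { value }));
        }
        System.out.println("bytes written: " + downloadAll(parts, 2).length);
    }
}
```

Because the parts are suppliers rather than already-running requests, a file with many parts never has more than `maxInFlight` requests competing for connections, which is the failure mode the commit message describes for the earlier approach.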
Compatibility status: Checks if related components are compatible with change ff38621
Incompatible components
Skipped components
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/reporting.git]
Gradle Check (Jenkins) Run Completed with:
Codecov Report
@@             Coverage Diff              @@
##                2.x   #10373      +/-   ##
============================================
+ Coverage     70.78%   70.87%   +0.09%
- Complexity    58370    58392      +22
============================================
  Files          4818     4816       -2
  Lines        275947   275950       +3
  Branches      40554    40559       +5
============================================
+ Hits         195318   195580     +262
+ Misses        63971    63732     -239
+ Partials      16658    16638      -20
Backports 28f185b from #10349
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.