Improve numeric arg handling and add dynamic worker spawning #56

Open: wants to merge 63 commits into base: main

Conversation

jkool702 (Owner) commented Jan 16, 2025

Summary by Sourcery

Add dynamic worker spawning and improve numeric argument handling.

Enhancements:

  • Improve handling of numeric arguments by allowing standard SI/IEC prefixes (k, ki, M, etc.); a sketch of this parsing appears after this summary.
  • Generalize lseek builtin to support aarch64 and riscv64 architectures in addition to x86_64.
  • Refactor dynamic worker spawning logic to improve performance and reliability under varying workloads.

Tests:

  • Update unit tests to cover new argument handling and dynamic worker spawning features.
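The SI/IEC prefix handling described above can be done with plain bash arithmetic. The helper below is only an illustrative sketch (it is not forkrun's actual _forkrun_getVal), showing how a suffixed value such as 64ki could be converted:

# Illustrative sketch only -- NOT forkrun's actual _forkrun_getVal.
# Converts a number with an optional '+' prefix and SI/IEC suffix
# (k=1000, ki=1024, M=1000000, Mi=1048576, ...) to a plain integer.
to_int() {
    local n=${1#+}           # allow a leading '+'
    local num=${n%%[!0-9]*}  # leading digits
    local suf=${n#"$num"}    # whatever follows them
    case "${suf,,}" in
        '')  echo "$num" ;;
        k)   echo $(( num * 1000 )) ;;
        ki)  echo $(( num * 1024 )) ;;
        m)   echo $(( num * 1000000 )) ;;
        mi)  echo $(( num * 1024 * 1024 )) ;;
        g)   echo $(( num * 1000000000 )) ;;
        gi)  echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)   echo "unrecognized suffix: ${suf}" >&2; return 1 ;;
    esac
}

to_int 64ki    # -> 65536
to_int +2M     # -> 2000000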

jkool702 and others added 29 commits November 25, 2024 01:17
apply _forkrun_getVal to all user-provided numeric options so that any user-specified number can use a SI/IEC prefix
Fix _forkrun_getVal usage
make bash completions faster
Completely reworked the logic for how coprocs are dynamically spawned. Other various minor changes as well.
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
sourcery-ai bot (Contributor) commented Jan 16, 2025

Reviewer's Guide by Sourcery

This pull request enhances numeric argument handling in forkrun and introduces dynamic worker process spawning to optimize CPU usage. It also includes changes to improve the performance of the lseek builtin and adds support for aarch64 and riscv64 architectures.

State diagram for dynamic worker spawning

stateDiagram-v2
    [*] --> Initial
    Initial --> Monitoring: Start with nProcs workers
    Monitoring --> SpawnWorker: System load < threshold AND queue depth < minimum
    SpawnWorker --> Monitoring: Update worker count
    Monitoring --> CheckLoad: Periodic check
    CheckLoad --> SpawnWorker: Load allows more workers
    CheckLoad --> Monitoring: Load too high
    Monitoring --> [*]: Quit signal
    note right of SpawnWorker
        New workers added based on:
        - CPU load
        - Queue depth
        - Current worker count
    end note

File-Level Changes

Improved numeric argument parsing to handle a wider range of inputs with optional suffixes.
  • Modified option parsing to accept numbers with optional SI (k, M, G, etc.) and IEC (Ki, Mi, Gi, etc.) suffixes.
  • Added support for numbers with a '+' prefix.
  • Updated the logic to convert suffixed numbers to their numeric values.
  • Removed the need for a separate parser for byte arguments.
Files: forkrun.bash
Added dynamic worker process spawning to optimize CPU usage based on system load and queue depth (a structural sketch follows this change list).
  • Implemented a new coproc to monitor system load and adjust the number of worker coprocs.
  • Added logic to dynamically spawn new worker coprocs when the read queue depth is low or system load is below a target threshold.
  • Added logic to dynamically reduce the target system load when new coprocs cause system load to drop.
  • Added logic to dynamically adjust the number of new coprocs to spawn based on the current worker count.
  • Added logic to dynamically adjust the number of new coprocs to spawn based on the time it takes to read data from stdin vs the time it takes to process that data.
  • Added a new file descriptor to communicate between the main process and the dynamic worker spawning coproc.
  • Added a new file to store the number of worker coprocs.
  • Modified the logic to automatically adjust the number of lines read from stdin based on the current read queue depth.
  • Added logic to prevent the dynamic worker spawning coproc from spawning more coprocs than the maximum allowed.
  • Added logic to prevent the dynamic worker spawning coproc from spawning new coprocs when the main process is exiting.
Files: forkrun.bash
Improved the performance of the lseek builtin and added support for aarch64 and riscv64 architectures.
  • Added support for aarch64 and riscv64 architectures.
  • Added an optional argument to specify the seek type (SEEK_SET, SEEK_CUR, SEEK_END).
  • Modified the lseek_compile.sh script to compile the lseek builtin for aarch64 and riscv64 architectures.
  • Modified the lseek_compile.sh script to install the lseek builtin in a new directory.
Files: lseek_builtin/lseek.c, lseek_builtin/lseek_compile.sh
Updated the hyperfine benchmark script to use null delimiters and test a wider range of input sizes.
  • Added support for null delimiters.
  • Added logic to test a wider range of input sizes.
  • Added logic to dynamically generate the list of input sizes to test.
  • Added logic to use a ramdisk for the test files.
  • Added logic to use a null delimiter when reading input from a file.
Files: hyperfine_benchmark/forkrun.speedtest.hyperfine.bash
Updated the README to reflect the changes made in this pull request.
  • Updated the version number.
  • Added a changelog entry for the new features.
  • Updated the description of the dynamic worker process spawning feature.
  • Updated the description of the numeric argument parsing feature.
  • Updated the description of the lseek builtin.
Files: README.md
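The dynamic worker spawning entry above boils down to a small monitoring loop running alongside the main process. The skeleton below is a rough, hypothetical sketch of that structure only; the helper names (get_load, get_queue_depth, spawn_worker) and file names are placeholders, not forkrun's actual internals:

# Hypothetical skeleton of the dynamic-spawning coproc described above.
# get_load, get_queue_depth, and spawn_worker are placeholder helpers.
coproc pQueue {
    while [[ ! -f "${tmpDir}/.quit" ]] && (( nWorkers < nProcsMax )); do
        load=$(get_load)            # e.g. derived from /proc/stat
        qDepth=$(get_queue_depth)   # lines waiting to be read by workers
        if (( qDepth < qMin || load < loadTarget )); then
            spawn_worker && (( nWorkers++ ))
            printf '%s\n' "${nWorkers}" > "${tmpDir}/.nWorkers"
        fi
        sleep 0.1   # the real code uses a dedicated file descriptor to talk to the main process
    done
}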

jkool702 and others added 18 commits January 18, 2025 23:48
makes a few variables local that had not yet been made local.
fix minor mistakes in documentation
made pulling file descriptor byte offset from procfs slightly more efficient
…of arguments that forkrun will process before returning.
add new `-n` flag into help
Fix option parsing if options have whitespace characters
use `head` to speed up `-n <#>` flag when possible + fix whitespace option parsing
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
jkool702 (Owner, Author) commented:

@sourcery-ai review

sourcery-ai bot (Contributor) left a comment:

Hey @jkool702 - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good


// check for exactly 2 args passed to lseek
if (argc != 3) {
fprintf(stderr, "\nIncorrect number of arguments.\nUSAGE: lseek <FD> <REL_OFFSET>\n");
int whence = SEEK_CUR
sourcery-ai bot (Contributor):

issue (bug_risk): Remove duplicate whence variable declaration to prevent undefined behavior

README.md Outdated
Comment on lines 22 to 25
1. the logic by which coprocs are dynamically spawned has been completely rewritten to improve performance and reliability when handling varying workloads.
2. all numeric commandline arguments now accept standard prefixes (k=1000, ki=1024, M=1000000, etc.)
3. the lseek loadable builtin has been recompiled and now can be used on x86_64 and aarch64 and riscv64 architectures
4. **BREAKING CHANGE**: the `-n` flag (which previously added ordering infornmation to the output and implied `-k`) has been renamed to `-K`. The `-n` flag now implements a new feature - limiting the otal number of lines that `forkrun` will process. `... | forkrun -n <#> ...` is basically equivilant to `... | head -n <#> | forkrun ...` (except that, unlike `head`, `forkrun -n <#> -d _` allows for this functionality with delimiters other than NULLs and newlines).
sourcery-ai bot (Contributor):

question: Clarify the behavior of the -K flag.

The changelog mentions that -n was renamed to -K. Does -K now provide the old functionality of -n (adding ordering information and implying -k) in addition to its new functionality? Please clarify.

@@ -8,7 +8,7 @@ forkrun() {
#
# USAGE: printf '%s\n' "${args[@]}" | forkrun [-flags] [--] parFunc ["${args0[@]}"]
#
# LIST OF FLAGS: [-j|-P [-]<#>[,<#>,<#>]] [-t <path>] ( [-l <#>] | [-L <#[,#]>]] ) ( [-b <#>] | [-B <#>[,<#>]] ) [-d <char>] [-u <fd>] [-i] [-I] [-k] [-n] [-z|-0] [-s] [-S] [-p] [-D] [-N] [-U] [-v] [-h|-?]
# LIST OF FLAGS: [-j|-P [-]<#>[,<#>,<#>]] [-t <path>] ( [-l <#>] | [-L <#[,#]>]] ) [-n <#>] ( [-b <#>] | [-B <#>[,<#>]] ) [-d <char>] [-u <fd>] [-i] [-I] [-k] [-K] [-z|-0] [-s] [-S] [-p] [-D] [-N] [-U] [-v] [-h|-?]
sourcery-ai bot (Contributor):

issue (complexity): Consider refactoring the dynamic worker spawning logic to separate worker management into a dedicated module and simplify the scaling algorithm using clearer thresholds.

The dynamic worker spawning implementation has grown overly complex. Consider refactoring to separate concerns:

  1. Extract worker management into a dedicated module:
# worker_manager.sh
manage_workers() {
  local target_workers min_workers max_workers
  local current_load queue_depth

  while true; do
    # Get metrics
    current_load=$(get_system_load)
    queue_depth=$(get_queue_depth)

    # Simple scaling decision
    if (( current_load < load_threshold && queue_depth > min_queue )); then
      add_workers "$target_workers"
    fi

    sleep "$check_interval"
  done
}
  2. Simplify the scaling algorithm:
calculate_target_workers() {
  local load=$1
  local queue=$2

  # Simple linear scaling based on load
  if (( load < 70 )); then
    echo $(( current_workers + 1 ))
  else 
    echo "$current_workers"
  fi
}

This separates monitoring from control logic and uses simpler thresholds instead of complex weighted averages. The functionality remains the same but the code becomes more maintainable.

jkool702 (Owner, Author) commented Jan 23, 2025:

The problem with this is that spawning more threads doesn't guarantee increasing system load, and in those cases it would always result in spawning the max number even though far fewer were needed. For example:

{ for nn in {1..1000}; do echo ${nn}; sleep 1; done; }  | forkrun echo

Clearly this doesn't need many workers, but your simplified logic would spawn the maximum number allowed, and all but one would just sit idle the whole time.

sourcery-ai bot (Contributor):

You're right, @jkool702, my suggested simplification was too naive and wouldn't handle I/O-bound tasks like your example efficiently. Spawning more workers wouldn't increase CPU load in such cases, leading to unnecessary worker creation. Thanks for pointing that out!

I've reconsidered the approach. Instead of solely relying on CPU load, we should incorporate a metric that reflects the actual work being done by the workers. One option is to track the number of active workers. We can then adjust the scaling logic to consider both CPU load and the number of currently active workers.

Here's a revised suggestion for the calculate_target_workers function:

calculate_target_workers() {
  local load=$1
  local queue=$2
  local active_workers=$(get_active_worker_count)
  local max_workers=$(( $(nproc) * 2 )) # Example: Double the number of cores

  # Scale up if load is low and not all potential workers are active
  if (( load < 70 && active_workers < max_workers )); then
    echo $(( current_workers + 1 ))
  # Scale down if load is high and there are more than the minimum workers
  elif (( load > 80 && current_workers > min_workers )); then
    echo $(( current_workers - 1 ))
  else
    echo "$current_workers"
  fi
}

This version introduces a few improvements:

  • Active worker count: get_active_worker_count (which would need to be implemented) provides a measure of how many workers are actively processing tasks. This helps prevent over-spawning in I/O-bound scenarios.
  • Max workers: A max_workers limit is introduced (calculated based on nproc in this example, but could be configurable). This prevents unbounded worker creation.
  • Scale down logic: The function now also includes logic to scale down the number of workers if the load is high and there are more than the minimum required workers.

This approach still maintains a degree of simplicity while addressing the issue you raised. It allows for more controlled scaling based on both system load and actual worker activity. What are your thoughts on this revised approach?
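For the hypothetical get_active_worker_count helper mentioned above, one possible bash sketch (assuming each worker touches a per-worker status file under ${tmpDir}/active while it is busy) would be:

# Hypothetical: count workers currently flagged as busy via status files.
get_active_worker_count() {
    local -a activeFiles=( "${tmpDir}"/active/* )
    # if the glob matched nothing, the literal pattern is left in the array
    [[ -e "${activeFiles[0]}" ]] && echo "${#activeFiles[@]}" || echo 0
}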

jkool702 (Owner, Author) commented:

@sourcery-ai review

sourcery-ai bot (Contributor) left a comment:

Hey @jkool702 - I've reviewed your changes - here's some feedback:

Overall Comments:

  • The dynamic worker spawning logic has been significantly improved to better handle varying workloads by considering both CPU usage and queue depth metrics. This should provide more efficient parallelization.
  • Breaking change: The -n flag has been renamed to -K for output ordering info. The new -n flag now limits total lines processed, consistent with Unix conventions like head -n.
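For example (per the changelog quoted later in this review), the renamed flags behave roughly like this:

# new -n: limit how many inputs forkrun processes (roughly `head -n` piped in front)
printf '%s\n' {1..100} | forkrun -n 10 echo
printf '%s\n' {1..100} | head -n 10 | forkrun echo   # roughly equivalent
# -K: the old -n behavior (adds ordering info to the output and implies -k)
printf '%s\n' {1..100} | forkrun -K echo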
Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good


`forkrun`'s usage and syntax are very similar to `xargs` and `parallel`. However, <sup>1</sup>. `forkrun`:

* offers more features than `xargs` and is approximately equal in speed to the fastest `xargs` invocation<sup>2</sup>. (`forkrun` without any flags is functionally equivalent to `xargs -P $*(nproc) -d $'\n'`),
* is considerably faster than `parallel` (over an order of magnitude faster in some cases) <sup>2</sup>, while still supporting many of the particularly useful "core" `parallel` features,
sourcery-ai bot (Contributor):

issue (typo): Typo: "equivilant" should be "equivalent"

**forkrun v1.5**: major changes include:
1. the logic by which coprocs are dynamically spawned has been completely rewritten to improve performance and reliability when handling varying workloads.
2. all numeric commandline arguments now accept standard prefixes (k=1000, ki=1024, M=1000000, etc.)
3. the lseek loadable builtin has been recompiled and now can be used on x86_64 and aarch64 and riscv64 architectures
sourcery-ai bot (Contributor):

suggestion (typo): Grammar: Use commas for better readability

"x86_64 and aarch64 and riscv64" could be improved to "x86_64, aarch64, and riscv64".

Suggested change
3. the lseek loadable builtin has been recompiled and now can be used on x86_64 and aarch64 and riscv64 architectures
3. the lseek loadable builtin has been recompiled and now can be used on x86_64, aarch64, and riscv64 architectures

@@ -8,7 +8,7 @@ forkrun() {
#
# USAGE: printf '%s\n' "${args[@]}" | forkrun [-flags] [--] parFunc ["${args0[@]}"]
#
# LIST OF FLAGS: [-j|-P [-]<#>[,<#>,<#>]] [-t <path>] ( [-l <#>] | [-L <#[,#]>]] ) ( [-b <#>] | [-B <#>[,<#>]] ) [-d <char>] [-u <fd>] [-i] [-I] [-k] [-n] [-z|-0] [-s] [-S] [-p] [-D] [-N] [-U] [-v] [-h|-?]
# LIST OF FLAGS: [-j|-P [-]<#>[,<#>,<#>]] [-t <path>] ( [-l <#>] | [-L <#[,#]>]] ) [-n <#>] ( [-b <#>] | [-B <#>[,<#>]] ) [-d <char>] [-u <fd>] [-i] [-I] [-k] [-K] [-z|-0] [-s] [-S] [-p] [-D] [-N] [-U] [-v] [-h|-?]
sourcery-ai bot (Contributor):

issue (complexity): Consider extracting the worker spawn count calculation into a separate function to simplify the main loop and improve readability.

The dynamic worker spawning logic could be simplified by extracting the spawn decision into a separate function. This would improve maintainability while preserving the core functionality. Here's a suggested refactoring:

_forkrun_calc_spawn_count() {
    local -i pLoad=$1 pLoadMax=$2 kkProcs=$3 nProcsMax=$4 pAdd

    # Calculate base number of workers to add based on system load
    pAdd=$(( ( pLoadMax - pLoad ) / pLoad1 ))

    # Apply limits
    (( pAdd < 1 )) && pAdd=0
    (( pAdd > (nProcsMax - kkProcs) )) && pAdd=$(( nProcsMax - kkProcs ))

    # Scale down as we approach max workers
    pAdd=$(( ( ( 4 * nProcsMax ) - ( 3 * kkProcs ) ) * pAdd / ( ( 8 * nProcsMax ) + ( 3 * kkProcs ) ) ))

    echo "$pAdd"
}

Then simplify the main loop:

while ! [[ -f "${tmpDir}"/.quit ]] && (( kkProcs < nProcsMax )); do
    # Get current load metrics
    mapfile -t pLOADA < <(_forkrun_get_load "${pLOADA0[@]}")

    # Skip if system load too high
    (( pLOADA > pLoad_max )) && continue

    # Calculate number of workers to add
    pAdd=$(_forkrun_calc_spawn_count "${pLOADA}" "${pLoad_max}" "${kkProcs}" "${nProcsMax}")

    # Spawn new workers if needed
    if (( pAdd > 0 )); then
        spawn_workers "${pAdd}" "${coprocSrcCode}" 
        kkProcs+=pAdd
        echo "${kkProcs}" >"${tmpDir}"/.nWorkers
    fi
done

This refactoring:

  1. Extracts spawn count calculation into a focused function
  2. Simplifies the main loop flow
  3. Makes the code more maintainable
  4. Preserves the core dynamic spawning behavior

jkool702 (Owner, Author) commented Jan 29, 2025:

So, let me explain a bit about why the "spawning new worker coprocs" decision tree is so complex.

The decision is really just based on 2 main ideas:

  1. if "time for N lines to arrive on stdin" is equal to or less than "time for a coproc to process N lines * number of worker coprocs" then dont spawn more workers, since trhey will just sit idle
  2. spawn workers in groups until you hit the max worker count or hit maximum achievable system load. determine group size based on an estimate of "load added per worker" and "distance betweeen current load and max attainable load"

complications with #1 are that the time to process N lines depends on the batch size (e.g., calling sha1sum once and passing it 1000 filenames is much faster than calling it 1000 times and passing it 1 filename each time). So, the "average time to process N lines" can only use timing data for a specific batch size of N lines.

complications with #2 are that the "maximum attainable load" at the maximum coproc worker count will not always be near 100%. For example, calling

printf '1%.0s\n' {1..1000} | forkrun -j 100 sleep

on my system only took up, on average, 10% of a single cpu to run.

So, forkrun dynamically changes the "target max load" parameter. In particular, if spawning another group of coprocs doesn't actually increase system load, the target max load is lowered considerably. If the target max load was previously lowered and spawning a group of coprocs once again does increase total system load, the target max load is slightly raised.
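To make that "target max load" feedback concrete, here is a rough sketch of the behavior just described; the variable names and the exact adjustment factors are illustrative, not the actual code in forkrun.bash:

# loadPrev / loadCur: measured system load before / after spawning the last group
# pLoadTarget: current "target max load"; targetWasLowered: flag from a previous reduction
if (( loadCur <= loadPrev )); then
    # spawning more coprocs did not raise load --> lower the target considerably
    pLoadTarget=$(( ( 3 * loadCur + pLoadTarget ) / 4 ))
    targetWasLowered=1
elif (( targetWasLowered == 1 )); then
    # load is rising again after a previous reduction --> raise the target slightly
    pLoadTarget=$(( pLoadTarget + ( pLoadTarget / 16 ) ))
fi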

Do you have any suggestions for simplifying the code while still addressing these complications?

sourcery-ai bot (Contributor):

Thanks for explaining the complexities of the worker spawning logic. I understand the challenges involved in estimating processing time and managing system load dynamically. My initial suggestion for extracting the spawn count calculation was aimed at improving readability and maintainability, but I see now that it doesn't fully address the dynamic nature of the pLoad_max adjustments.

Given the intricacies you've described, a simple extraction might not be sufficient. Instead of a single function, consider a more structured approach: encapsulate the entire worker spawning and load management logic within a dedicated class or module. This would allow you to:

  1. Abstract the complexity: Hide the internal workings of load calculation, pLoad_max adjustments, and worker spawning within the class/module. This will make the main loop cleaner and easier to follow.

  2. Improve testability: A separate class/module can be tested independently, ensuring the complex logic works correctly under various scenarios. You can mock system load and timing data to simulate different conditions.

  3. Manage state effectively: The class/module can maintain internal state for pLoad_max, previous load measurements, and other relevant variables, making the logic more self-contained and less reliant on global variables.

  4. Introduce more sophisticated algorithms: With a dedicated structure, you can potentially implement more advanced load balancing and worker spawning algorithms without cluttering the main loop. For example, you could explore using a feedback loop based on actual processing times to dynamically adjust the number of workers.

Here's a conceptual outline of how a class-based approach might look (using pseudo-code):

class WorkerSpawner:
    def __init__(self, max_workers, initial_load_target):
        # Initialize internal state

    def update_load_target(self, current_load, num_workers):
        # Implement logic to adjust pLoad_max

    def calculate_spawn_count(self, current_load):
        # Calculate number of workers to spawn based on current load and pLoad_max

    def spawn_workers(self, count):
        # Spawn the specified number of workers

# In the main loop:
spawner = WorkerSpawner(max_workers, initial_load_target)

while ...:
    current_load = get_current_load()
    spawner.update_load_target(current_load, current_worker_count)
    spawn_count = spawner.calculate_spawn_count(current_load)
    spawner.spawn_workers(spawn_count)
    ...

This approach would require more refactoring, but it would significantly improve the organization, readability, and testability of the dynamic worker spawning logic. Let me know what you think of this approach, and I'm happy to discuss specific implementation details further.

jkool702 (Owner, Author):

Remember that this is written in bash, not in C. There aren't native class definitions. That said...

encapsulate the entire worker spawning and load management logic within a dedicated class or module

It sort of already is. It isn't a "class" or a "module", but the entirety of the code for spawning workers is in its own coproc (pQueue), which runs in parallel with the main loop and which is only spawned when dynamic coproc spawning is enabled (via a cmdline option).

Regarding "introducing more sophisticated algorithms": the example i gave about running sleep with forkrun made me realize that I shoukd probably keep track of time to run N lines / num workers (i.e., the average time between 2 coprocs reading new data) and make sure that every time new workers are spawned that ratio is actually decreasing. This is tied to "how close we are to full system load" for cpu-bound tasks, but not necessarily for I/O bound tasks.

sourcery-ai bot (Contributor):

You're right, my apologies! I got carried away with the class analogy, forgetting we're working within bash's constraints. Thanks for reminding me. Using a separate coproc (pQueue) is a clever way to encapsulate the worker spawning logic within bash.

Your idea of tracking time to run N lines / num workers and ensuring it decreases when spawning new workers is excellent. This directly addresses the issue of I/O-bound tasks not necessarily reflecting load in CPU usage. It adds a more direct performance measurement into the feedback loop.

To integrate this new metric, you could modify the pQueue coproc to:

  1. Measure and store: Record the time taken to process N lines before and after spawning a new group of workers.
  2. Calculate the ratio: Compute time to run N lines / num workers for both measurements.
  3. Adjust spawning behavior: If the ratio hasn't decreased (or hasn't decreased sufficiently), adjust the spawning strategy. This might involve reducing the number of workers in the next group, pausing spawning for a longer period, or even killing some existing workers if the ratio has increased significantly. You could also use this information to dynamically adjust pLoad_max as you mentioned earlier.

Since the pQueue coproc already handles the spawning logic, adding this timing-based feedback mechanism within it would be a natural extension. It keeps the related logic together and avoids cluttering the main loop.

Would you like me to help you brainstorm specific bash commands or logic for implementing this timing and ratio calculation within the pQueue coproc? I can also offer suggestions on how to best integrate this new feedback mechanism into your existing spawning logic.

jkool702 (Owner, Author):

I think I'll implement this by replacing some of the existing scaling/reduction code for pAdd with:

# t0 is avg time to run N lines with k0 worker coprocs
# t1 is avg time to run N lines with k1 worker coprocs
# k1 > k0

pAdd=$(( ( pAdd * ( t0 * k1 - t1 * k0 ) ) / ( t0 * ( k1 - k0 ) ) ))
(( pAdd < 0 )) && pAdd=0

This makes it so that if adding more workers didn't increase avg worker runtime (t1 == t0) then there is no reduction in pAdd. If time increased proportionally to the number of new coprocs (k1 - k0), meaning that the "time to run N lines / num workers" ratio was the same with k0 and k1 workers, then pAdd goes down to 0 and no new workers get added.

Note: this is equivalent to

# r0 = t0 / k0
# r1 = t1 / k1

pAdd=$(( pAdd * ( 1 - r1/r0 ) / ( 1 - k0 / k1 ) ))

# k0/k1 is effectively r1/r0 in ideal circumstances where increasing worker count does not change average runtime per worker
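A quick worked example with made-up numbers shows how the formula scales pAdd back:

# Hypothetical numbers: with k0=4 workers a batch averaged t0=100 (us);
# after growing to k1=8 workers the average rose to t1=150; pAdd was 4.
pAdd=4; t0=100; t1=150; k0=4; k1=8
pAdd=$(( ( pAdd * ( t0 * k1 - t1 * k0 ) ) / ( t0 * ( k1 - k0 ) ) ))
echo "${pAdd}"   # -> 2: runtime grew, but less than proportionally,
                 #       so only half as many new workers get spawned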

sourcery-ai bot (Contributor):

That's a clever approach! Using the ratio of execution times to directly adjust pAdd is elegant and avoids the need for separate state variables. Your formula effectively captures the desired behavior: no reduction if adding workers doesn't impact runtime, and complete halting of spawning if the ratio remains constant.

A few minor suggestions:

  1. Clarity: Add a comment explaining the logic behind the formula. While the derivation is clear to you now, it might be less obvious to someone else (or even to you in the future). Something like:
# Adjust pAdd based on the change in per-worker runtime.
# If adding workers doesn't improve throughput (t1*k0 >= t0*k1), reduce pAdd proportionally.
pAdd=$(( ( pAdd * ( t0 * k1 - t1 * k0 ) ) / ( t0 * ( k1 - k0 ) ) ))
  2. Division by zero: While unlikely, t0 or (k1 - k0) could theoretically be zero. Adding a check to prevent division by zero would make the code more robust:
if (( t0 != 0 && k1 != k0 )); then
    pAdd=$(( ( pAdd * ( t0 * k1 - t1 * k0 ) ) / ( t0 * ( k1 - k0 ) ) ))
fi
  3. Overflow: For very large values of pAdd, t0, or k1, the intermediate calculations could potentially overflow. Consider using bc for arbitrary precision arithmetic if this is a concern. This might add a slight performance overhead, but it would guarantee correctness for extreme cases. For example:
pAdd=$(echo "scale=0; ($pAdd * ($t0 * $k1 - $t1 * $k0)) / ($t0 * ($k1 - $k0))" | bc)
  4. Testing: With this new logic, it's crucial to test thoroughly. Try to simulate scenarios where:
    • Adding workers improves throughput significantly.
    • Adding workers has little or no impact.
    • Adding workers actually degrades performance (due to excessive overhead).

By testing these edge cases, you can ensure the formula behaves as expected and fine-tune the parameters if necessary.

Overall, this looks like a promising improvement. Let me know if you have any further questions or would like to discuss testing strategies.

jkool702 (Owner, Author):

Regarding your suggestions:

  1. Agreed. I'll add comments explaining the logic.

  2. Division by 0 should be impossible here. k1 is guaranteed to be more than k0 (else that loop iteration would have already continued and skipped this calculation). t0 is measured in microseconds (the time difference between 2 $EPOCHREALTIME calls), and just checking the time takes more than a microsecond. But a quick check that t0 is nonzero couldn't hurt.

  3. I don't think I have to worry too much about overflow, so long as this isn't run on a supercomputer where k0 might be a million. For running on a typical PC I don't see pAdd nor k0/k1 being more than 100 or so, and for that to overflow the average runtime t0 would have to be longer than a human lifetime.

I've worked hard to minimize forkrun's dependency list and I'd rather not add bc to it. I think it is more likely that someone without bc will want to run forkrun than that someone will run forkrun on a supercomputer.

  4. Agreed. I'll run it through some real-world tests.
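Putting points 1 and 2 together, the guarded update might end up looking something like this sketch (same variable names as the snippets above):

# Adjust pAdd based on the change in per-worker runtime:
#   t1 == t0               --> no reduction (extra workers were effectively free)
#   t1 grew proportionally --> pAdd drops to 0 (extra workers added nothing)
if (( t0 != 0 && k1 > k0 )); then
    pAdd=$(( ( pAdd * ( t0 * k1 - t1 * k0 ) ) / ( t0 * ( k1 - k0 ) ) ))
    (( pAdd < 0 )) && pAdd=0
fi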

jkool702 and others added 2 commits January 29, 2025 11:41
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>