Address Sort<T, TComparer> extensions performance #39543

Closed
wants to merge 13 commits into from

nietras
Contributor

@nietras nietras commented Jul 17, 2020

#39466
@jkotas first draft. Please take a look and let me know if this looks like it is on the right path.

TODO:

  • Fix ArraySortHelper.Mono.cs

@Dotnet-GitSync-Bot
Collaborator

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@nietras
Contributor Author

nietras commented Jul 17, 2020

@jkotas I'm struggling with getting a good dev loop here, resorting to running:

D:\oss\runtime [sorting-tcomparer ≡]> ./build.cmd -subset Clr.CoreLib

on the command line. And VS won't open the System.Private.CoreLib project in System.Private.CoreLib.sln, any pointers on this? Couldn't find much on CoreLib in docs that helped me.

@jkotas
Member

jkotas commented Jul 17, 2020

VS won't open the System.Private.CoreLib project

It works for me. Do you have the most recent VS update?

@ghost

ghost commented Jul 17, 2020

Tagging subscribers to this area: @eiriktsarpalis
Notify danmosemsft if you want to be subscribed.

@nietras
Contributor Author

nietras commented Jul 18, 2020

It works for me. Do you have the most recent VS update?

Version 16.6.4 which should be latest. But I guess I need a preview of preview.8 it seems:

D:\oss\runtime\src\coreclr\src\System.Private.CoreLib\System.Private.CoreLib.csproj : error  : The project file cannot be opened by the project system, because it is missing some critical imports or the referenced SDK cannot be found.

Detailed Information:
Unable to locate the .NET Core SDK. Check that it is installed and that the version specified in global.json (if any) matches the installed version.

Project "C:\System.Private.CoreLib\src\System.Private.CoreLib.Shared.projitems" was not imported by "D:\oss\runtime\src\coreclr\src\System.Private.CoreLib\System.Private.CoreLib.csproj" at (332,3), due to the file not existing.

Project "D:\oss\runtime\src\coreclr\src\System.Private.CoreLib\codeOptimization.targets" was not imported by "D:\oss\runtime\src\coreclr\src\System.Private.CoreLib\System.Private.CoreLib.csproj" at (346,3), due to the file not existing.

global.json says 5.0.100-preview.8.20362.3

@nietras
Contributor Author

nietras commented Jul 18, 2020

Will try latest version from https://github.com/dotnet/installer#installers-and-binaries e.g. dotnet-sdk-5.0.100-rc.1.20367.2-win-x64.exe

Seems to work! 👍

nietras added 2 commits July 18, 2020 20:46
…ySortHelper for TKey,TValue scenario, should provide speedup for this.
@jkotas
Member

jkotas commented Jul 18, 2020

I guess I need a preview of preview.8 it seems:

Yes, it is required to make VS work well. We have it mentioned here https://github.com/dotnet/runtime/blob/master/docs/workflow/requirements/windows-requirements.md#net-sdk

{
// Add a try block here to detect IComparers (or their
// underlying IComparables, etc) that are bogus.
try
{
comparer ??= Comparer<T>.Default;
IntrospectiveSort(keys, comparer.Compare);
if (comparer is null)
Member

This will create two instantiations of the sorting code: One on Comparer<T> and second on TComparer.

Can the null check be pushed out to the callers to avoid the duplication where possible?

Contributor Author

@jkotas what if we put !typeof(TComparer).IsValueType && comparer is null?

Member

@jkotas jkotas Jul 19, 2020

comparer is null is JITed into a constant already for structs (that are not Nullable<T>). I do not think this would help.

Contributor Author

Yeah, I thought so. Perhaps I don't fully understand your concern here, then: only a reference-type TComparer should be an issue, but that should have a canonical instantiation, or? Would you mind expanding?

Member

For example, when this is called from here:

 https://github.com/dotnet/runtime/pull/39543/files#diff-d4e4a789c4e124d267dde2cf6505da8eR1760

comparer will be null and so we will always take the first branch (as long as this is the only Sort use for the given T). The JIT or AOT won't be able to figure it out. They will create both generic instantiations of the sorting algorithm.

a canonical instantiation

Yes, instantiations over reference types share code, but there is still duplication of the type system structures.

Contributor Author

there is still duplication of the type system structures.

Right, of course :) Just ball-parking here, but would the following then not ensure we only have a single type system structure for reference type TComparer?

                if (typeof(TComparer).IsValueType)
                {
                    ComparerArraySortHelper<TKey, TValue, TComparer>
                        .IntrospectiveSort(keys, values, comparer);
                }
                else 
                {
                    IComparer<TKey> referenceComparer = comparer ?? Comparer<TKey>.Default;
                    ComparerArraySortHelper<TKey, TValue, IComparer<TKey>>
                        .IntrospectiveSort(keys, values, referenceComparer);
                }
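
The dispatch idea above can be tried in isolation. What follows is only a minimal sketch, not CoreLib code: `SortHelper<T, TComparer>`, `Sorter`, and `Descending` are hypothetical names, and an insertion sort stands in for the introspective sort to keep it short. It resolves the null check in the caller, so struct comparers keep their own instantiation while every reference-type comparer funnels into the one `SortHelper<T, IComparer<T>>` instantiation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the sort helper under discussion (not CoreLib code).
// Insertion sort keeps the sketch short; the dispatch pattern is the point.
static class SortHelper<T, TComparer> where TComparer : IComparer<T>
{
    public static void Sort(Span<T> keys, TComparer comparer)
    {
        for (int i = 1; i < keys.Length; i++)
        {
            T item = keys[i];
            int j = i - 1;
            // Shift larger elements right until the slot for item is found.
            while (j >= 0 && comparer.Compare(keys[j], item) > 0)
            {
                keys[j + 1] = keys[j];
                j--;
            }
            keys[j + 1] = item;
        }
    }
}

static class Sorter
{
    public static void Sort<T, TComparer>(Span<T> keys, TComparer comparer)
        where TComparer : IComparer<T>
    {
        if (typeof(TComparer).IsValueType)
        {
            // A struct comparer can never be null; keep the specialized instantiation.
            SortHelper<T, TComparer>.Sort(keys, comparer);
        }
        else
        {
            // Resolve null here so all reference-type comparers share
            // the single SortHelper<T, IComparer<T>> instantiation.
            IComparer<T> referenceComparer = comparer;
            referenceComparer ??= Comparer<T>.Default;
            SortHelper<T, IComparer<T>>.Sort(keys, referenceComparer);
        }
    }
}

// Hypothetical struct comparer used only for the demo.
readonly struct Descending : IComparer<int>
{
    public int Compare(int x, int y) => y.CompareTo(x);
}

static class Program
{
    static void Main()
    {
        int[] a = { 3, 1, 2 };
        Sorter.Sort<int, Descending>(a, new Descending());
        Console.WriteLine(string.Join(",", a)); // 3,2,1

        int[] b = { 3, 1, 2 };
        Sorter.Sort<int, IComparer<int>>(b, null); // null falls back to Comparer<int>.Default
        Console.WriteLine(string.Join(",", b)); // 1,2,3
    }
}
```

Whether this removes the duplicated type system structures raised in the review is exactly the open question in this thread; the sketch only shows the shape of the code.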

@jkotas
Member

jkotas commented Jul 18, 2020

Did you have a chance to get some performance numbers? I am curious what the perf is going to look like.

@nietras
Contributor Author

nietras commented Jul 19, 2020

Did you have a chance to get some performance numbers? I am curious what the perf is going to look like.

Not yet, wanted to finalize impl first. But on that note do you know which Benchmarks Stephen Toub used and where I can find these? Doesn't look like dotnet/performance. We can start with the latter, though, perhaps.

@stephentoub
Member

(I used the *sort* tests from dotnet/performance, as well as others like the ones listed in #37941 (comment), and also yours ;-))

@nietras
Contributor Author

nietras commented Jul 19, 2020

@jkotas I can't get:

.\build.cmd -c release

to compile and hence I cannot get a CoreRun.exe for benchmarking. I get errors like:

D:\oss\runtime\src\libraries\shims\ApiCompat.proj(93,5): error : TypesMustExist : Type 'System.WindowsRuntimeSystemExtensions' does not exist in the implementation but it does exist in the contract.
D:\oss\runtime\src\libraries\shims\ApiCompat.proj(93,5): error : TypesMustExist : Type 'System.IO.WindowsRuntimeStorageExtensions' does not exist in the implementation but it does exist in the contract.
D:\oss\runtime\src\libraries\shims\ApiCompat.proj(93,5): error : TypesMustExist : Type 'System.IO.WindowsRuntimeStreamExtensions' does not exist in the implementation but it does exist in the contract.
D:\oss\runtime\src\libraries\shims\ApiCompat.proj(93,5): error : TypesMustExist : Type 'System.Runtime.InteropServices.WindowsRuntime.AsyncInfo' does not exist in the implementation but it does exist in the contract.

I think I have seen this before, but for the life of me can't remember or find what to do? 😅

@nietras
Contributor Author

nietras commented Jul 19, 2020

I can build the same code without my changes fine e.g. 2a1595e but for this PR and branch it fails with above errors and:

error : ApiCompat failed comparing netstandard to netcoreapp

@nietras
Contributor Author

nietras commented Jul 19, 2020

Just taking notes as I'm trying to find a solution. See https://github.com/dotnet/runtime/blob/master/docs/coding-guidelines/updating-ref-source.md which says one could:

dotnet build /p:RunApiCompat=false

@nietras
Contributor Author

nietras commented Jul 19, 2020

./build.cmd -clean
./build.cmd -c Release

🤦‍♂️

Build succeeded.
    0 Warning(s)
    0 Error(s)

😁

D:\oss\dotnet-performance\src\benchmarks\micro [master ≡]> dotnet run -c Release -f netcoreapp5.0 --filter *.Sort*.Array* --statisticalTest 3ms --coreRun "D:\oss\runtime-m\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe" "D:\oss\runtime-pr\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"

🤞

@nietras
Contributor Author

nietras commented Jul 19, 2020

Well first benchmark run is a bust. Something isn't right. 🤔 runtime-m is master. runtime-pr is this PR.

Int32

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.388 (2004/?/20H1)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100-rc.1.20367.2
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.36102, CoreFX 5.0.20.36102), X64 RyuJIT
  Job-YBDUGJ : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
  Job-TJRYUK : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  InvocationCount=5000  
IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15  
UnrollFactor=1  WarmupCount=1  
| Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(3ms) | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Array | Job-YBDUGJ | \runtime-m | 512 | 3.806 μs | 0.8246 μs | 0.8099 μs | 3.480 μs | 3.111 μs | 5.593 μs | 1.00 | Base | 0.00 | - | - | - | - |
| Array | Job-TJRYUK | \runtime-pr | 512 | 6.625 μs | 1.2828 μs | 1.3726 μs | 6.186 μs | 3.979 μs | 9.961 μs | 1.79 | Same | 0.36 | - | - | - | - |
| Array_ComparerClass | Job-YBDUGJ | \runtime-m | 512 | 18.755 μs | 0.4069 μs | 0.4354 μs | 18.620 μs | 18.301 μs | 19.691 μs | 1.00 | Base | 0.00 | - | - | - | 64 B |
| Array_ComparerClass | Job-TJRYUK | \runtime-pr | 512 | 20.257 μs | 0.1806 μs | 0.1601 μs | 20.274 μs | 20.019 μs | 20.554 μs | 1.08 | Same | 0.03 | - | - | - | - |
| Array_ComparerStruct | Job-YBDUGJ | \runtime-m | 512 | 22.307 μs | 0.2758 μs | 0.2445 μs | 22.366 μs | 21.706 μs | 22.589 μs | 1.00 | Base | 0.00 | - | - | - | 88 B |
| Array_ComparerStruct | Job-TJRYUK | \runtime-pr | 512 | 25.287 μs | 0.4659 μs | 0.4130 μs | 25.153 μs | 24.791 μs | 26.363 μs | 1.13 | Same | 0.02 | - | - | - | 24 B |
| Array_Comparison | Job-YBDUGJ | \runtime-m | 512 | 18.629 μs | 0.3121 μs | 0.2767 μs | 18.590 μs | 18.329 μs | 19.224 μs | 1.00 | Base | 0.00 | - | - | - | - |
| Array_Comparison | Job-TJRYUK | \runtime-pr | 512 | 18.249 μs | 0.2300 μs | 0.1796 μs | 18.249 μs | 17.992 μs | 18.547 μs | 0.98 | Same | 0.01 | - | - | - | - |

Look at the Array test with a massive regression, and we haven't changed that code.

IntClass

This appears more in line with expectations. Note that the Comparison path sees no regression in either case. 👍 The ComparerClass case perhaps warrants investigation. ComparerStruct shows allocated bytes, so something is off there too.

| Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(3ms) | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Array | Job-YBDUGJ | \runtime-m | 512 | 27.42 μs | 0.269 μs | 0.238 μs | 27.35 μs | 27.18 μs | 27.90 μs | 1.00 | Base | - | - | - | - |
| Array | Job-TJRYUK | \runtime-pr | 512 | 27.18 μs | 0.244 μs | 0.203 μs | 27.27 μs | 26.76 μs | 27.46 μs | 0.99 | Same | - | - | - | - |
| Array_ComparerClass | Job-YBDUGJ | \runtime-m | 512 | 36.44 μs | 0.233 μs | 0.207 μs | 36.49 μs | 36.16 μs | 36.79 μs | 1.00 | Base | - | - | - | 64 B |
| Array_ComparerClass | Job-TJRYUK | \runtime-pr | 512 | 40.47 μs | 0.315 μs | 0.279 μs | 40.42 μs | 40.06 μs | 41.03 μs | 1.11 | Same | - | - | - | - |
| Array_ComparerStruct | Job-YBDUGJ | \runtime-m | 512 | 41.46 μs | 0.100 μs | 0.089 μs | 41.45 μs | 41.33 μs | 41.65 μs | 1.00 | Base | - | - | - | 88 B |
| Array_ComparerStruct | Job-TJRYUK | \runtime-pr | 512 | 41.34 μs | 0.226 μs | 0.200 μs | 41.38 μs | 40.95 μs | 41.65 μs | 1.00 | Same | - | - | - | 24 B |
| Array_Comparison | Job-YBDUGJ | \runtime-m | 512 | 37.32 μs | 0.456 μs | 0.426 μs | 37.09 μs | 36.90 μs | 38.23 μs | 1.00 | Base | - | - | - | - |
| Array_Comparison | Job-TJRYUK | \runtime-pr | 512 | 35.77 μs | 0.153 μs | 0.136 μs | 35.73 μs | 35.61 μs | 36.06 μs | 0.96 | Same | - | - | - | - |

results.zip

@nietras
Contributor Author

nietras commented Jul 20, 2020

I did a quick benchmark run with ETWProfiler for both m and pr, looking at the int case only, to try to see why it regresses so much even though the code hasn't changed. I can't see why from the profiles in PerfView. There is no notable difference.

m

dotnet run -c Release -f netcoreapp5.0 --filter *.Sort<int*.Array* --profiler ETW --coreRun "D:\oss\runtime-m\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"

image

pr

dotnet run -c Release -f netcoreapp5.0 --filter *.Sort<int*.Array* --profiler ETW --coreRun "D:\oss\runtime-pr\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"

image

Note: Followed https://adamsitnik.com/ETW-Profiler/ for this, since not fresh on my mind. :)

@nietras
Contributor Author

nietras commented Jul 20, 2020

Reran the benchmarks but with CoreRun.exe parameters reversed. This removes the difference between the two for comparable path. Seems to be quite a bit of noise in that measurement.

dotnet run -c Release -f netcoreapp5.0 --filter *.Sort<int*.Array* --coreRun "D:\oss\runtime-pr\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe" "D:\oss\runtime-m\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"

| Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Array | Job-WVHVEM | \runtime-m\ | 512 | 3.338 μs | 0.1089 μs | 0.1254 μs | 3.332 μs | 3.161 μs | 3.539 μs | 1.02 | 0.06 | - | - | - | - |
| Array | Job-TCYRLH | \runtime-pr\ | 512 | 3.281 μs | 0.1022 μs | 0.1136 μs | 3.254 μs | 3.071 μs | 3.489 μs | 1.00 | 0.00 | - | - | - | - |
| Array_ComparerClass | Job-WVHVEM | \runtime-m\ | 512 | 18.756 μs | 0.0653 μs | 0.0545 μs | 18.757 μs | 18.661 μs | 18.891 μs | 0.91 | 0.00 | - | - | - | 64 B |
| Array_ComparerClass | Job-TCYRLH | \runtime-pr\ | 512 | 20.696 μs | 0.0289 μs | 0.0241 μs | 20.696 μs | 20.655 μs | 20.728 μs | 1.00 | 0.00 | - | - | - | - |
| Array_ComparerStruct | Job-WVHVEM | \runtime-m\ | 512 | 23.301 μs | 0.1071 μs | 0.0950 μs | 23.286 μs | 23.156 μs | 23.481 μs | 0.87 | 0.00 | - | - | - | 88 B |
| Array_ComparerStruct | Job-TCYRLH | \runtime-pr\ | 512 | 26.943 μs | 0.0288 μs | 0.0240 μs | 26.946 μs | 26.902 μs | 26.988 μs | 1.00 | 0.00 | - | - | - | 24 B |
| Array_Comparison | Job-WVHVEM | \runtime-m\ | 512 | 18.795 μs | 0.0376 μs | 0.0314 μs | 18.787 μs | 18.758 μs | 18.849 μs | 1.00 | 0.01 | - | - | - | - |
| Array_Comparison | Job-TCYRLH | \runtime-pr\ | 512 | 18.815 μs | 0.2308 μs | 0.1928 μs | 18.724 μs | 18.682 μs | 19.345 μs | 1.00 | 0.00 | - | - | - | - |

@nietras
Contributor Author

nietras commented Jul 20, 2020

@jkotas if we focus on the int scenario first, there are issues around ComparerClass and ComparerStruct. Any comments on that?

The supposed allocations on ComparerStruct I don't get. Nor the benchmark numbers. This should be faster than ComparerClass.

@nietras
Contributor Author

nietras commented Jul 20, 2020

Ha, I had a file length issue that kept me from loading the etl files in PerfView. Here is the ComparerStruct case. The problem is that the Compare call isn't getting inlined. There are also a number of other methods not getting inlined. This comparer is defined like below in dotnet/performance, which is problematic for reference types, but should work for value types. AggressiveInlining might help... still not sure what the allocs are about... WAIT, why is it using the canonical ComparerArraySortHelper?? 😅 Clearly I have a code issue somewhere... I'll try to track it down later.

        private readonly struct ComparableComparerStruct : IComparer<T>
        {
            public int Compare(T x, T y) => x.CompareTo(y);
        }

image

@nietras
Contributor Author

nietras commented Jul 20, 2020

@jkotas could the issue above be related to how the instance is created via below?

typeof(GenericArraySortHelper<string, Comparer<string>>).TypeHandle

Scratch that, the issue is that the benchmark uses the Array.Sort overloads, not the span ones. 🤦‍♂️

        [Benchmark]
        public void Array_ComparerStruct() => System.Array.Sort(_arrays[_iterationIndex++], 0, Size, new ComparableComparerStruct());

This needs to use the span-based API, of course. I guess I should add span-based API benchmarks.
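
The difference between the two call shapes can be seen side by side. A minimal sketch, assuming .NET 5 or later; `IntAscending` is a hypothetical comparer analogous to `ComparableComparerStruct`. `Array.Sort` takes its comparer as `IComparer<T>`, so a struct comparer is boxed at the call site, whereas `MemoryExtensions.Sort<T, TComparer>` keeps `TComparer` as a generic parameter. Whether the span overload then actually specializes the sort on the struct type depends on the runtime's implementation, which is what this PR is about.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct comparer, analogous to ComparableComparerStruct.
readonly struct IntAscending : IComparer<int>
{
    public int Compare(int x, int y) => x.CompareTo(y);
}

static class Program
{
    static void Main()
    {
        int[] viaArray = { 3, 1, 2 };
        // The parameter type is IComparer<int>, so the struct is boxed here.
        Array.Sort(viaArray, 0, viaArray.Length, new IntAscending());

        int[] viaSpan = { 3, 1, 2 };
        // TComparer is inferred as IntAscending; no boxing at the call site.
        viaSpan.AsSpan().Sort(new IntAscending());

        Console.WriteLine(string.Join(",", viaArray)); // 1,2,3
        Console.WriteLine(string.Join(",", viaSpan));  // 1,2,3
    }
}
```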

@nietras
Contributor Author

nietras commented Jul 20, 2020

PR Int32

There you go.

| Method | Size | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Span | 512 | 4.982 μs | 1.7252 μs | 1.9868 μs | 3.347 μs | 3.115 μs | 7.461 μs | - | - | - | - |
| Span_ComparerClass | 512 | 20.886 μs | 0.0362 μs | 0.0303 μs | 20.896 μs | 20.826 μs | 20.927 μs | - | - | - | - |
| Span_ComparerStruct | 512 | 4.070 μs | 0.4887 μs | 0.5628 μs | 3.693 μs | 3.419 μs | 4.750 μs | - | - | - | - |
| Span_Comparison | 512 | 19.440 μs | 0.0417 μs | 0.0370 μs | 19.443 μs | 19.360 μs | 19.503 μs | - | - | - | - |

@nietras
Contributor Author

nietras commented Jul 20, 2020

To sum up, running the Span based API micro benchmarks with proper support for TComparer is shown below. I don't think we should pay much notice to the plain Span test as this is too noisy. There is a fundamental issue with how the benchmarks are defined and using the same 5000 invocation count for all tests. I think I will refactor the tests to avoid this and to allow showing some extra perf scenarios.

Anyway, the main issue is that the ComparerClass case sees a minor regression. Not sure if this is due to perf differences between a delegate Comparison<T> call and a virtual method call. In micro benchmarks around this I didn't see differences that big, though. Any ideas?
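
For reference, the two call shapes being compared look roughly like this. A sketch with hypothetical names (`IntComparerClass`, `CallShapes`, `CountDescents...`); it shows only how a class comparer is either called through the interface or wrapped in a `Comparison<T>` delegate, and makes no claim about which is faster.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class comparer used only for the demo.
sealed class IntComparerClass : IComparer<int>
{
    public int Compare(int x, int y) => x.CompareTo(y);
}

static class CallShapes
{
    // Interface dispatch: each Compare goes through IComparer<int>.
    public static int CountDescentsViaInterface(ReadOnlySpan<int> values, IComparer<int> comparer)
    {
        int count = 0;
        for (int i = 1; i < values.Length; i++)
        {
            if (comparer.Compare(values[i - 1], values[i]) > 0) count++;
        }
        return count;
    }

    // Delegate dispatch: each comparison is a delegate invocation.
    public static int CountDescentsViaDelegate(ReadOnlySpan<int> values, Comparison<int> comparison)
    {
        int count = 0;
        for (int i = 1; i < values.Length; i++)
        {
            if (comparison(values[i - 1], values[i]) > 0) count++;
        }
        return count;
    }
}

static class Program
{
    static void Main()
    {
        int[] values = { 1, 3, 2, 5, 4 };
        IComparer<int> comparer = new IntComparerClass();

        // Converting the method group to a delegate allocates a delegate object.
        Comparison<int> comparison = comparer.Compare;

        Console.WriteLine(CallShapes.CountDescentsViaInterface(values, comparer));  // 2
        Console.WriteLine(CallShapes.CountDescentsViaDelegate(values, comparison)); // 2
    }
}
```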

Int32

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.388 (2004/?/20H1)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100-rc.1.20367.2
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.36102, CoreFX 5.0.20.36102), X64 RyuJIT
  Job-SQSCEM : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
  Job-GAEOPC : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  InvocationCount=5000  
IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15  
UnrollFactor=1  WarmupCount=1  
| Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Span | Job-SQSCEM | \runtime-m | 512 | 5.149 μs | 1.1501 μs | 1.3244 μs | 6.037 μs | 3.123 μs | 6.147 μs | 1.62 | 0.43 | - | - | - | - |
| Span | Job-GAEOPC | \runtime-pr | 512 | 3.190 μs | 0.0665 μs | 0.0766 μs | 3.198 μs | 3.056 μs | 3.308 μs | 1.00 | 0.00 | - | - | - | - |
| Span_ComparerClass | Job-SQSCEM | \runtime-m | 512 | 18.779 μs | 0.0545 μs | 0.0455 μs | 18.784 μs | 18.707 μs | 18.853 μs | 0.92 | 0.00 | - | - | - | 64 B |
| Span_ComparerClass | Job-GAEOPC | \runtime-pr | 512 | 20.330 μs | 0.0654 μs | 0.0546 μs | 20.325 μs | 20.252 μs | 20.436 μs | 1.00 | 0.00 | - | - | - | - |
| Span_ComparerStruct | Job-SQSCEM | \runtime-m | 512 | 22.569 μs | 0.0681 μs | 0.0569 μs | 22.549 μs | 22.484 μs | 22.690 μs | 5.76 | 0.89 | - | - | - | 88 B |
| Span_ComparerStruct | Job-GAEOPC | \runtime-pr | 512 | 3.798 μs | 0.5000 μs | 0.5758 μs | 3.450 μs | 3.363 μs | 4.791 μs | 1.00 | 0.00 | - | - | - | - |
| Span_Comparison | Job-SQSCEM | \runtime-m | 512 | 18.730 μs | 0.2270 μs | 0.1895 μs | 18.646 μs | 18.607 μs | 19.197 μs | 1.00 | 0.01 | - | - | - | - |
| Span_Comparison | Job-GAEOPC | \runtime-pr | 512 | 18.748 μs | 0.2070 μs | 0.1616 μs | 18.696 μs | 18.634 μs | 19.234 μs | 1.00 | 0.00 | - | - | - | - |

@nietras
Contributor Author

nietras commented Jul 20, 2020

@jkotas latest results via dotnet/performance#1400

results.zip

Two regressions remain:

  • ComparerClass - can be resolved by using ObjectComparisonComparer and still doing the delegate alloc as before. Personally not happy about the alloc but for now focus should be on getting TComparer support.
  • Comparison for BigStruct - this is a little worse and might require JIT changes. Haven't profiled it yet though. A small focused benchmark could perhaps illustrate the problem and an issue could be filed for that.

Int32

Why is the value type comparer scenario faster than simple Span, I have no idea yet.

| Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Span | Job-RSHRZO | \runtime-m\ | 512 | 5.531 μs | 1.1816 μs | 1.2643 μs | 6.175 μs | 3.084 μs | 6.234 μs | 1.00 | 0.00 | - | - | - | - |
| Span | Job-LPTXAZ | \runtime-pr\ | 512 | 5.515 μs | 0.8999 μs | 1.0363 μs | 5.951 μs | 3.103 μs | 5.984 μs | 1.00 | 0.14 | - | - | - | - |
| Span_ComparerClassGeneric | Job-RSHRZO | \runtime-m\ | 512 | 18.816 μs | 0.0340 μs | 0.0318 μs | 18.813 μs | 18.748 μs | 18.868 μs | 1.00 | 0.00 | - | - | - | 64 B |
| Span_ComparerClassGeneric | Job-LPTXAZ | \runtime-pr\ | 512 | 20.895 μs | 0.0191 μs | 0.0179 μs | 20.892 μs | 20.868 μs | 20.925 μs | 1.11 | 0.00 | - | - | - | - |
| Span_ComparerClassSpecific | Job-RSHRZO | \runtime-m\ | 512 | 18.747 μs | 0.0319 μs | 0.0298 μs | 18.740 μs | 18.690 μs | 18.802 μs | 1.00 | 0.00 | - | - | - | 64 B |
| Span_ComparerClassSpecific | Job-LPTXAZ | \runtime-pr\ | 512 | 20.795 μs | 0.0306 μs | 0.0271 μs | 20.788 μs | 20.761 μs | 20.864 μs | 1.11 | 0.00 | - | - | - | - |
| Span_ComparerStructGeneric | Job-RSHRZO | \runtime-m\ | 512 | 23.156 μs | 0.0439 μs | 0.0389 μs | 23.140 μs | 23.100 μs | 23.230 μs | 1.00 | 0.00 | - | - | - | 88 B |
| Span_ComparerStructGeneric | Job-LPTXAZ | \runtime-pr\ | 512 | 3.468 μs | 0.0691 μs | 0.0613 μs | 3.460 μs | 3.395 μs | 3.630 μs | 0.15 | 0.00 | - | - | - | - |
| Span_ComparerStructSpecific | Job-RSHRZO | \runtime-m\ | 512 | 23.198 μs | 0.0369 μs | 0.0345 μs | 23.191 μs | 23.134 μs | 23.275 μs | 1.00 | 0.00 | - | - | - | 88 B |
| Span_ComparerStructSpecific | Job-LPTXAZ | \runtime-pr\ | 512 | 3.510 μs | 0.0968 μs | 0.1036 μs | 3.481 μs | 3.372 μs | 3.766 μs | 0.15 | 0.00 | - | - | - | - |
| Span_Comparison | Job-RSHRZO | \runtime-m\ | 512 | 18.689 μs | 0.0456 μs | 0.0426 μs | 18.678 μs | 18.616 μs | 18.773 μs | 1.00 | 0.00 | - | - | - | - |
| Span_Comparison | Job-LPTXAZ | \runtime-pr\ | 512 | 18.796 μs | 0.0413 μs | 0.0386 μs | 18.800 μs | 18.729 μs | 18.861 μs | 1.01 | 0.00 | - | - | - | - |

BigStruct

| Method | Job | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Span | Job-BYCMUO | \runtime-m\ | 512 | 7.949 μs | 0.0817 μs | 0.0764 μs | 7.913 μs | 7.874 μs | 8.102 μs | 1.00 | - | - | - | - |
| Span | Job-SZWOVL | \runtime-pr\ | 512 | 7.885 μs | 0.0324 μs | 0.0271 μs | 7.892 μs | 7.827 μs | 7.923 μs | 0.99 | - | - | - | - |
| Span_ComparerClassGeneric | Job-BYCMUO | \runtime-m\ | 512 | 26.615 μs | 0.0404 μs | 0.0358 μs | 26.614 μs | 26.548 μs | 26.671 μs | 1.00 | - | - | - | 64 B |
| Span_ComparerClassGeneric | Job-SZWOVL | \runtime-pr\ | 512 | 28.723 μs | 0.0398 μs | 0.0373 μs | 28.731 μs | 28.655 μs | 28.796 μs | 1.08 | - | - | - | - |
| Span_ComparerClassSpecific | Job-BYCMUO | \runtime-m\ | 512 | 26.738 μs | 0.0658 μs | 0.0583 μs | 26.743 μs | 26.632 μs | 26.817 μs | 1.00 | - | - | - | 64 B |
| Span_ComparerClassSpecific | Job-SZWOVL | \runtime-pr\ | 512 | 28.367 μs | 0.0700 μs | 0.0655 μs | 28.361 μs | 28.259 μs | 28.515 μs | 1.06 | - | - | - | - |
| Span_ComparerStructGeneric | Job-BYCMUO | \runtime-m\ | 512 | 30.873 μs | 0.0998 μs | 0.0885 μs | 30.874 μs | 30.763 μs | 31.029 μs | 1.00 | - | - | - | 88 B |
| Span_ComparerStructGeneric | Job-SZWOVL | \runtime-pr\ | 512 | 12.050 μs | 0.0245 μs | 0.0229 μs | 12.046 μs | 12.016 μs | 12.101 μs | 0.39 | - | - | - | - |
| Span_ComparerStructSpecific | Job-BYCMUO | \runtime-m\ | 512 | 31.069 μs | 0.0912 μs | 0.0853 μs | 31.062 μs | 30.938 μs | 31.198 μs | 1.00 | - | - | - | 88 B |
| Span_ComparerStructSpecific | Job-SZWOVL | \runtime-pr\ | 512 | 11.358 μs | 0.0228 μs | 0.0213 μs | 11.362 μs | 11.318 μs | 11.387 μs | 0.37 | - | - | - | - |
| Span_Comparison | Job-BYCMUO | \runtime-m\ | 512 | 27.030 μs | 0.0514 μs | 0.0481 μs | 27.017 μs | 26.963 μs | 27.112 μs | 1.00 | - | - | - | - |
| Span_Comparison | Job-SZWOVL | \runtime-pr\ | 512 | 33.273 μs | 0.0448 μs | 0.0397 μs | 33.278 μs | 33.186 μs | 33.341 μs | 1.23 | - | - | - | - |

Command line

D:\oss\dotnet-performance\src\benchmarks\micro [sort-span ≡]> dotnet run -c Release -f netcoreapp5.0 --filter *.Sort*.Span* --coreRun "D:\oss\runtime-m\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe" "D:\oss\runtime-pr\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"

@stephentoub
Member

@nietras, are you still working on this? Thanks.

@nietras
Contributor Author

nietras commented Oct 23, 2020

@stephentoub I'm waiting for #39732 to be resolved 😅

@stephentoub
Member

Thanks. @AndyAyersMS, it seems like that's unlikely to be addressed in the foreseeable future?

@AndyAyersMS
Member

@CarolEidt is this one of the struct issues we've considered as part of .Net 6 planning?

@CarolEidt
Contributor

is this one of the struct issues we've considered as part of .Net 6 planning?

No, this is more related to optimization of structs, while the struct work that we've planned for .Net 6 (thus far) is focused primarily on completing the work to ensure that structs passed in registers don't needlessly get forced to the stack.

@ViktorHofer
Member

// Auto-generated message

69e114c which was merged 12/7 removed the intermediate src/coreclr/src/ folder. This PR needs to be updated as it touches files in that directory which causes conflicts.

To update your commits you can use this bash script: https://gist.github.com/ViktorHofer/6d24f62abdcddb518b4966ead5ef3783. Feel free to use the comment section of the gist to improve the script for others.

@stephentoub
Member

@AndyAyersMS, recommendations on how to proceed here then?

@AndyAyersMS
Member

I've pinged @sandreenko on the linked issue to assess for .Net 6. I'm hoping we can implement some forms of struct copy elimination.

@ghost ghost closed this Feb 24, 2021
@ghost

ghost commented Feb 24, 2021

Draft Pull Request was automatically closed for inactivity. It can be manually reopened in the next 30 days if the work resumes.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2021
@stephentoub
Member

@stephentoub I'm waiting for #39732 to be resolved 😅

@nietras, now that #39732 was addressed, want to have another go at this?

@dotnet dotnet unlocked this conversation Sep 19, 2023
@nietras
Contributor Author

nietras commented Sep 26, 2023

now that #39732 was addressed, want to have another go at this?

@stephentoub hi! Sorry, priorities have shifted since and focusing my spare time on other OSS efforts like https://github.com/nietras/Sep :) I do hope this gets added anyway 🙏

@ghost ghost locked as resolved and limited conversation to collaborators Oct 26, 2023
This pull request was closed.
8 participants