[SYSTEMDS-3669] Computation of Shapley Values #1946

Closed
wants to merge 35 commits into from

louislepage
Contributor

This PR is part of a master's thesis on computing Shapley values at scale with SystemDS.

@louislepage
Contributor Author

louislepage commented Dec 11, 2023

I rewrote the generic Shapley value computation with sampling and added an example script, as well as a Jupyter notebook in which I compared the results of the official SHAP package with those of my SystemDS implementation.

The results (at least for scaled data) look good. However, I found that the rbind() calls during preparation of the instances matrix are very slow and take the longest. I would therefore like to revisit this for further optimization, but I think I need some advice on how to make those appends/writes to large matrices faster.

Here is a quick plot of the computed values for the 107 features of the transform-encoded Adult dataset.
Both implementations used the full ~32,000 samples as background data and ran for 10,000 iterations for each sample.
image
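
For readers unfamiliar with the sampling approach mentioned above, the following is a minimal sketch of a Monte Carlo (permutation sampling) Shapley estimate for a single feature. It is written in plain Python for illustration and is not the SystemDS implementation from this PR; the function name `shapley_sampling`, its parameters, and the toy linear model are all assumptions made for the example.

```python
import random

def shapley_sampling(model, x, background, feature, n_iter=1000, seed=0):
    """Monte Carlo estimate of the Shapley value of `feature` for instance `x`.

    model:      callable taking a list of feature values, returning a float
    background: list of rows used to replace "absent" features
    """
    rng = random.Random(seed)
    n = len(x)
    total = 0.0
    for _ in range(n_iter):
        z = rng.choice(background)       # random background row
        perm = list(range(n))
        rng.shuffle(perm)                # random feature ordering
        pos = perm.index(feature)
        before = set(perm[:pos])         # features preceding `feature`
        # Features in `before` take their value from x, all others from z.
        with_f = [x[j] if (j in before or j == feature) else z[j] for j in range(n)]
        without_f = [x[j] if j in before else z[j] for j in range(n)]
        # Marginal contribution of `feature` in this sampled coalition.
        total += model(with_f) - model(without_f)
    return total / n_iter

# Toy usage with a hypothetical linear model and one all-zero background row.
weights = [2.0, -1.0, 0.5]
linear_model = lambda v: sum(w * vi for w, vi in zip(weights, v))
phi = [shapley_sampling(linear_model, [1.0, 2.0, 3.0], [[0.0, 0.0, 0.0]], f, n_iter=50)
       for f in range(3)]
```

For a linear model and a single background row, every sampled marginal contribution equals the weight times the difference between the instance value and the background value, so the estimate is exact regardless of the number of iterations. With a real model and a larger background set, the estimate only converges as the iteration count grows.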

@louislepage louislepage marked this pull request as ready for review December 11, 2023 07:48
@louislepage
Contributor Author

I was able to run it on 5,000 to 50,000 samples and, as expected, it scaled linearly.
However, this also showed that the Python implementation has a huge overhead for small sample sizes but scales at virtually the same rate.

image

rbind() is still the slowest single operation for very large sample sizes, but I was unable to write directly to rows of a pre-allocated matrix, which could be beneficial in this case.

Heavy hitter instructions:
  #  Instruction                    Time(s)  Count
  1  shapley_sampling                56.988      1
  2  shapley_sampling_prepare        29.756    107
  3  rbind                           22.870      2
  4  !                               10.097    107
  5  append                           6.878    219
  6  rand                             4.483    219
  7  leftIndex                        3.574    321
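
The trade-off between repeated appends and writing into a pre-allocated matrix, raised above, can be sketched in a language-agnostic way. The plain-Python example below is only an analogy under stated assumptions: nested lists stand in for matrices, and the helper names `build_by_append` and `build_preallocated` are hypothetical. It does not model how the SystemDS runtime executes rbind() or left indexing.

```python
def build_by_append(rows):
    """Grow the matrix one row at a time; every step copies all rows so far."""
    matrix = []
    for row in rows:
        matrix = matrix + [list(row)]    # creates a new outer list each time
    return matrix

def build_preallocated(rows, n_rows, n_cols):
    """Allocate the full matrix once, then overwrite one row per step."""
    matrix = [[0.0] * n_cols for _ in range(n_rows)]
    for i, row in enumerate(rows):
        matrix[i][:] = row               # in-place write, no reallocation
    return matrix

rows = [[float(i), float(i) * 2.0] for i in range(5)]
appended = build_by_append(rows)
preallocated = build_preallocated(rows, n_rows=5, n_cols=2)
```

Both helpers produce the same matrix. The append variant does work proportional to the current size at every step, so its total cost grows quadratically with the number of rows, while the pre-allocated variant does a constant amount of work per row.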

@Baunsgaard
Contributor

Hi @louislepage

Once you want a review, mark it in this PR.

As an initial comment, try to avoid including the run times or other CSV files, and modify your Python notebook to be scripts that run instead. Removing the notebook gives us clearer ways of comparing the performance and removes the need to start a notebook.

Once you are happy with the behavior of your Shapley operator it would be great if you move it from staging to a builtin function.
If it is unclear how to do this, comment on this PR, and we will help you out.

Thanks!

@louislepage
Contributor Author

Thanks @Baunsgaard for your feedback! :)

I rewrote the notebook as a Python script and removed the CSVs.
But I still have to fix an rbind issue in the Shapley computation in SystemDS.

I will let you know as soon as I am done with this.
Also, I think @christinadionysio told me she would look after this PR, if I understood her correctly, as she is supervising my thesis.

Anyway, I still have to do some work on this before it can be moved to a builtin function, so I better get to it! :D

@christinadionysio
Contributor

Thank you @louislepage for this contribution.

As we discussed in the last meeting, the code looks good so far.
Please add the license header to the test_runtimes.sh file.
Additionally, I created a Jira issue for your thesis, so you can replace [WIP] with [SYSTEMDS-3669].

@louislepage louislepage changed the title [WIP] Computation of Shapley Values [SYSTEMDS-3669] Computation of Shapley Values Feb 7, 2024
@j143 j143 added this to the systemds-3.2.0 milestone Feb 8, 2024
@louislepage louislepage force-pushed the shap-values branch 2 times, most recently from 25d8af5 to 8937054 Compare March 23, 2024 09:01

codecov bot commented Jul 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (main@505f871). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1946   +/-   ##
=======================================
  Coverage        ?   68.82%           
  Complexity      ?    40707           
=======================================
  Files           ?     1440           
  Lines           ?   161565           
  Branches        ?    31418           
=======================================
  Hits            ?   111200           
  Misses          ?    41303           
  Partials        ?     9062           


@louislepage louislepage force-pushed the shap-values branch 2 times, most recently from 0bc46f4 to 80d837e Compare July 25, 2024 05:00
@louislepage
Contributor Author

After finishing the thesis, I started moving the explainer to a builtin function as scripts/builtin/shapExplainer.dml.

For the final merge, all files under scripts/staging/shapley_values will be removed. The experiments are available as part of the reproducibility repository of the thesis.

Unit and component tests were added in test/functions/builtin/part2/BuiltinShapExplainerTest.java, which uses tests written in DML in test/scripts/functions/builtin/shapExplainerUnit.dml and test/scripts/functions/builtin/shapExplainerComponent.dml. The DML files also store the expected results for each test, and the results are compared within the Java test. This may be a bit unconventional, but it reduces the number of files and the amount of code, since otherwise all expected results would have to be written in Java and each unit test would need its own DML script. I hope this is fine.

Contributor

@christinadionysio christinadionysio left a comment

Thank you for your contribution @louislepage! Overall it looks really good, and I also like your reproducibility repository! I only left a few minor comments about formatting and tests.

# S Matrix holding the shapley values along the cols, one row per instance.
# expected Double holding the average prediction of all instances.
# -----------------------------------------------------------------------------
s_shapExplainer = function(String model_function, list[unknown] model_args, Matrix[Double] x_instances,
Contributor

Please reformat the function declarations so that every parameter introduced on a new line is indented by 4 spaces and the return value by 2 spaces, and move the opening bracket { to a new line. Please double-check all other functions in this file as well.

example = function(String x, Double y, ...
    Matrix[Double] z)
  return Double r
{
}


# create row indicator vector ctable
perm_mask_rows = seq(1,perm_cols)
#TODO: col-vector and matrix mult?
Contributor

Is this TODO still needed? If not, please remove it and also double-check the other occurrences in this script.
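
As background for the snippet under review, which prepares a row index vector for a contingency table, here is a rough plain-Python illustration of what a contingency table over (row index, column index) pairs yields. It assumes the second argument is a permutation of column indices, which is only a guess about the surrounding script; the helper `ctable` and the sample permutation are hypothetical and not taken from the PR.

```python
def ctable(row_idx, col_idx):
    """Contingency table: cell (r, c) counts how often the pair (r, c) occurs.

    Indices are 1-based, mirroring the seq(1, n) convention in the snippet.
    """
    n_rows, n_cols = max(row_idx), max(col_idx)
    table = [[0] * n_cols for _ in range(n_rows)]
    for r, c in zip(row_idx, col_idx):
        table[r - 1][c - 1] += 1
    return table

perm = [3, 1, 2]                         # hypothetical permutation of 3 columns
rows = list(range(1, len(perm) + 1))     # analogue of seq(1, perm_cols)
selection = ctable(rows, perm)
# selection == [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

When each row index appears exactly once and the column indices form a permutation, the result is a one-hot selection matrix with a single 1 per row and per column.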


@Test
public void testPrepareMaskForPermutation() {
runShapExplainerUnitTest("prepare_mask_for_permutation");
Contributor

We use these tests to test the whole functionality of the scripts (more like your component test). The test cases should cover different input parameters and edge cases. It would be great if you could modify the tests in that regard to keep them consistent with our other tests. Please add additional component tests for the different modes (HYBRID, SPARK).

@mboehm7
Contributor

mboehm7 commented Oct 27, 2024

LGTM - thanks @louislepage for this great work. I will now merge it in, despite the remaining minor comments, in order to move this along and facilitate follow-up work. During the merge, I fixed some rebase conflicts, fixed the formatting of the Java tests (tabs over spaces), eliminated remaining warnings, and added a FIXME for the unnecessary zero padding in the test. Thanks.

@mboehm7 mboehm7 closed this in b651db5 Oct 27, 2024