Feature Optional backends functionality #538

Conversation
Noting down a few thoughts:

w.r.t.

Do you mean having a tensorflow bucket and a tensorflow-probability bucket that also has tensorflow, or just having a single tensorflow bucket which includes tensorflow-probability? I think I'm generally in favour of the former.

agreed!
@@ -11,7 +47,14 @@
     "VAE",
     "VAEGMM",
     "resnet",
+    "scale_by_instance",
Just to double check; is `scale_by_instance` promoted from `tensorflow.resnet` to `tensorflow` so that we can do the `import_optional` check in `__init__`? This makes sense I think, although a minor point is that we would ordinarily want to raise a `DeprecationWarning` if the old import usage `from alibi_detect.models.tensorflow.resnet import scale_by_instance` is used. Perhaps not important in this case, as `scale_by_instance` is not used a lot...
Yeah, that's exactly correct. I'll add the relevant `DeprecationWarning`s.
see this related comment!
Leaving this one "unresolved" for visibility...
@@ -16,5 +40,6 @@
     "predict_batch_transformer",
     "get_device",
     "quantile",
-    "zero_diag"
+    "zero_diag",
+    "TorchDataset"
General point here: where we have promoted (public!) imports up to a higher submodule (or just moved things around entirely), IMO we at a minimum need to document the changes properly in the release notes. Better yet, we should raise `DeprecationWarning`s if the old imports are used. An example: the old `from alibi_detect.utils.saving import save_detector` has been moved to `from alibi_detect.saving import save_detector`; however, calling the former will still import `save_detector` and raise a `DeprecationWarning`.
@ascillitoe and I discussed options. The best way of providing deprecation warnings is to move the functionality in `alibi_detect.utils.pytorch.misc` to `alibi_detect.utils.pytorch._misc`. We then update the `__init__.py` file to import from `_misc.py`:

get_device, quantile, zero_diag = import_optional(
    'alibi_detect.utils.pytorch._misc',
    names=['get_device', 'quantile', 'zero_diag']
)

In the `misc.py` file we replace the relevant functions with a deprecation warning, similar to the config work. I think this makes sense, but the result would be inconsistencies in what is private and what is public in a wider sense throughout detect. I could try to make this consistent, but this would mean renaming lots of files with the underscore...
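For illustration, a minimal sketch of what such a `misc.py` shim could look like (hypothetical, assuming the `_misc.py` layout above; the actual implementation may differ):

```python
# alibi_detect/utils/pytorch/misc.py -- hypothetical deprecation shim.
# The real functions now live in _misc.py; this module only re-exports them,
# warning users that the old import path is deprecated.
import warnings

from alibi_detect.utils.pytorch._misc import (  # noqa: F401
    get_device,
    quantile,
    zero_diag,
)

warnings.warn(
    "Importing from alibi_detect.utils.pytorch.misc is deprecated; "
    "import from alibi_detect.utils.pytorch instead.",
    DeprecationWarning,
    stacklevel=2,
)
```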
IMO moving backend-specific files to private (to avoid imports without optional dependency checks) might be a worthwhile exercise in a subsequent PR (to include in the `0.10.0` release). However, personally I would only vote for this change if it doesn't screw up the git history for these files too much, i.e. if we can get the git history to just track the filename change instead of registering it as a complete deletion and rewrite of each file.
@@ -112,7 +112,7 @@
     ],
     "metadata": {
      "kernelspec": {
-      "display_name": "Python 3 (ipykernel)",
+      "display_name": "Python 3",
Usual comment w.r.t. notebooks; ideally we would reset these changes to reduce file diffs (same for all following notebooks).
Hey @mauicv, I've done a first pass. Looking good I think, but I'll hold off on approving until a few queries are resolved.
import tensorflow_probability as tfp
from tensorflow.keras.losses import kld, categorical_crossentropy
Uber-pedantic nitpick: we could just remove these changes and the changes to `test_losses_tf.py`...
I'm not sure I understand?
@@ -123,7 +123,7 @@
     "```python\n",
     "import tensorflow as tf\n",
     "from tensorflow.keras.layers import Conv2D, Flatten, Input\n",
-    "from alibi_detect.utils.tensorflow.kernels import DeepKernel\n",
+    "from alibi_detect.utils.tensorflow import DeepKernel\n",
This change also needs to be made in the main text: "The DeepKernel class found in either `alibi_detect.utils.tensorflow.kernels` or `alibi_detect.utils.pytorch.kernels`". Might be worth grep-ing the docs for all the import statements that have been changed?
Hmm, good idea. I found `alibi_detect.utils.pytorch.data.TorchDataset` and `alibi_detect.utils.tensorflow.data.TFDataset` as well; these are referenced in a parameter docstring. Are they intended to be public? Would a user import and use them? They're not used explicitly anywhere within the examples!
In offline discussion we agreed that `TorchDataset` and `TFDataset` should be public, so I've protected them with `import_optional` and fixed the private import paths in the docs.
* Add BackendValidator class
* Protect ad, od and cd and other API objects from tensorflow and torch optional dependency import errors
* Update import statements in notebooks
* Implement base optional dependency functionality
* Feature Optional backends functionality (#538)
* Merge config into opt deps (#553)
* Feature Optional prophet (#541)
* Feature Optional dependencies documentation updates (#542)
* Update licenses (#562)
* Add updates to CHANGELOG for optional dependencies (#563)
Todo:

What is this

This work branches off `feature/dep-management` and will make `tensorflow` and `pytorch` optional dependencies of `alibi-detect`. The two main objectives are:

1. Make `tensorflow` and `pytorch` optional dependencies:

I've combined these into this single PR because they're all quite interdependent, so it was hard to do them separately without writing code to manage functionality implemented in one but not the others.
2. Add BackendValidation for all detectors:

This wasn't such an issue in alibi, as only CFRL implemented multiple backends. In detect, however, there are multiple cases where we have a `tensorflow`, `pytorch` or `sklearn` backend, and on top of this the tensorflow backend is sometimes dependent on `tensorflow-probability` as well. Previously this was handled using duplicated code in each detector. I've added a `BackendValidator` that replaces this and issues more specific error messages for optional dependencies.
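A rough sketch of how such a validator might be structured (hypothetical; the names and details here are illustrative rather than the PR's exact code):

```python
from importlib import import_module


class BackendValidator:
    """Verify that the dependencies required by a detector's chosen backend
    are installed, raising a specific error message if not."""

    def __init__(self, backend_options, construct_name):
        # e.g. backend_options = {'tensorflow': ['tensorflow'],
        #                         'pytorch': ['torch']}
        self.backend_options = backend_options
        self.construct_name = construct_name

    def verify_backend(self, backend):
        if backend not in self.backend_options:
            raise NotImplementedError(
                f'{backend} backend not implemented for {self.construct_name}.'
            )
        missing = []
        for dependency in self.backend_options[backend]:
            try:
                import_module(dependency)
            except ImportError:
                missing.append(dependency)
        if missing:
            raise ImportError(
                f'{self.construct_name} with {backend} backend requires '
                f'missing dependencies {missing}. Install them via '
                f'`pip install alibi-detect[{backend}]`.'
            )
```

A detector supporting multiple backends could then replace its duplicated checks with something like `BackendValidator({'tensorflow': ['tensorflow'], 'pytorch': ['torch']}, 'MMDDrift').verify_backend(backend)`.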
What this PR doesn't include:

- `README.md`, for the same reason as 1.

Notes:
Circular imports:

I'm not sure why this wasn't an issue with alibi, but in alibi-detect we often have the following scenario (sketched in code below):

- We import a model from `models`
- Importing that model needs a function from `utils.distance`
- Importing this function means the code in `utils.__init__` is executed
- This code imports `utils.fetching`, which in turn imports all the models from `models`
- We get a circular import error

This is an issue introduced by us trying to bundle all the functionality into the init file, and the correct approach is further scoping and modularization. The solution to the above is a `utils/fetching/__init__.py` separate from the `utils/__init__.py`.
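For concreteness, a stripped-down (hypothetical) reproduction of that cycle; module contents are simplified and `SomeModel` is a stand-in name:

```python
# alibi_detect/models/__init__.py (simplified): models needs a util
from alibi_detect.utils.distance import pairwise_distance

# alibi_detect/utils/__init__.py (simplified): runs on *any* utils import
from alibi_detect.utils import fetching

# alibi_detect/utils/fetching.py (simplified): fetching needs the models
from alibi_detect.models import SomeModel  # -> back to models: circular import
```

With fetching split into its own subpackage (`utils/fetching/__init__.py`) and dropped from `utils/__init__.py`, importing `utils.distance` no longer triggers the fetching import, breaking the cycle.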
Changing import statements:

This PR results in lots of changes to the notebooks, primarily to reflect the new intended public API. Note that these shouldn't be breaking changes, as the objects in question are still available in the old locations; the only difference is that, where they need to be protected from `ImportError`s, we import them into a module's `__init__.py` file instead of from the file directly. So old code will still run, but the imported object won't be protected as per `import_optional`. This might mean code breaks for users running default installs, as they won't have all the relevant dependencies. Any release should incorporate a note that a quick fix is updating their installs to `pip install alibi-detect[x,y,z]` for the relevant `x, y, z`. A longer fix is to update the import statements to the public API; most of the time even this won't be necessary, only where imports have changed.
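For example (taking one entry from the list below), the difference in practice is just which module the object is imported from:

```python
# New public import path: guarded by import_optional, so a missing torch
# install yields an informative error when the object is used.
from alibi_detect.utils.pytorch import TorchDataset

# Old path (still works, but unguarded): a plain ImportError for torch
# surfaces on a default install.
# from alibi_detect.utils.pytorch.data import TorchDataset
```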
The full list of changes is as follows:
- `from alibi_detect.utils.tensorflow.kernels import DeepKernel` -> `from alibi_detect.utils.tensorflow import DeepKernel`
- `from alibi_detect.utils.tensorflow.prediction import predict_batch` -> `from alibi_detect.utils.tensorflow import predict_batch`
- `from alibi_detect.utils.pytorch.data import TorchDataset` -> `from alibi_detect.utils.pytorch import TorchDataset`
- `from alibi_detect.models.pytorch.trainer import trainer` -> `from alibi_detect.models.pytorch import trainer`
- `from alibi_detect.models.tensorflow.resnet import scale_by_instance` -> `from alibi_detect.models.tensorflow import scale_by_instance`
- `from alibi_detect.utils.pytorch.kernels import DeepKernel` -> `from alibi_detect.utils.pytorch import DeepKernel`
- `from alibi_detect.models.tensorflow.autoencoder import eucl_cosim_features` -> `from alibi_detect.models.tensorflow import eucl_cosim_features`
- `from alibi_detect.models.tensorflow.losses import elbo` -> `from alibi_detect.models.tensorflow import elbo`
- `from alibi_detect.models import PixelCNN` -> `from alibi_detect.models.tensorflow import PixelCNN`
- `from alibi_detect.utils.tensorflow.data import TFDataset` -> `from alibi_detect.utils.tensorflow import TFDataset`
also see this related comment!