Skip to content

Allow users to define custom priors #387

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
drbenvincent opened this issue Jul 12, 2024 · 8 comments
Open

Allow users to define custom priors #387

drbenvincent opened this issue Jul 12, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@drbenvincent
Copy link
Collaborator

High level goals

  • The API should include default priors and not require the user to define their own priors.
  • But the user should be able to customise their own priors.
  • Users should be able to ask the experiments what the default priors are, and use that as part of a workflow in going from default priors to customising priors for specific situations.
  • Use of custom priors should be documented not just in the API docs, but we should also have a number of examples sprinkled in to multiple (existing) example notebooks.

Implementation

We can perhaps learn from pymc-marketing which started off allowing users to specify priors by providing dicts but which has moved to having a prior class (see pymc-labs/pymc-marketing#759). We may find that dicts are fine for our purposes, but there could be advantages of going down the path of using classes. (Tagging @wd60622)

@drbenvincent drbenvincent added the enhancement New feature or request label Jul 12, 2024
@drbenvincent
Copy link
Collaborator Author

See the discussion here pymc-devs/pymc#7416 on whether to port the Prior class from pymc-marketing into the main pymc-repo. This would be amazing because we could add this new functionality with very little change to CausalPy itself.

@williambdean
Copy link
Contributor

williambdean commented Jul 30, 2024

Based on the discussion, it seems that the Prior class is most likely best suited for pymc-experimental. However, I don't have a timeline for that just yet.

The code is a single file so it would be easy to copy and tweak to your liking.

With access to the Prior class, the custom priors doesn't seems very difficult since the dims are already being specified in the classes.
Some things to consider:

  • pass the configuration into the class keeps consistency with previous API. I've been using dict[str, Prior] mainly but using keywords is alsso good way to restrict to only the allowed variables. A check for dict keys is also possible.
  • It's possible to add some additional checks to ensure that the custom prior will work with the model.
    • dim name works. dist = Prior("Normal", mu=0, sigma=1, dims="coeffs"); if dist.dims != ("coeffs", ): raise ...
    • restrict the type of the distribution. dist = Prior("Normal", mu=0, sigma=1, dims="coeffs"); if dist.distribution != "Normal": raise ...
    • etc

Based on the old API, it might look like this:

from pymc_marketing.prior import Prior
from causalpy.pymc_models import WeightedSumFitter

priors = {
    "beta": Prior("Dirichlet", a=[1, 2, 3, 4], dims="coeffs"), 
    "sigma": Prior("Gamma", mu=1.5, sigma=0.25),
}
model = WeightedSumFitter(priors=priors)

# Check the priors
print(model.priors)

model.fit(X, y)

@drbenvincent
Copy link
Collaborator Author

Note to self: Ideally we wait for pymc-devs/pymc-extras#448 (maybe done by @williambdean) then we use that :)

@drbenvincent
Copy link
Collaborator Author

drbenvincent commented May 30, 2025

Boom! #448 pymc-devs/pymc-extras#448 is now closed. Thanks @williambdean.

This issue is now unblocked and we can implement custom priors in CausalPy! Well, it will be when there's a new release of pymc-extras, though nothing from stopping us from using the development version right now :)

I'm aiming on getting to this in June - but it someone has the desire and feels the they could do it, let us know.

@williambdean
Copy link
Contributor

williambdean commented Jun 3, 2025

Could add in a new initialization parameter here. Say, priors which would be a dictionary of VariableFactory

https://github.com/pymc-labs/causalpy/blob/bb7c2bbea029c3563251d6416d020543a00ed2b1/causalpy/pymc_models.py?plain=1#L71-L72

That would make the syntax:

import causalpy as cp
from pymc_extras.prior import Prior

priors = {
  "beta": Prior("Normal", mu=0, sigma=5, dims="coeffs")

}

df = cp.load_data("did")
result = cp.DifferenceInDifferences(
    df,
    formula="y ~ 1 + group*post_treatment",
    time_variable_name="t",
    group_variable_name="group",
    model=cp.pymc_models.LinearRegression(
        sample_kwargs=sample_kwargs,
        priors=priors,
  ),
)

There would be some default priors for each of the models:

https://github.com/pymc-labs/causalpy/blob/bb7c2bbea029c3563251d6416d020543a00ed2b1/causalpy/pymc_models.py?plain=1#L238-L240

default_priors = {
    "beta": Prior("Normal", mu=0, sigma=50, dims="coeffs"),
    # This name would have to be changed
    # "sigma": Prior("HalfNormal", sigma=1), 
    "y_hat": Prior(
        "Normal", 
        sigma=Prior("Normal", sigma=1), 
        dims="obs_ind",
    ),
}

These would be the defaults and would be updated {**default_priors, **initialization_priors}

Warning

There would be some backwards compat concerns because of the auto-naming that the prior class has with sub-> parameters. This would affect the likelihood nested parameters. For example "sigma" in the example above would become "y_hat_sigma". However, this could be renamed or duplicated in the posterior Xarray

@drbenvincent
Copy link
Collaborator Author

That all looks pretty good, and aligns with what I was expecting.

It could be cool to think about auto scaling the default prior to the data (similar to how bambi does it). That might negate the need to consider scaling the data? My preference is to avoid pre and post transformation steps if possible and rely on auto adjusting priors.

@drbenvincent
Copy link
Collaborator Author

I think that could be done in the experiment class. That would have both the data and the model. It could cash something like self.model.autoscale_priors(self.X). And that method could edit the priors. Not sure if I've done that yet, but would presumably involve pm.observe or pm.do.

@williambdean
Copy link
Contributor

I'm not sure how bambi does it but it sounds like multiple steps and multiple PRs to me.

Parameters can be scaled or the variables can be scaled which might be a bit more generalized.

Maybe prior some helper functions that can return a dictionary of priors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants