Proposed Feature Extension for CausalPy: Automatic Detection of Intervention Timing #478
Discrete vs continuous
In terms of parameter estimation, we want to estimate continuous quantities wherever possible, to take advantage of the more efficient sampling algorithms. One way we can do this is by avoiding use of [...]. If the treatment effect is modelled as a step change, this can become problematic, so a trick is to use a sigmoid function, which adds a bit of smoothness to the step change and helps the sampler get gradient information.
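To make the gradient argument concrete, here is a minimal plain-Python sketch (not CausalPy or PyMC code). The function names and the `sharpness` parameter are assumptions made for illustration. It compares a hard step with a sigmoid-smoothed step and checks how each responds to a small shift of the switchpoint.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hard_step(t, switchpoint):
    """Discontinuous step: 0 before the switchpoint, 1 from it onwards."""
    return 1.0 if t >= switchpoint else 0.0


def smooth_step(t, switchpoint, sharpness=5.0):
    """Sigmoid approximation of the step; larger `sharpness` means a steeper ramp."""
    return sigmoid(sharpness * (t - switchpoint))


def switchpoint_gradient(step_fn, t, switchpoint, eps=1e-6):
    """Central finite difference of step_fn with respect to the switchpoint."""
    upper = step_fn(t, switchpoint + eps)
    lower = step_fn(t, switchpoint - eps)
    return (upper - lower) / (2 * eps)


# Half a time unit before the switchpoint the hard step gives no gradient
# signal, while the smooth step does.
flat = switchpoint_gradient(hard_step, 9.5, 10.0)
informative = switchpoint_gradient(smooth_step, 9.5, 10.0)
```

With the hard step, the gradient is zero everywhere except exactly at the jump, so a gradient-based sampler gets no hint about which direction to move the switchpoint. The smoothed version trades a little sharpness for that information.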
Types of treatment effect / change being detected
The idea of doing more than just detecting a step change (as in a simple change point model) is very good. Detecting a step change or a slope change is useful, but perhaps not sufficient in lots of situations. I can imagine many scenarios where you might expect a change to be transient, either because the treatment period has a start and stop date, or because the treatment creates an initial effect which then dies down despite the intervention still being in place. How to deal with that is another matter; I'll try to drop in some ideas in response to your proposed algorithmic approach.
Implementation / algorithm
I don't have a fixed idea about how this could work. But the way I'd think about it by default is that you're trying to come up with a linear decomposition of the data. One part of that is a treatment effect component, which could be a parameterised function.
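As a rough illustration of the additive decomposition idea, the sketch below builds a prediction as a sum of independent components. All names (`linear_trend`, `transient_effect`, `predict`) are hypothetical and not taken from the thread. The treatment component here is the simplest transient case mentioned earlier, an effect with a start and a stop date.

```python
def linear_trend(intercept, slope):
    """Baseline component: a straight line in time."""
    return lambda t: intercept + slope * t


def transient_effect(start, stop, size):
    """Treatment component: a constant effect present only while start <= t < stop."""
    return lambda t: size if start <= t < stop else 0.0


def predict(t, components):
    """Expected value at time t as the sum of all components."""
    return sum(component(t) for component in components)


components = [
    linear_trend(intercept=1.0, slope=0.5),
    transient_effect(start=10, stop=20, size=3.0),
]
```

Because the components are just functions of time with their own parameters, swapping the rectangular effect for a decaying or ramping one would not change `predict`.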
Prototype
Taking into account your feedback, I've put together a prototype of what this model could look like.
To maintain flexibility, users can optionally provide their own PyMC model to model the base time series (i.e. without interventions). The only requirement is that the model defines a base_mu variable of the same dimension as the time series, accessible via model.named_vars["base_mu"]. If no custom model is provided, the class builds a simple linear model by default:
For now, the intervention logic is less flexible. The user specifies which types of intervention effects to include using the effect parameter, which can contain any combination of:
- "level": a discrete shift in mean after the change point
- "trend": a change in slope after the change point
- "impulse": a decaying impulse following the intervention
Here’s the current implementation:
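As a rough illustration of how the three effect types could combine, here is a plain-Python sketch. It is not the prototype's PyMC code: the function name, argument names and the hard cut-off at the switchpoint are all assumptions chosen to keep the example short.

```python
import math

ALLOWED_EFFECTS = {"level", "trend", "impulse"}


def intervention_effect(t, switchpoint, effect, level=0.0, trend=0.0,
                        impulse_amplitude=0.0, decay_rate=1.0):
    """Sum of the requested effect types at time t (zero before the switchpoint)."""
    unknown = set(effect) - ALLOWED_EFFECTS
    if unknown:
        raise ValueError(f"unknown effect types: {sorted(unknown)}")
    if t < switchpoint:
        return 0.0
    elapsed = t - switchpoint
    total = 0.0
    if "level" in effect:
        total += level  # constant shift in the mean
    if "trend" in effect:
        total += trend * elapsed  # extra slope after the change point
    if "impulse" in effect:
        # Jump of size impulse_amplitude that decays exponentially
        total += impulse_amplitude * math.exp(-decay_rate * elapsed)
    return total
```

For example, `intervention_effect(7, 5, ["level", "trend"], level=2.0, trend=0.5)` adds a shift of 2 and two time units of extra slope.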
I've added the "impulse" parameter to capture short-term effects with exponential decay, as you suggested. It seems to work quite well, as shown below:
I've also added two built-in methods to visualize the model's behavior. The example below uses the COVID-19 dataset from the InterruptedTimeSeries "How-to" on the CausalPy website, with only the "impulse" effect applied.
The plots show that the model struggles to converge on a clear switchpoint, likely due to unmodeled seasonality in the data. This suggests that extending the model to account for seasonal patterns may be necessary for better performance.
Let me know your thoughts on this!
Next possible steps:
Update on two matters
Seasonality
I implemented a seasonal component to better capture recurring patterns in the data, and the improvement in model performance is promising so far. Since I’m working with monthly data, I started with 12 season-specific parameters (one for each month):
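A minimal sketch of the season-specific-parameter idea, in plain Python with hypothetical names: each observation picks up the parameter of its season by index. The `center_effects` helper is an added assumption, not something from the thread. It removes the mean of the seasonal parameters so that they do not compete with the intercept.

```python
def seasonal_component(n_obs, season_effects, start_season=0):
    """Repeat the season-specific effects along a series of n_obs observations."""
    n_seasons = len(season_effects)
    return [season_effects[(start_season + t) % n_seasons] for t in range(n_obs)]


def center_effects(season_effects):
    """Shift the effects to sum to zero so the intercept stays identifiable."""
    mean = sum(season_effects) / len(season_effects)
    return [effect - mean for effect in season_effects]


# Twelve monthly parameters (illustrative values only)
month_effects = [float(m) for m in range(12)]
two_years = seasonal_component(24, center_effects(month_effects))
```

Since the cycle length is simply `len(season_effects)`, the same function covers weekly or quarterly cycles by passing 7 or 4 parameters instead of 12.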
Next step: I plan to generalize this so the model can automatically adapt to different seasonalities (for example weekly, quarterly, or any other cycle length) based on the number of seasons and observations per season.
Impulse
I noticed that the impulse component I previously used was leaking into the time series before the switchpoint. To mitigate this, I modified the formulation to make the impulse symmetric and centered at the switchpoint by applying an absolute value:
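The sketch below shows one way such a leak can arise and how an absolute value bounds it. It is plain Python with hypothetical names. The `impulse_raw` form is an assumption about the earlier formulation, which is not shown in the thread: a sigmoid gate multiplied by an exponential in the signed time difference, which grows without bound before the switchpoint.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def impulse_raw(t, switchpoint, amplitude, decay_rate, sharpness):
    """Assumed earlier form: the exponential grows for t < switchpoint."""
    gate = sigmoid(sharpness * (t - switchpoint))
    return amplitude * gate * math.exp(-decay_rate * (t - switchpoint))


def impulse_abs(t, switchpoint, amplitude, decay_rate, sharpness):
    """Symmetric decay: the exponential never exceeds 1, so the gate can suppress it."""
    gate = sigmoid(sharpness * (t - switchpoint))
    return amplitude * gate * math.exp(-decay_rate * abs(t - switchpoint))


# Five time units before the switchpoint (illustrative parameter values)
leak_raw = impulse_raw(5.0, 10.0, amplitude=1.0, decay_rate=2.0, sharpness=1.0)
leak_abs = impulse_abs(5.0, 10.0, amplitude=1.0, decay_rate=2.0, sharpness=1.0)
```

After the switchpoint the two forms coincide, so the absolute value only changes the pre-intervention side.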
This way, although the exponential decay (with the absolute value) creates a symmetric shape around the switchpoint, the sigmoid effectively suppresses any contribution before the switchpoint, ensuring the impulse activates only after the intervention. This makes the impulse more aligned with a realistic, one-sided causal effect.
Results
Below are the updated results using this improved formulation, applied to the same dataset I mentioned in my previous post. The model now captures both seasonality and temporary shifts more reliably.
This feature aims to extend the existing "Interrupted Time Series" functionality in CausalPy by allowing users to infer the timing of an intervention, rather than requiring it to be specified in advance.
What the feature would provide
Instead of specifying when an intervention occurred, the user describes the type of effect they expect the intervention to have: a level shift, a trend change, or both. The model then uses this structure to infer the most likely time at which such a change occurred.
The user may:
The model includes default pre- and post-intervention structures, but users can optionally define their own. Users may also specify the expected form of the intervention effect, which guides the choice of model and improves precision. In the end, users will be able to obtain:
Bayesian default model
One widely used and easy-to-implement Bayesian model for detecting intervention effects is presented by Xueheng Shi et al. (2022). This model, which has been tested across various scenarios, is versatile and capable of detecting both level shifts and trend changes.
The structure supports a discrete change point and allows the slope and intercept to vary before and after the intervention. However, model precision can potentially be improved by simplifying the structure when prior knowledge is available — for instance, using a pure level-shift model (by removing time dependence) or a pure trend-change model (by removing the intercept discontinuity).
The full model is expressed as:
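One common way to write a model of this kind is given below. It is an assumed parameterisation consistent with the description above, not a quotation of the paper, and every symbol is introduced here for illustration: $\tau$ is the discrete change point, $\beta_0$ and $\beta_1$ the pre-intervention intercept and slope, $\delta$ the level shift and $\gamma$ the change in slope.

```latex
y_t = \beta_0 + \beta_1 t
      + \bigl[\delta + \gamma\,(t - \tau)\bigr]\,\mathbb{1}(t \ge \tau)
      + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2),
\qquad \tau \in \{1, \dots, T\}
```

Under this notation, the pure level-shift simplification corresponds to $\beta_1 = \gamma = 0$, and the pure trend-change simplification to $\delta = 0$.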
Proposal: Integrating Automatic Intervention Time Detection into InterruptedTimeSeries
One option would be to add a new class in pymc_models.py that InterruptedTimeSeries would rely on to estimate the intervention time. InterruptedTimeSeries would use it automatically when given a time range, or no time at all, for the intervention. We could also add a new optional parameter carrying the (optional) arguments used to initialize the class (effect, pre- and post-intervention models). Once the intervention time is estimated, InterruptedTimeSeries would continue as usual, using the most likely intervention time as the intervention time.
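The delegation rule described above could be sketched as a small dispatch function. This is a hypothetical helper, not part of CausalPy. The name `plan_treatment_time`, its arguments and its return values are all assumptions made for illustration.

```python
def plan_treatment_time(treatment_time, index_start, index_stop):
    """Decide whether the intervention time is fixed or must be inferred.

    Returns ("fixed", time) when a single time is given, or
    ("infer", (lower, upper)) when a range or nothing is given.
    """
    if treatment_time is None:
        # No information: search over the whole observed period
        return ("infer", (index_start, index_stop))
    if isinstance(treatment_time, (tuple, list)):
        if len(treatment_time) != 2:
            raise ValueError("a time range needs exactly two bounds")
        lower, upper = treatment_time
        if not index_start <= lower < upper <= index_stop:
            raise ValueError("time range must lie within the observed period")
        return ("infer", (lower, upper))
    # A single value: behave like the existing interrupted time series analysis
    return ("fixed", treatment_time)
```

Keeping the decision in one place would let the existing analysis path stay untouched whenever a single intervention time is supplied.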
To support automatic inference of the intervention time, I propose creating a new class in pymc_models.py that encapsulates the logic for estimating when the intervention most likely occurred. The InterruptedTimeSeries class would automatically delegate to this new class when:
We could also introduce a new optional argument in InterruptedTimeSeries to allow users to pass in:
The workflow would then become:
What do you think?
Here I've gathered all my thoughts so far and outlined how I think this feature could work best. Since I'm still new to CausalPy, I’m looking forward to your feedback! Please let me know if there’s anything I should be careful about or any suggestions before I start implementing. I’m excited to get started and appreciate any guidance you can provide.
Example
Consider a time series where an intervention causes the trend to increase from 0.1 to 0.25, along with a sudden level jump of 2:
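A series with these properties could be simulated along the following lines. This is a plain-Python sketch: only the two slopes and the level jump come from the description above, while the series length, switchpoint position, zero intercept, noise level and seed are assumptions.

```python
import random


def simulate_series(n_obs=100, switchpoint=50, slope_before=0.1,
                    slope_after=0.25, level_jump=2.0, noise_sd=0.5, seed=0):
    """Linear trend whose slope and level both change at the switchpoint."""
    rng = random.Random(seed)  # local generator so results are reproducible
    series = []
    for t in range(n_obs):
        mean = slope_before * t
        if t >= switchpoint:
            # Sudden jump plus the additional slope accumulated since the switch
            mean += level_jump + (slope_after - slope_before) * (t - switchpoint)
        series.append(mean + rng.gauss(0.0, noise_sd))
    return series


observations = simulate_series()
```

Setting `noise_sd=0.0` returns the noiseless mean, which is handy for checking that the change sits where expected before fitting anything.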
I’ve implemented the model, which follows the default structure described earlier:
After sampling, here are the typical plot outputs users can expect to see from this new feature: