[Draft] Nested sampling implementation #755
Draft · yallup wants to merge 181 commits into blackjax-devs:main from handley-lab:nested_sampling
Conversation
Rejection sampling

Observations:
- float32 is oddly discretised, even within the typical set in moderate dimensions (10)
- the evidences aren't adding up, so we probably need to debug something (we think unrelated to discretisation)
- this means we need to resolve the plateau case if we want to run on GPU
…pler and update Nested Sampling

This commit enhances the Slice Sampler (`blackjax.mcmc.ss`) with the ability to handle generic constraints beyond the log-density slice itself. Nested Sampling (`blackjax.ns`) implementations are refactored to leverage this new capability for enforcing the likelihood constraint.

Slice Sampler (`mcmc.ss`):
- The `kernel` and `horizontal_slice_proposal` functions now accept `constraint_fn`, `constraint`, and `strict` arguments. These allow specifying an additional function whose output must satisfy a given bound for a proposal to be considered "within" the slice.
- During the stepping-out and shrinking procedures, proposed points `x` are now checked against `constraint_fn(x) > constraint` (or `>=` if `strict` is False) in addition to `logdensity_fn(x) >= log_slice_height`.
- `SliceInfo` now includes `constraint` (the value of `constraint_fn` at the accepted point).
- Type hints for `l_steps`, `r_steps`, `s_steps`, and `evals` in `SliceInfo` are changed from `Array` to `int`.

Nested Sampling (`ns`):
- The previous approach in `ns.base` of creating a `constrained_logdensity_fn` (which combined prior and likelihood constraint by returning -inf) has been removed.
- The `inner_kernel` in `ns.base` and `ns.adaptive` (and their initializers) now explicitly receives `logprior_fn`, `loglikelihood_fn`, and `loglikelihood_0`.
- In `ns.nss` (Nested Slice Sampling):
  - The `inner_kernel` now passes `logprior_fn` as the `logdensity_fn` to the slice sampler.
  - The likelihood constraint (`loglikelihood_fn(x) > loglikelihood_0`) is passed as the new explicit constraint to the slice sampler using `constraint_fn`, `constraint`, and `strict`.
  - Introduced `NSSInnerState` and `NSSStepInfo` for better state and information management within the nested slice sampling context.
- `inner_init_fn` signatures in NS modules are updated to reflect the separation of prior, likelihood, and the particle's current state.
This change decouples the likelihood constraint logic from the main log-density function passed to the slice sampler. This leads to a cleaner interface and a more direct way of handling constraints within Nested Sampling, particularly for the likelihood threshold. It also reduces the number of calls to the likelihood and prior, and gives users access to the slice sampling chain results for further analysis.
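The constrained acceptance test described above can be sketched as follows; the helper name `within_slice` and its exact signature are illustrative stand-ins, not the actual blackjax internals:

```python
import jax.numpy as jnp


def within_slice(x, logdensity_fn, log_slice_height, constraint_fn, constraint, strict):
    """Hypothetical sketch: is `x` inside the slice AND does it satisfy the
    extra constraint? Mirrors the check described in the commit message."""
    on_slice = logdensity_fn(x) >= log_slice_height
    c = constraint_fn(x)
    # strict=True requires a strict inequality for the extra constraint
    satisfies = jnp.where(strict, c > constraint, c >= constraint)
    return jnp.logical_and(on_slice, satisfies)
```

For nested sampling, `constraint_fn` would be the log-likelihood and `constraint` the current threshold `loglikelihood_0`, so the prior remains the slice sampler's target density while the hard likelihood bound is enforced separately.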
…ce sampling

- Add tests/ns/test_nested_sampling.py with 9 tests covering:
  * Base nested sampling initialization and particle deletion
  * Adaptive nested sampling parameter updates
  * Nested slice sampling direction functions and kernel construction
  * Utility functions for log-volume simulation and live point counting
- Add tests/mcmc/test_slice_sampling.py with 11 tests covering:
  * Slice sampler initialization and vertical slice height sampling
  * Multi-dimensional slice sampling (1D, 2D, 5D)
  * Constrained slice sampling and direction generation
  * Hit-and-run slice sampling top-level API
  * Statistical correctness validation with robust error handling
- All tests follow BlackJAX conventions using chex.TestCase, parameterized testing, and proper shape/tree validation
- Tests are optimized for fast execution with reduced sample sizes
- Comprehensive coverage of both core functionality and edge cases

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
- Remove incorrect assertion that the number of live points must be monotonically decreasing
- In real nested sampling, live points follow a sawtooth pattern as particles die and are replenished
- Fix evidence estimation test to use proper utility functions instead of skipping
- Create more realistic mock data with varied birth likelihoods
- Add proper documentation explaining why the monotonic-decrease assumption is wrong
- Reference: Fowlie et al.'s work on plateaus in nested sampling shows live points can increase

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
- Remove unused imports from test files (functools, numpy, blackjax, etc.)
- Fix MyPy type annotation issues:
  * Add `Dict` type annotation for `inner_kernel_params` in base.py
  * Rename duplicate nested function names in ss.py (`shrink_body_fun`, `shrink_cond_fun`)
  * Add proper type annotation for the `params` parameter in nss.py
  * Add `None` check for optional `update_inner_kernel_params_fn` in adaptive.py
- Update test comment to clarify scope rather than suggesting "skip"
- Apply Black formatting and fix import ordering
- All pre-commit hooks now pass successfully

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
…sampling

- Replace weak evidence tests with proper analytic validation
- Test unnormalized Gaussian likelihood with uniform prior using the exact analytic solution
- Validate evidence estimates against analytic values using statistical consistency
- Generate 500-1000 Monte Carlo evidence samples to test distribution properties
- Check that the analytic evidence falls within 95-99% confidence intervals
- Add test cases for:
  * Unnormalized Gaussian exp(-0.5*x²) with uniform prior [-3,3]
  * Narrow-prior challenging case with full Gaussian likelihood
  * Constant likelihood case for baseline validation
- Use the proper Bayesian evidence formula: Z = ∫ p(x) * L(x) dx
- Statistical validation with confidence intervals rather than arbitrary tolerances
- Addresses requirement for evidence correctness testing with known analytic solutions

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
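For the first test case listed above, the analytic evidence follows in closed form from normalising the Gaussian over the prior support. This is a reconstruction of the target value, not the test code itself:

```python
from math import erf, pi, sqrt

# Evidence for a uniform prior on [-3, 3] (density 1/6) with the
# unnormalised Gaussian likelihood L(x) = exp(-0.5 * x**2):
#   Z = ∫ p(x) L(x) dx = (1/6) * ∫_{-3}^{3} exp(-0.5 x²) dx
#     = (1/6) * sqrt(2π) * erf(3 / sqrt(2))
Z = sqrt(2 * pi) * erf(3 / sqrt(2)) / 6
print(Z)  # ≈ 0.4166
```

A nested sampling run on this problem should produce evidence samples whose confidence interval covers this value, which is the statistical-consistency check the tests describe.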
This commit refactors state management in the Slice Sampler and simplifies the inner kernel handling in the Nested Sampling base.

**Slice Sampler (`mcmc.ss`):**
- `SliceState` now includes a `logslice` field (defaulting to `jnp.inf`) to store the height sampled during the vertical slice.
- `vertical_slice` now takes the full `SliceState`, calculates `logslice`, and returns an updated `SliceState`. Its signature changes from `(rng_key, logdensity_fn, position)` to `(rng_key, state)`.
- `horizontal_slice` now takes `SliceState` (which includes `logslice`) instead of separate `x0` and `log_slice_height`. Its signature changes to accept `state: SliceState` as its second argument.
- These changes centralize slice-related information, improving data flow within the slice sampling kernel. `ss.init` remains the primary way to initialize `SliceState`, and the `ss.build_kernel` API is unchanged.

**Nested Sampling (`ns.base`, `ns.nss`):**
- Removed the `inner_kernel_init_fn` parameter from `ns.base.build_kernel`. The `NSInnerState` is now always initialized directly within the NS loop before calling the vmapped `inner_kernel`. This is a breaking API change for users who might have been providing a custom `inner_kernel_init_fn`.
- Introduced the `ns.base.new_state_and_info` helper function to standardize the creation of `NSInnerState` and `NSInnerInfo`.
- `ns.nss.inner_kernel` is adapted to use the updated `SliceState` from `mcmc.ss` and now utilizes the `new_state_and_info` helper.

These modifications enhance code clarity and maintainability. Tests for slice sampling and nested sampling have been updated accordingly.
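A minimal sketch of what the refactored state might look like, assuming a NamedTuple layout as is conventional in BlackJAX; the field names follow the commit message, but the real definition may differ:

```python
from typing import NamedTuple

import jax.numpy as jnp
from jax import Array


class SliceState(NamedTuple):
    """Illustrative sketch of the slice sampler state after the refactor."""
    position: Array
    logdensity: Array
    # height sampled by `vertical_slice`; starts at +inf so that, before a
    # vertical slice is drawn, every point trivially lies "on the slice"
    logslice: Array = jnp.inf
```

Carrying `logslice` inside the state is what lets `vertical_slice` and `horizontal_slice` take `(rng_key, state)` instead of threading `x0` and `log_slice_height` separately.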
- Rename NSInnerState/NSInnerInfo to PartitionedState/PartitionedInfo for posterior repartitioning
- Add comprehensive docstrings explaining separation of log-prior and log-likelihood components
- Improve type annotations and docstring consistency across slice sampling modules
- Standardize SliceInfo with default values for clean initialization
- Fix function signatures and return types throughout nested sampling codebase
- Enable posterior repartitioning techniques through explicit component separation

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
…ampling

Clean up type annotations by removing redundant `# type: ignore` directives:
- Remove type ignore from the `SamplingAlgorithm` return in nss.py
- Remove type ignores from the `NSInfo` constructor in utils.py

All type checks now pass without suppression directives.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Fix broadcasting error by vmapping over both random keys and inner state. The inner kernel expects single particles, not batches, so we need to vmap over the PartitionedState structure (axis 0) in addition to the keys. This resolves the "mul got incompatible shapes for broadcasting" error that occurred when constraint functions tried to evaluate likelihood on vectorized inputs.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
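The fix described above amounts to mapping the per-particle kernel over axis 0 of both inputs, rather than over the keys alone. A minimal sketch with an illustrative stand-in for the inner kernel:

```python
import jax
import jax.numpy as jnp


def inner_kernel(rng_key, position):
    """Stand-in for the real per-particle update: operates on a single
    particle, so batched inputs would trigger a broadcasting error."""
    return position + jax.random.normal(rng_key, position.shape)


keys = jax.random.split(jax.random.PRNGKey(0), 4)  # one key per particle
positions = jnp.zeros((4, 3))                      # 4 particles in 3 dimensions

# vmap over axis 0 of BOTH the keys and the particle state; in the PR this
# axis-0 mapping applies to every leaf of the PartitionedState pytree
new_positions = jax.vmap(inner_kernel, in_axes=(0, 0))(keys, positions)
print(new_positions.shape)  # (4, 3)
```

Mapping only the keys (`in_axes=(0, None)`) would hand the kernel the full `(4, 3)` batch, which is the source of the "incompatible shapes for broadcasting" error.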
feat: Major nested sampling and slice sampling algorithm enhancements
This PR is a large refactor of the functions originally presented: the core Nested Sampling framework (base, adaptive, utils) has been significantly refactored for greater flexibility, improved state tracking, and a clearer API. The previous vectorized_slice module is removed, superseded by the more general HRSS implementation, which is now included as a general-purpose mcmc kernel.
A few important guidelines and requirements before we can merge your PR:
- the PR targets the `main` branch;
- `pre-commit` is installed and configured on your machine, and you ran it before opening the PR.

Consider opening a Draft PR if your work is still in progress but you would like some feedback from other contributors.
High level description of changes

The following files are included in the `ns` folder:
- `base`: The base nested sampler. Detailed more below; it should be somewhat familiar from the SMC structure. Nested sampling is an outer kernel with a delete function (resampling) to remove the lowest-likelihood points, which then maps a vectorized update over those "deleted" particles to replace them with new particles subject to the hard likelihood constraint.
- `adaptive`: Similar to the SMC inner kernel tuning, this wraps the base sampler with a parameter update function to tune the inner kernel parameters.
- `utils`: Useful calculations, particularly for extracting log_weights and weighted (at a specified inverse temperature) samples of the target.
- `vectorized_slice`: A compatible inner kernel for the nested sampling kernels. This is non-standard for the rest of the library, so opinions on how best to do this are most welcome; we tried to follow the SMC design of a flexible choice of inner kernel, but currently only this one works. It explicitly loads both the prior log-density and the log-likelihood as functions, as we would think about them in nested sampling. I suspect there is a clever way to lift this into the mcmc folder and overload the extra log-likelihood condition for use in nested sampling, but for now we have a practical implementation here that works for our purpose. Currently this doesn't use a proposal distribution as in the MH kernels, allowing a more flexible definition of a random slice direction, and instead hardcodes one derived from a covariance.

Out of these, there are currently 3 top-level APIs defined (which is somewhat overkill as things stand, but hopefully it translates). Base and adaptive both have top-level APIs, named generically as per the usual design. Inside adaptive we have put a top-level API for `nss`, or "nested slice sampling", that explicitly loads the vectorized slice inner kernel and the corresponding tuning.

Example usage

Lastly, there is an example usage script (not to be included in the final PR, but to help demonstrate how we intend these components to be used on a simple gaussian-gaussian integration problem) under `docs/examples/nested_sampling.py` (this has an external dependency on distrax). I have added a number of inline comments in it to explain some choices; this would all be extended at some point and folded into the sampling book rather than kept here, but I've included it as a tracked file for convenience.

As there are quite a few non-standard parts here, I will submit this as a draft PR for now, hoping for some higher-level feedback before getting too far into the weeds. Hopefully there is enough here for you to go on @AdrienCorenflos as an initial look, and the example works out of the box for you.