[Draft] Nested sampling implementation #755


Draft · wants to merge 181 commits into main
Conversation

@yallup commented Nov 11, 2024

A few important guidelines and requirements before we can merge your PR:

  • If I add a new sampler, there is an issue discussing it already; Nested Sampling implementation #753
  • We should be able to understand what the PR does from its title only;
  • There is a high-level description of the changes;
  • There are links to all the relevant issues, discussions and PRs;
  • The branch is rebased on the latest main commit;
  • Commit messages follow these guidelines;
  • The code respects the current naming conventions;
  • Docstrings follow the numpy style guide
  • pre-commit is installed and configured on your machine, and you ran it before opening the PR;
  • There are tests covering the changes;
  • The doc is up-to-date;
  • If I add a new sampler, I added/updated related examples

Consider opening a Draft PR if your work is still in progress but you would like some feedback from other contributors.

High level description of changes

The following files are included in the `ns` folder:

  • base: The base nested sampler, detailed more below; the structure should be somewhat familiar from SMC. Nested sampling is implemented as an outer kernel with a delete function (resampling) that removes the lowest-likelihood points, then maps a vectorized update over those “deleted” particles to replace them with new particles subject to the hard likelihood constraint.

  • adaptive: Similar to the SMC inner kernel tuning, wraps the base sampler with a parameter update function to tune the inner kernel parameters.

  • utils: Useful calculations, particularly for extracting log_weights and weighted samples of the target at a specified inverse temperature.

  • vectorized_slice: A compatible inner kernel for the nested sampling kernels. This is non-standard for the rest of the library, so opinions on how best to do this are most welcome; we tried to follow the SMC design of a flexible choice of inner kernel, but currently only this one works. It explicitly loads both the prior logdensity and the loglikelihood as functions, as we would think about them in nested sampling. I suspect there is a clever way to lift this into the mcmc folder and overload the extra loglikelihood condition for use in nested sampling, but for now we have a practical implementation here that works for our purpose. Unlike the mh kernels, this doesn’t use a proposal distribution, which allows a more flexible definition of a random slice direction; instead it hardcodes a direction derived from a covariance.
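The delete-then-replace outer step described for `base` above can be sketched roughly as follows. This is a minimal NumPy sketch of the idea only; the function and variable names are illustrative placeholders, not the PR's actual API:

```python
import numpy as np

def ns_step(positions, loglikes, loglikelihood_fn, inner_update, rng, n_delete=1):
    """One illustrative outer nested-sampling step: delete, resample, update."""
    order = np.argsort(loglikes)
    dead_idx = order[:n_delete]                  # lowest-likelihood particles
    loglikelihood_0 = loglikes[dead_idx].max()   # hard constraint threshold

    # Resample starting points for the replacements from the survivors.
    live_idx = order[n_delete:]
    starts = positions[rng.choice(live_idx, size=n_delete)]

    # Update each replacement subject to the hard constraint
    # loglikelihood_fn(x) > loglikelihood_0 (vectorized/vmapped in the real code).
    new_positions = np.stack([inner_update(x, loglikelihood_0) for x in starts])
    positions = positions.copy()
    loglikes = loglikes.copy()
    positions[dead_idx] = new_positions
    loglikes[dead_idx] = np.array([loglikelihood_fn(x) for x in new_positions])
    return positions, loglikes
```

After each such step the minimum live loglikelihood strictly increases, which is what drives the likelihood-ordered compression in nested sampling.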

Out of these, there are currently three top-level APIs defined (somewhat overkill as things stand, but hopefully it translates). base and adaptive both have top-level APIs, named generically as per the usual design. Inside adaptive we have put a top-level API for nss, or "nested slice sampling", which explicitly loads the vectorized slice inner kernel and the corresponding tuning.

Example usage

Lastly, there is an example usage script (not to be included in the final PR, but to help demonstrate how we intend these components to be used on a simple Gaussian-Gaussian integration problem) under docs/examples/nested_sampling.py; this has an external dependency on distrax. I have added a number of inline comments to explain some choices. All of this would be extended at some point and folded into the sampling book rather than kept here, but I’ve included it as a tracked file for convenience.

As there are quite a few non-standard parts here, I will submit this as a draft PR for now, hoping for some higher-level feedback before getting too far into the weeds. Hopefully there is enough here for you to go on, @AdrienCorenflos, as an initial look, and the example works out of the box for you.

williamjameshandley and others added 30 commits May 17, 2025 12:18
…pler and update Nested Sampling

This commit enhances the Slice Sampler (`blackjax.mcmc.ss`) with the ability to handle generic constraints beyond the log-density slice itself. Nested Sampling (`blackjax.ns`) implementations are refactored to leverage this new capability for enforcing the likelihood constraint.

Slice Sampler (`mcmc.ss`):
- The `kernel` and `horizontal_slice_proposal` functions now accept `constraint_fn`, `constraint`, and `strict` arguments. These allow specifying an additional function whose output must satisfy a given bound for a proposal to be considered "within" the slice.
- During the stepping-out and shrinking procedures, proposed points `x` are now checked against `constraint_fn(x) > constraint` (or `>=` if `strict` is False) in addition to `logdensity_fn(x) >= log_slice_height`.
- `SliceInfo` now includes `constraint` (the value of `constraint_fn` at the accepted point).
- Type hints for `l_steps`, `r_steps`, `s_steps`, and `evals` in `SliceInfo` are changed from `Array` to `int`.
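The acceptance test described in these bullets can be sketched as a single predicate. This is illustrative only, not the PR's exact code; a proposal must both lie on the slice and satisfy the extra constraint:

```python
def within_slice(x, logdensity_fn, log_slice_height,
                 constraint_fn, constraint, strict=True):
    # On-slice test from standard slice sampling.
    on_slice = logdensity_fn(x) >= log_slice_height
    # Additional generic constraint; `strict` selects > versus >=.
    c = constraint_fn(x)
    satisfies = (c > constraint) if strict else (c >= constraint)
    return on_slice and satisfies
```

In the nested sampling use case, `logdensity_fn` would be the log-prior and `constraint_fn` the loglikelihood with the current threshold as `constraint`.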

Nested Sampling (`ns`):
- The previous approach in `ns.base` of creating a `constrained_logdensity_fn` (which combined prior and likelihood constraint by returning -inf) has been removed.
- The `inner_kernel` in `ns.base` and `ns.adaptive` (and their initializers) now explicitly receives `logprior_fn`, `loglikelihood_fn`, and `loglikelihood_0`.
- In `ns.nss` (Nested Slice Sampling):
    - The `inner_kernel` now passes `logprior_fn` as the `logdensity_fn` to the slice sampler.
    - The likelihood constraint (`loglikelihood_fn(x) > loglikelihood_0`) is passed as the new explicit constraint to the slice sampler using `constraint_fn`, `constraint`, and `strict`.
    - Introduced `NSSInnerState` and `NSSStepInfo` for better state and information management within the nested slice sampling context.
- `inner_init_fn` signatures in NS modules are updated to reflect the separation of prior, likelihood, and the particle's current state.

This change decouples the likelihood constraint logic from the main log-density function passed to the slice sampler. This leads to a cleaner interface and a more direct way of handling constraints within Nested Sampling, particularly for the likelihood threshold.

It also reduces the number of calls to the likelihood and prior, and gives users access to the slice sampling chain results for further analysis.
…ce sampling

- Add tests/ns/test_nested_sampling.py with 9 tests covering:
  * Base nested sampling initialization and particle deletion
  * Adaptive nested sampling parameter updates
  * Nested slice sampling direction functions and kernel construction
  * Utility functions for log-volume simulation and live point counting

- Add tests/mcmc/test_slice_sampling.py with 11 tests covering:
  * Slice sampler initialization and vertical slice height sampling
  * Multi-dimensional slice sampling (1D, 2D, 5D)
  * Constrained slice sampling and direction generation
  * Hit-and-run slice sampling top-level API
  * Statistical correctness validation with robust error handling

- All tests follow BlackJAX conventions using chex.TestCase, parameterized
  testing, and proper shape/tree validation
- Tests are optimized for fast execution with reduced sample sizes
- Comprehensive coverage of both core functionality and edge cases

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove incorrect assertion that number of live points must be monotonically decreasing
- In real nested sampling, live points follow a sawtooth pattern as particles die and are replenished
- Fix evidence estimation test to use proper utility functions instead of skipping
- Create more realistic mock data with varied birth likelihoods
- Add proper documentation explaining why monotonic decrease assumption is wrong
- Reference: Fowlie et al's plateau nested sampling shows live points can increase

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove unused imports from test files (functools, numpy, blackjax, etc.)
- Fix MyPy type annotation issues:
  * Add Dict type annotation for inner_kernel_params in base.py
  * Rename duplicate nested function names in ss.py (shrink_body_fun, shrink_cond_fun)
  * Add proper type annotation for params parameter in nss.py
  * Add None check for optional update_inner_kernel_params_fn in adaptive.py
- Update test comment to clarify scope rather than suggesting "skip"
- Apply Black formatting and fix import ordering
- All pre-commit hooks now pass successfully

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…sampling

- Replace weak evidence tests with proper analytic validation
- Test unnormalized Gaussian likelihood with uniform prior using exact analytic solution
- Validate evidence estimates against analytic values using statistical consistency
- Generate 500-1000 Monte Carlo evidence samples to test distribution properties
- Check that analytic evidence falls within 95-99% confidence intervals
- Add test cases for:
  * Unnormalized Gaussian exp(-0.5*x²) with uniform prior [-3,3]
  * Narrow prior challenging case with full Gaussian likelihood
  * Constant likelihood case for baseline validation
- Use proper Bayesian evidence formula: Z = ∫ p(x) * L(x) dx
- Statistical validation with confidence intervals rather than arbitrary tolerances
- Addresses requirement for evidence correctness testing with known analytic solutions
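The first analytic case above can be checked by hand: with a uniform prior on [-3, 3] (density 1/6) and unnormalized likelihood exp(-0.5*x²), the evidence has a closed form in terms of the error function. A quick sketch of this check, independent of the test code itself:

```python
import math

# Z = ∫ p(x) L(x) dx = (1/6) * ∫_{-3}^{3} exp(-0.5 x^2) dx
#   = sqrt(2*pi) * erf(3 / sqrt(2)) / 6
z_analytic = math.sqrt(2 * math.pi) * math.erf(3 / math.sqrt(2)) / 6

# Cross-check with a simple midpoint quadrature rule.
n = 100_000
width = 6 / n
z_numeric = sum(
    math.exp(-0.5 * (-3 + (i + 0.5) * width) ** 2) / 6 for i in range(n)
) * width
print(f"{z_analytic:.6f} {z_numeric:.6f}")
```

Both values agree to about 0.4166, giving a concrete target that the Monte Carlo evidence samples should bracket within their confidence intervals.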

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
This commit refactors state management in the Slice Sampler and simplifies
the inner kernel handling in the Nested Sampling base.

**Slice Sampler (`mcmc.ss`):**
- `SliceState` now includes a `logslice` field (defaulting to `jnp.inf`)
  to store the height sampled during the vertical slice.
- `vertical_slice` now takes the full `SliceState`, calculates `logslice`,
  and returns an updated `SliceState`. Its signature changes from
  `(rng_key, logdensity_fn, position)` to `(rng_key, state)`.
- `horizontal_slice` now takes `SliceState` (which includes `logslice`)
  instead of separate `x0` and `log_slice_height`. Its signature changes
  to accept `state: SliceState` as its second argument.
- These changes centralize slice-related information, improving data flow
  within the slice sampling kernel. `ss.init` remains the primary way to
  initialize `SliceState`, and `ss.build_kernel` API is unchanged.
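A minimal sketch of the refactored vertical slice step (the field names follow the commit message, but the body is illustrative, not the PR's code):

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp

class SliceState(NamedTuple):
    position: jax.Array
    logdensity: jax.Array
    logslice: jax.Array = jnp.inf  # height sampled during the vertical slice

def vertical_slice(rng_key, state):
    # Sample a height uniformly under the density:
    # logslice = logdensity + log(u), with u ~ Uniform(0, 1).
    log_u = jnp.log(jax.random.uniform(rng_key))
    return state._replace(logslice=state.logdensity + log_u)

state = SliceState(position=jnp.array([0.0]), logdensity=jnp.array(-1.0))
state = vertical_slice(jax.random.PRNGKey(0), state)
```

Carrying `logslice` inside the state is what lets `horizontal_slice` take a single `SliceState` argument instead of separate `x0` and `log_slice_height`.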

**Nested Sampling (`ns.base`, `ns.nss`):**
- Removed the `inner_kernel_init_fn` parameter from `ns.base.build_kernel`.
  The `NSInnerState` is now always initialized directly within the NS loop
  before calling the vmapped `inner_kernel`. This is a breaking API change
  for users who might have been providing a custom `inner_kernel_init_fn`.
- Introduced `ns.base.new_state_and_info` helper function to standardize
  the creation of `NSInnerState` and `NSInnerInfo`.
- `ns.nss.inner_kernel` is adapted to use the updated `SliceState` from
  `mcmc.ss` and now utilizes the `new_state_and_info` helper.

These modifications enhance code clarity and maintainability. Tests for
slice sampling and nested sampling have been updated accordingly.
- Rename NSInnerState/NSInnerInfo to PartitionedState/PartitionedInfo for posterior repartitioning
- Add comprehensive docstrings explaining separation of log-prior and log-likelihood components
- Improve type annotations and docstring consistency across slice sampling modules
- Standardize SliceInfo with default values for clean initialization
- Fix function signatures and return types throughout nested sampling codebase
- Enable posterior repartitioning techniques through explicit component separation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…ampling

Clean up type annotations by removing redundant # type: ignore directives:
- Remove type ignore from SamplingAlgorithm return in nss.py
- Remove type ignores from NSInfo constructor in utils.py

All type checks now pass without suppression directives.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Fix broadcasting error by vmapping over both random keys and inner state.
The inner kernel expects single particles, not batches, so we need to
vmap over the PartitionedState structure (axis 0) in addition to the keys.

This resolves the "mul got incompatible shapes for broadcasting" error
that occurred when constraint functions tried to evaluate likelihood
on vectorized inputs.
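The fix described here amounts to mapping over both argument batches. A small illustrative sketch with placeholder names (not the PR's actual kernel):

```python
import jax
import jax.numpy as jnp

# Hypothetical per-particle inner step: expects ONE key and ONE particle.
def inner_step(rng_key, position):
    return position + 0.1 * jax.random.normal(rng_key, position.shape)

keys = jax.random.split(jax.random.PRNGKey(0), 8)   # batch of 8 keys
positions = jnp.zeros((8, 2))                        # batch of 8 particles

# vmap over axis 0 of BOTH the keys and the state; mapping only over the
# keys would hand the full (8, 2) batch to a single-particle kernel and
# produce exactly this kind of broadcasting error.
stepped = jax.vmap(inner_step, in_axes=(0, 0))(keys, positions)
```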

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
feat: Major nested sampling and slice sampling algorithm enhancements
This PR is a large refactor of the base functions presented originally. The core nested sampling framework (base, adaptive, utils) has been significantly refactored for enhanced flexibility, improved state tracking, and a clearer API. The previous vectorized_slice module has been removed, superseded by the more general HRSS implementation, which is included as a more general mcmc kernel.