Skip to content

Feature request: add Nearest Neighbor Descent method #30

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Yunuuuu opened this issue Apr 25, 2025 · 5 comments
Open

Feature request: add Nearest Neighbor Descent method #30

Yunuuuu opened this issue Apr 25, 2025 · 5 comments

Comments

@Yunuuuu
Copy link

Yunuuuu commented Apr 25, 2025

The R package is deposited here: https://jlmelville.github.io/rnndescent/index.html

@LTLA
Copy link
Member

LTLA commented Apr 25, 2025

There are two possible approaches.

The first is to write knncolle wrappers around a hypothetical C++ library that implements the NN-descent approach. This would then be plug-and-play into BiocNeighbors with full compatibility across R and C++. I don't mind writing these wrappers, but it requires someone to first strip out the C++ code from rnndescent and make it work as a standalone library, see these old comments.

The second is to extend BiocNeighbors at the R level. This is easier and can be done by anyone, but precludes the use of NN-descent in C++ code across my other BioC packages. For example, you wouldn't be able to use NN-descent during MNN correction, or in SingleR, etc. Also, this would involve importing the rnndescent package, which I am reluctant to introduce to the BiocNeighbors dependency tree. I would prefer that it be written as a separate package that extends BiocNeighbors. In any case, I don't feel like doing this myself, so if you really like NN-descent, you may consider writing this package - see the BiocNeighbors vignette for extension instructions.

I also vaguely remember that NN-descent is intended for fine-tuning existing NN-search results, which complicates the situation. Should it be written as a "meta-algorithm" that requires another approximate search algorithm to generate the initial results? Something for the implementer to consider.

@Yunuuuu
Copy link
Author

Yunuuuu commented Apr 25, 2025

Thanks for the detailed explanation .

I actually gave the R-level implementation of NN-descent a try and attempted to wire it into BiocNeighbors, but I ran into some difficulties when integrating it with scran::correctmnn. Eventually, I gave up on this approach.

@Yunuuuu
Copy link
Author

Yunuuuu commented Apr 25, 2025

The UMAP based on the Nearest Neighbor Descent method is really elegant — it's the default in scanpy, though I’m not entirely sure if the beautiful results are purely due to the method itself. Either way, I’d love to have a similarly seamless experience within the Bioconductor ecosystem. That’s part of what motivated me to explore tighter integration of NN-descent, even if it’s been a bit tricky so far.

Image

Image

@LTLA
Copy link
Member

LTLA commented Apr 25, 2025

I don't have much to say on the aesthetic value of these plots, but if you think it's a worthwhile goal, you should probably check if you can reproduce them with rnndescent and uwot first.

@Yunuuuu
Copy link
Author

Yunuuuu commented Apr 25, 2025

Thanks — I gave it a try, and it seems that the neighbors from rnndescent alone don’t fully account for the nice UMAP results. It looks like something else in the pipeline is also contributing to the final outcome. I’ll need to dig a bit deeper to figure out what’s going on under the hood.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants