Installing llm-sentence-transformers degrades startup time #915


Open
ricardog opened this issue Apr 12, 2025 · 1 comment

Comments

@ricardog

On my Mac, startup of the llm process slowed down significantly after I installed the llm-sentence-transformers plugin. I measured startup time by running

time ./my-cli.py

where the script is simply

❯ cat my-cli.py
#!/usr/bin/env python3

import llm.cli

Without the plugin, startup takes about 0.5s. With the plugin, it takes about 4s. The plugin pulls in a lot of packages at import time that are not needed for startup.
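To see which packages account for the extra time, CPython's `-X importtime` option (Python 3.7+) can be used. The following is a minimal sketch, not part of llm or the plugin: the helper name `import_timings` is made up for illustration, and it simply parses the report the interpreter writes to stderr.

```python
import subprocess
import sys


def import_timings(module_name):
    """Import module_name in a fresh interpreter and return a list of
    (cumulative_us, self_us, name) tuples, slowest first."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        # Report lines look like: "import time:   self | cumulative | name"
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us, cumulative_us = int(parts[0]), int(parts[1])
        except ValueError:
            continue  # header row, not a measurement
        rows.append((cumulative_us, self_us, parts[2].strip()))
    return sorted(rows, reverse=True)
```

Calling `import_timings("llm.cli")` with and without the plugin installed, and printing the first dozen rows, should show which dependencies dominate the cumulative time.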

I was able to speed up startup with the package by moving the import of SentenceTransformer to the place where it is needed, i.e.

class SentenceTransformerModel(llm.EmbeddingModel):
    def __init__(self, model_id, model_name, trust_remote_code):
        self.model_id = model_id
        self.model_name = model_name
        self.trust_remote_code = trust_remote_code
        self._model = None

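    def embed_batch(self, texts):
        # Deferred import: the heavy dependency is only loaded when embeddings are requested
        from sentence_transformers import SentenceTransformer
        with disable_logging():
            ...  # rest of the method unchanged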
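For reference, here is a self-contained sketch of the same deferred-import pattern. Everything in it is a stand-in: `LazyModel` is a hypothetical class, and `hashlib` plays the role of the heavy dependency so the example runs without any plugin installed. It is not the plugin's actual code.

```python
class LazyModel:
    """Hypothetical model class that defers its dependency until first use."""

    def __init__(self, model_name):
        self.model_name = model_name
        self._hasher = None  # filled in on the first embed_batch() call

    def _load(self):
        if self._hasher is None:
            # Import inside the method, so that importing this module stays cheap.
            # hashlib stands in for a slow import such as sentence_transformers.
            import hashlib

            self._hasher = hashlib.sha256
        return self._hasher

    def embed_batch(self, texts):
        hasher = self._load()
        # Fake "embedding": the first four digest bytes scaled into [0, 1].
        return [
            [b / 255 for b in hasher(t.encode("utf-8")).digest()[:4]]
            for t in texts
        ]
```

The import cost is paid once, on the first call to `embed_batch`, and commands that never compute embeddings do not pay it at all.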

Installing llm_mlx also degrades startup time for the same reason: it loads an enormous number of packages. I haven't yet looked at how to move the imports in llm_mlx to speed up startup.

@ricardog
Author

Here's a diff to speed up startup with the llm_mlx plugin installed. It removes the three module-level imports and re-adds each one inside the code that uses it.

< import mlx.core as mx
< from mlx_lm import load, stream_generate
< from mlx_lm.sample_utils import make_sampler
202a200
>         from mlx_lm import load, stream_generate
207a206,207
>         from mlx_lm.sample_utils import make_sampler
>
247a248
>             import mlx.core as mx

Also, I just realized the plugins live in separate repos, so perhaps this issue should be moved there. On the other hand, it would be nice if there were a way for llm to limit the time taken by plugin imports.
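One possible direction, offered only as a sketch: the standard library's `importlib.util.LazyLoader` postpones executing a module until one of its attributes is first accessed. The helper below follows the recipe in the `importlib` documentation. The name `lazy_import` is made up here, and nothing in this issue indicates that llm loads plugins this way.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose body runs on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]  # already imported, nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find module {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only sets up the lazy module, does not run it yet
    return module


json = lazy_import("json")  # cheap at this point
print(json.dumps([1, 2, 3]))  # the real import happens on this attribute access
```

This postpones the cost rather than removing it, and modules that rely on import-time side effects may not behave well when loaded this way, so moving the imports inside the plugin as shown above is probably the more predictable fix.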
