[CPU] Enable Weightless models cache #29304

nshchego · 2025-03-06T04:09:47Z

Details:

CPU plugin: minimize the size of the cached blob by reusing weights from the original bin file.
Some APIs were extended to pass the original weights.
The IR serializer and deserializer were modified to handle both weight sources, because the CPU plugin uses them to write and read the cache file.
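
The idea of reusing weights from the original bin file can be illustrated with a minimal, self-contained sketch. This is not the PR's implementation: `WeightRef`, `load_constant`, and the in-memory byte vector standing in for the bin file are hypothetical names, used only to show how a cache entry might record an offset and size into the original weights rather than a copy of the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical cache record: where a constant lives in the original weights.
struct WeightRef {
    std::size_t offset;  // byte offset into the original weights
    std::size_t size;    // byte length of the constant
};

// Hypothetical helper: resolve a reference against the original weights
// instead of reading a copy stored in the cache blob.
// Throws std::out_of_range if the reference does not fit.
inline std::vector<std::uint8_t> load_constant(const std::vector<std::uint8_t>& original_bin,
                                               const WeightRef& ref) {
    if (ref.offset > original_bin.size() || ref.size > original_bin.size() - ref.offset) {
        throw std::out_of_range("weight reference exceeds original weights size");
    }
    const auto first = original_bin.begin() + static_cast<std::ptrdiff_t>(ref.offset);
    const auto last = first + static_cast<std::ptrdiff_t>(ref.size);
    std::vector<std::uint8_t> result(first, last);
    assert(result.size() == ref.size);
    return result;
}
```

With this layout the cache would need to store only the two integers per constant, which is where the size reduction would come from; the cost is that the original weights must be available when the cache is read.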

Tickets:

161826

ilya-lavrenov · 2025-03-06T06:39:16Z

src/plugins/intel_cpu/src/utils/serialize.hpp

@@ -41,6 +41,10 @@ class ModelDeserializer {

    void operator>>(std::shared_ptr<ov::Model>& model);

+    void set_weights_path(std::string& weights_path) {
+        m_weights_path = weights_path;


Please note that the case where the model is compiled via compile_model(ov::Model) should also be supported, through hint::model_ptr (see PR #29107; NPU and GPU work is in progress).

ilya-lavrenov · 2025-03-06T06:43:15Z

src/frontends/ir/src/ir_deserializer.cpp

+                                std::shared_ptr<char[]> new_buf(new char[actual_size]);
+                                data = new_buf.get();
+                                weights_buf = std::make_shared<ov::SharedBuffer<std::shared_ptr<char[]>>>(data, actual_size, new_buf);
+                                convert_dt(el_type, original_dt, data, m_weights->get_ptr<char>() + offset, el_num);


Do we perform constant conversion directly in the IR FE, in a suboptimal way?

nshchego · 2025-03-13

Yes. We need the converted values during node creation; otherwise some nodes cannot pass 'validate_and_infer_types' and graph compilation fails.

ilya-lavrenov · 2025-03-13

I don't think converting constants from one type to another is the responsibility of the IR reader.
Should the original saving logic instead implement such conversion steps as constant subgraphs that are read as is?

The plugin can later fold such subgraphs to get constants in the desired precision.

Or at least original_precision should be applied at the plugin level, with faster functions than manual conversions.

praasz · 2025-03-17

I agree with @ilya-lavrenov: the deserializer should just read the XML, and the additional conversion should not be there. The plugin should apply any conversion if required.

nshchego · 2025-03-19

I understand your concern, but forcing precision may lead to precision propagation. That would modify the graph the plugin saved earlier and would require running the transformation pipeline, which makes model caching pointless.

praasz · 2025-03-19

I mean: do not modify the graph, but use the correct weights and apply the conversion only to the original weights, if required, and not in the (de)serialization part.
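
For readers unfamiliar with the conversion step debated in this thread, here is a minimal sketch of what converting a constant on read involves: the stored bytes are interpreted in their original element type and written, converted, into a newly allocated buffer. `convert_constant` is a hypothetical helper written for illustration; it is not the `convert_dt` function from the diff, and it takes no position on whether the step belongs in the deserializer or in the plugin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: read el_num elements of type Src from a raw byte
// buffer and return them converted to Dst in a newly allocated buffer.
template <typename Src, typename Dst>
std::vector<Dst> convert_constant(const char* raw, std::size_t el_num) {
    assert(raw != nullptr || el_num == 0);
    std::vector<Dst> out(el_num);
    for (std::size_t i = 0; i < el_num; ++i) {
        Src value;
        // memcpy avoids alignment and aliasing problems when reading raw bytes.
        std::memcpy(&value, raw + i * sizeof(Src), sizeof(Src));
        out[i] = static_cast<Dst>(value);
    }
    return out;
}
```

An element-by-element loop like this is the kind of manual conversion the reviewers call suboptimal; a production version would presumably use an optimized routine instead.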

github-actions · 2025-05-14T00:26:56Z

This PR will be closed in a week because of 2 weeks of no activity.

github-actions · 2025-05-21T00:27:28Z

This PR was closed because it has been stalled for 2 weeks with no activity.

nshchego force-pushed the cpu/weightless_cache branch from 8582cc3 to ea30e62 on March 6, 2025 04:18

github-actions bot removed the "category: samples" (OpenVINO Runtime Samples) label Mar 6, 2025

nshchego force-pushed the cpu/weightless_cache branch 3 times, most recently from ab254a4 to ea0e3f7 on March 6, 2025 04:40

ilya-lavrenov reviewed Mar 6, 2025

nshchego force-pushed the cpu/weightless_cache branch 3 times, most recently from 1569b18 to e5800b0 on March 12, 2025 13:39

nshchego marked this pull request as ready for review March 13, 2025 09:44

nshchego requested review from a team as code owners March 13, 2025 09:44

nshchego requested review from itikhono and removed request for a team March 13, 2025 09:44

praasz self-assigned this Mar 13, 2025

t-jankowski reviewed Mar 13, 2025

nshchego force-pushed the cpu/weightless_cache branch from e5800b0 to 96d3c54 on March 17, 2025 08:21

nshchego requested a review from a team as a code owner March 17, 2025 08:21

nshchego force-pushed the cpu/weightless_cache branch 16 times, most recently from 8c52b1e to 2327eb1 on April 28, 2025 10:14

nshchego added 5 commits April 29, 2025 22:08

[CPU] Enable Weightless models cache

d2415e0

Fixes as per comments

5aacbc6

Use optimized convert from the Reference lib

3d5ecae

Eliminate weightless attr

e6654ed

Fix for Serializer

fe5ebe7

nshchego force-pushed the cpu/weightless_cache branch from 2327eb1 to fe5ebe7 on April 29, 2025 18:09

github-actions bot added the Stale label May 14, 2025

github-actions bot closed this May 21, 2025

nshchego reopened this May 26, 2025

github-actions bot removed the Stale label May 27, 2025

praasz added this to the 2025.3 milestone May 28, 2025

praasz added the "no_stale" (Do not mark as stale) label May 28, 2025
