[CPU] Enable Weightless models cache #29304
@@ -41,6 +41,10 @@ class ModelDeserializer {

    void operator>>(std::shared_ptr<ov::Model>& model);

    void set_weights_path(std::string& weights_path) {
        m_weights_path = weights_path;
std::shared_ptr<char[]> new_buf(new char[actual_size]);
data = new_buf.get();
weights_buf = std::make_shared<ov::SharedBuffer<std::shared_ptr<char[]>>>(data, actual_size, new_buf);
convert_dt(el_type, original_dt, data, m_weights->get_ptr<char>() + offset, el_num);
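For readers following the thread, the kind of element-wise conversion under discussion can be sketched as below. This is only an illustration under stated assumptions: `convert_elements` is a hypothetical helper, not the PR's `convert_dt`, and the element types are plain C++ types rather than OpenVINO element types. It reads `count` values of one type from a raw byte blob and returns them converted to another type in a freshly allocated buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (an assumption for illustration, not the PR's convert_dt):
// reads `count` elements of type Src from the raw bytes at `src` and returns
// a newly allocated buffer holding the same values converted to Dst.
template <typename Src, typename Dst>
std::vector<Dst> convert_elements(const char* src, std::size_t count) {
    std::vector<Dst> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        Src value;
        // memcpy avoids unaligned reads when the blob offset is arbitrary.
        std::memcpy(&value, src + i * sizeof(Src), sizeof(Src));
        out[i] = static_cast<Dst>(value);
    }
    return out;
}
```

A call such as `convert_elements<std::int32_t, float>(blob + offset, el_num)` allocates and fills a new buffer on every read, which is the cost the reviewers question in the comments that follow.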
Do we perform constant conversion directly in the IR FE, in a suboptimal way?
Yes, we need the converted values during node creation; otherwise some nodes do not pass 'validate_and_infer_types' and graph compilation fails.
I don't think converting constants from one type to another is the responsibility of the IR reader.
Should the original saving logic instead implement such conversion steps as constant subgraphs that are read as is?
The plugin can later fold those subgraphs to get constants in the desired precision.
Or at least original_precision should be applied at the plugin level, with faster functions than manual conversions.
Agree with @ilya-lavrenov: the deserializer should just read the XML, and no additional conversion should happen there. The plugin should apply any conversion if required.
I do understand your concern, but forcing precision may lead to precision propagation. That would modify the graph the plugin saved earlier and would require running the transformations pipeline, which makes model caching pointless.
I mean not modifying the graph, but using the correct weights and applying a conversion only to the original weights if required, just not in the (de)serialization part.
This PR will be closed in a week because of 2 weeks of no activity.
This PR was closed because it has been stalled for 2 weeks with no activity.
Details:
Tickets: