Reaching out to mentors for project 5. "OpenVINO AI PC Model Training Kit" GSoC 2025 #29389
-
Hello @sbasia, Thank you for the detailed response! Based on what you've shared, I’ve been thinking about potential directions for the training wrapper and have a few questions and ideas that might help refine the project.

Given that OpenVINO Training Extensions already provide a structured way to fine-tune models, would it make sense to integrate parts of their auto-configuration capabilities into this project? This could simplify model selection and hyperparameter tuning for users working with AI PCs. Alternatively, should we aim for a more flexible approach where users define their own configurations from scratch?

For quantization, since NNCF already supports Quantization-Aware Training (QAT), one possible approach could be enabling quantization during training rather than post-training, reducing the number of retraining steps needed. Additionally, since OpenVINO focuses heavily on inference optimization, have there been any discussions about implementing mixed-precision training to improve efficiency on AI PCs?

Regarding integration with PyTorch/TensorFlow, would using torch.compile as a backend for training be viable, similar to how OpenVINO optimizes inference? If so, how do you see this affecting training loops and backpropagation efficiency? If torch.compile is not the ideal path, do you think adapting lower-level OpenVINO operations for training is the better approach?

If we assume that OpenVINO will serve as a seamless backend for training, this presents exciting opportunities but also some challenges that need to be addressed. Here are some ideas that could help optimize this integration:

- Extending torch.compile for training: Currently, torch.compile allows for inference optimization using OpenVINO as a backend. Could this be extended to cover backpropagation and weight updates as well? This might require adapting gradient operations and potentially modifying TorchInductor to support Intel XPU-optimized kernels.
- Multi-device training (heterogeneous execution): If OpenVINO will be a native backend for training, it would be interesting to explore distributed execution across CPUs, iGPUs, and dGPUs, depending on the AI PC’s hardware configuration. Is this approach being considered, or is the initial focus primarily on CPU-based training?
- Reducing overhead in framework conversion: One major challenge could be ensuring full compatibility between PyTorch/TensorFlow and OpenVINO during training. If OpenVINO manages both forward and backward computations, how will it handle custom activation functions, specialized layers, or unsupported operators? Will there be a fallback mechanism for hybrid execution with PyTorch/TensorFlow?

If the goal is to provide a seamless backend for training, the architecture will need to handle quantization, multi-device support, optimized backpropagation, and full integration with PyTorch/TensorFlow. Among these, which areas do you see as the highest priority for the initial design?

And lastly, I apologize if these are too many questions; I just want to make sure I fully understand the project’s direction. Thanks again for your time and insights! Looking forward to your thoughts.

Best regards,
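P.S. To make the torch.compile question more concrete, here is a minimal sketch of how the inference-only path looks today (a sketch assuming a recent OpenVINO release, where importing openvino.torch registers the backend; the toy model and shapes are just placeholders):

import torch
import torch.nn as nn
import openvino.torch  # importing this registers the "openvino" torch.compile backend

# A toy model standing in for whatever the wrapper would train
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10)).eval()

# Today this only accelerates the forward pass; the question above is whether the
# same mechanism could capture backward/weight-update graphs as well
compiled = torch.compile(model, backend="openvino")

with torch.no_grad():
    print(compiled(torch.randn(8, 64)).shape)  # torch.Size([8, 10])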
-
Hi @leoheim, Please find my responses to your questions:

(ii) Quantization: You are right about the approach of quantizing during training, which I explained in my earlier response. We definitely want to enable that feature in the wrapper. Mixed-precision quantization support is also already enabled for PyTorch through OpenVINO (https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/training_time_compression/other_algorithms/LegacyQuantization.md#mixed-precision-quantization).

(iii) torch.compile: There will surely be impacts from integrating the torch.compile functionality for the seamless backend, but we want to start with that approach and try modifying TorchInductor to support the most common gradient operations. We can definitely increase the support once we have a working design for one or two models.

(iv) Multi-device training: Distributed training across different XPUs would be a plus, but it is optional for this project.

(v) Reducing overhead in framework conversion: Full compatibility between PyTorch/TensorFlow and OpenVINO may not be possible within the current project timeline, so the idea is to first enable full compatibility for the layers, activation functions, and operators used in one or two popular models, and then gradually extend the support to other activation functions and operators. There can be a fallback mechanism for unsupported activation functions, layers, and operators that directly utilizes the PyTorch/TensorFlow framework.

Lastly, regarding your request to review the application before submission: I need to check with my program coordinators whether I am authorized to do that. I will check with them and get back to you. In the meantime, please continue to add points to and structure your application, and ensure you have completed the prerequisite as well.
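For reference, a minimal sketch of the legacy NNCF QAT flow that the linked document builds on (a sketch only, not project code: the toy model and input shape are placeholders, and quantizer range initialization via register_default_init_args is omitted for brevity):

import torch.nn as nn
from nncf import NNCFConfig
from nncf.torch import create_compressed_model

# A toy model standing in for the user's network
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

# Legacy NNCF config: inserts fake-quantization ops so training becomes quantization-aware
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 64]},
    "compression": {"algorithm": "quantization"},
})

# Wraps the model with quantizers; compression_ctrl drives the algorithm during training
compression_ctrl, qat_model = create_compressed_model(model, nncf_config)

# ...train qat_model with a normal PyTorch loop, then export for OpenVINO inference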
-
Hello @SHIVAMBASIA, Thank you for your detailed responses; they've greatly clarified several points for me. I'm currently working on my GSoC application. I'm excited to further explore the concepts and techniques related to this project, and I'm actively deepening my understanding of how to effectively use PyTorch, scikit-learn, and TensorFlow for it. I eagerly await your update regarding the application review. Thanks again for your valuable guidance! Best regards,
-
Dear Mentors, I hope you are both doing well. Before diving in further, I’d like to thank you all (@leoheim, @SHIVAMBASIA, @adrianboguszewski); the previous conversations have really helped clarify some of the more vague aspects of the project.

I had a few technical questions around framework compatibility. Specifically, will we be implementing a custom backend or extensions for PyTorch and TensorFlow? For instance, I came across this PyTorch C++ extension tutorial, which might be relevant to how we structure the integration. From previous discussions, I understand that the focus will be on optimizing operations commonly used in widely adopted models. I’d love to understand which ops we plan to start with, and how deep the integration with each framework (especially PyTorch and TensorFlow) is expected to be.

To share a concrete example and check that my understanding aligns with the project’s direction, here’s a basic C++ prototype of a Torch extension that performs a MatMul (with an optional bias):

#include <torch/extension.h>
#include <openvino/openvino.hpp>
#include <cstring>  // std::memcpy
#include <memory>
// Global compiled model, built lazily on first call.
// NOTE: this simple cache ignores later changes to dtype, shapes, or bias usage;
// a real implementation would key the cache on that configuration.
std::shared_ptr<ov::CompiledModel> compiled_model;
ov::Core core;
// Map the Torch scalar type to the corresponding OpenVINO element type
template <typename T>
ov::element::Type infer_ov_type();
template <>
ov::element::Type infer_ov_type<float>() { return ov::element::f32; }
template <>
ov::element::Type infer_ov_type<double>() { return ov::element::f64; }
template <>
ov::element::Type infer_ov_type<at::BFloat16>() { return ov::element::bf16; }
template <typename T>
void initialize_openvino_model(int64_t input_dim, int64_t output_dim, bool use_bias) {
    ov::element::Type ov_dtype = infer_ov_type<T>();
    // Dynamic batch size (first dimension is dynamic)
    ov::PartialShape shape_A{ov::Dimension::dynamic(), input_dim};
    ov::PartialShape shape_B{input_dim, output_dim};
    ov::PartialShape shape_bias{1, output_dim};
    auto input_A = std::make_shared<ov::op::v0::Parameter>(ov_dtype, shape_A);
    auto input_B = std::make_shared<ov::op::v0::Parameter>(ov_dtype, shape_B);
    std::shared_ptr<ov::Node> output_node;
    std::shared_ptr<ov::op::v0::Parameter> bias;
    if (use_bias) {
        bias = std::make_shared<ov::op::v0::Parameter>(ov_dtype, shape_bias);
        auto matmul = std::make_shared<ov::op::v0::MatMul>(input_A, input_B);
        output_node = std::make_shared<ov::op::v1::Add>(matmul, bias);
    } else {
        output_node = std::make_shared<ov::op::v0::MatMul>(input_A, input_B);
    }
    auto model = std::make_shared<ov::Model>(ov::OutputVector{output_node}, ov::ParameterVector{input_A, input_B});
    if (use_bias) {
        model->add_parameters({bias});
    }
    // Compile once for CPU; other devices ("GPU", "NPU") could be selected here
    compiled_model = std::make_shared<ov::CompiledModel>(core.compile_model(model, "CPU"));
}
template <typename T>
torch::Tensor openvino_matmul(torch::Tensor a, torch::Tensor b, c10::optional<torch::Tensor> bias_opt) {
    using DataType = T;
    TORCH_CHECK(a.dim() == 2 && b.dim() == 2, "expected 2-D input tensors");
    TORCH_CHECK(a.size(1) == b.size(0), "inner dimensions of a and b must match");
    // OpenVINO reads raw host pointers, so the tensors must be contiguous
    a = a.contiguous();
    b = b.contiguous();
    const int64_t batch_size = a.size(0);
    const int64_t input_dim = a.size(1);
    const int64_t output_dim = b.size(1);
    const bool use_bias = bias_opt.has_value();
    // Lazily compile the model on first use (see the caching caveat above)
    if (!compiled_model) {
        initialize_openvino_model<DataType>(input_dim, output_dim, use_bias);
    }
    ov::element::Type ov_dtype = infer_ov_type<DataType>();
    // Wrap the Torch buffers as OpenVINO tensors without copying
    ov::Tensor tensor_A(ov_dtype, {static_cast<size_t>(batch_size), static_cast<size_t>(input_dim)}, a.data_ptr<DataType>());
    ov::Tensor tensor_B(ov_dtype, {static_cast<size_t>(input_dim), static_cast<size_t>(output_dim)}, b.data_ptr<DataType>());
    auto infer_request = compiled_model->create_infer_request();
    infer_request.set_input_tensor(0, tensor_A);
    infer_request.set_input_tensor(1, tensor_B);
    torch::Tensor bias;  // declared here so the buffer stays alive until inference completes
    if (use_bias) {
        bias = bias_opt.value().contiguous();
        ov::Tensor tensor_bias(ov_dtype, {1, static_cast<size_t>(output_dim)}, bias.data_ptr<DataType>());
        infer_request.set_input_tensor(2, tensor_bias);
    }
    infer_request.infer();
    // Copy the result back into a fresh Torch tensor
    ov::Tensor output_tensor = infer_request.get_output_tensor(0);
    torch::Tensor output = torch::empty({batch_size, output_dim}, a.options());
    std::memcpy(output.data_ptr<DataType>(), output_tensor.data<DataType>(), output_tensor.get_byte_size());
    return output;
}
// Python bindings
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
    // Only the float32 instantiation is exposed here; f64/bf16 could be added similarly
    m.def("openvino_matmul", &openvino_matmul<float>, "Optimized MatMul with optional bias using OpenVINO (float)");
}

With this setup, we could expose the OpenVINO-optimized operation via a Python binding and use it directly within any nn.Module in PyTorch. This approach seems promising for prototyping operator-level acceleration. Please let me know your thoughts, particularly on whether this direction aligns with your vision for the wrapper. I’m eager to learn more and contribute meaningfully.
-
Hello Shivam and Aishwarye,
I hope you’re both doing well! My name is Leonardo, and I’m very interested in contributing to the OpenVINO AI PC Model Training Kit project as part of GSoC. I’m currently in my second year of Computer Science, and I’m eager to learn and grow in the field of Deep Learning and model optimization. I’m particularly excited about this project because it aligns perfectly with my interest in AI model training and inference optimization. I see this as a great opportunity to learn from both of you and gain a deeper understanding of OpenVINO’s ecosystem while contributing meaningfully.
Recently, I’ve been actively contributing to OpenVINO, working on issues and familiarizing myself with its internals—understanding how model conversion, inference, and optimizations are handled.
Why This Project?
One of the most exciting aspects of this project for me is the idea of bringing training capabilities to OpenVINO, which would significantly lower the entry barrier for developers who want to leverage Intel hardware for both training and inference. I see a major opportunity here in improving the transition from model training to inference, making OpenVINO a more comprehensive ecosystem for ML practitioners.
To further prepare for this project, I would appreciate your guidance: I would love to hear your insights on how I can best prepare myself to contribute effectively. If there are any resources, documentation, or existing research that you recommend, I’d be happy to dive into them. Additionally, if you have any advice on writing the application, I would greatly appreciate it.
Thank you for your time, and I look forward to your response!
Best regards,
Leonardo Heim
@adrianboguszewski @mlukasze Could you please help me connect with the mentors?