Reaching out to mentors for project 5. "OpenVINO AI PC Model Training Kit" GSoC 2025 #29389
-
Hello @sbasia, Thank you for the detailed response! Based on what you've shared, I’ve been thinking about potential directions for the training wrapper and have a few questions and ideas that might help refine the project.

Given that OpenVINO Training Extensions already provide a structured way to fine-tune models, would it make sense to integrate parts of their auto-configuration capabilities into this project? This could simplify model selection and hyperparameter tuning for users working with AI PCs. Alternatively, should we aim for a more flexible approach where users define their own configurations from scratch?

For quantization, since NNCF already supports Quantization-Aware Training (QAT), one possible approach could be enabling quantization during training rather than post-training, reducing the number of retraining steps needed. Additionally, since OpenVINO focuses heavily on inference optimization, have there been any discussions about implementing mixed-precision training to improve efficiency on AI PCs?

Regarding integration with PyTorch/TensorFlow, would using torch.compile as a backend for training be viable, similar to how OpenVINO optimizes inference? If so, how do you see this affecting training loops and backpropagation efficiency? If torch.compile is not the ideal path, do you think adapting lower-level OpenVINO operations for training is the better approach?

If we assume that OpenVINO will serve as a seamless backend for training, this presents exciting opportunities but also some challenges that need to be addressed. Here are some ideas that could help optimize this integration:

- Extending torch.compile for training: Currently, torch.compile allows for inference optimization using OpenVINO as a backend. Could this be extended to cover backpropagation and weight updates as well? This might require adapting gradient operations and potentially modifying TorchInductor to support Intel XPU-optimized kernels.
- Multi-device training (heterogeneous execution): If OpenVINO will be a native backend for training, it would be interesting to explore distributed execution across CPUs, iGPUs, and dGPUs, depending on the AI PC’s hardware configuration. Is this approach being considered, or is the initial focus primarily on CPU-based training?
- Reducing overhead in framework conversion: One major challenge could be ensuring full compatibility between PyTorch/TensorFlow and OpenVINO during training. If OpenVINO manages both forward and backward computations, how will it handle custom activation functions, specialized layers, or unsupported operators? Will there be a fallback mechanism for hybrid execution with PyTorch/TensorFlow?

If the goal is to provide a seamless backend for training, the architecture will need to handle quantization, multi-device support, optimized backpropagation, and full integration with PyTorch/TensorFlow. Among these, which areas do you see as the highest priority for the initial design?

And lastly, I apologize if these are too many questions; I just want to make sure I fully understand the project’s direction. Thanks again for your time and insights! Looking forward to your thoughts.

Best regards,
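P.S. To make the torch.compile question more concrete, here is a minimal sketch of how the inference-only path looks today (a sketch assuming a recent OpenVINO release, where importing openvino.torch registers the backend; the toy model and shapes are just placeholders):

import torch
import torch.nn as nn
import openvino.torch  # importing this registers the "openvino" torch.compile backend

# A toy model standing in for whatever the wrapper would train
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10)).eval()

# Today this only accelerates the forward pass; the question above is whether the
# same mechanism could capture backward/weight-update graphs as well
compiled = torch.compile(model, backend="openvino")

with torch.no_grad():
    print(compiled(torch.randn(8, 64)).shape)  # torch.Size([8, 10])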
-
Hi @leoheim, Please find my responses to your questions:

(ii) Quantization: You are right about the approach of quantizing during training, which I explained in my earlier response. We definitely want to enable that feature in the wrapper. Mixed-precision quantization support is also already enabled for PyTorch through OpenVINO (https://github.com/openvinotoolkit/nncf/blob/develop/docs/usage/training_time_compression/other_algorithms/LegacyQuantization.md#mixed-precision-quantization).

(iii) torch.compile: There will surely be impacts from integrating the torch.compile functionality for the seamless backend, but we want to start with that approach and try modifying TorchInductor to support the most common gradient operations. We can definitely increase the support once we have a working design for one or two models.

(iv) Multi-device training: Distributed training across different XPUs would be a plus, but it is optional for this project.

(v) Reducing overhead in framework conversion: Full compatibility between PyTorch/TensorFlow and OpenVINO may not be possible within the current project timeline, so the idea is to first enable full compatibility for the layers, activation functions, and operators used in one or two popular models, and then gradually extend the support to other activation functions and operators. There can be a fallback mechanism for unsupported activation functions, layers, and operators that directly utilizes the PyTorch/TensorFlow framework.

Lastly, regarding your request to review the application before submission: I need to check with my program coordinators whether I am authorized to do that. I will check with them and get back to you. In the meantime, please continue to add points to and structure your application, and ensure you have completed the prerequisite as well.
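For reference, a minimal sketch of the legacy NNCF QAT flow that the linked document builds on (a sketch only, not project code: the toy model and input shape are placeholders, and quantizer range initialization via register_default_init_args is omitted for brevity):

import torch.nn as nn
from nncf import NNCFConfig
from nncf.torch import create_compressed_model

# A toy model standing in for the user's network
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

# Legacy NNCF config: inserts fake-quantization ops so training becomes quantization-aware
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 64]},
    "compression": {"algorithm": "quantization"},
})

# Wraps the model with quantizers; compression_ctrl drives the algorithm during training
compression_ctrl, qat_model = create_compressed_model(model, nncf_config)

# ...train qat_model with a normal PyTorch loop, then export for OpenVINO inference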
-
Hello @SHIVAMBASIA, Thank you for your detailed responses; they've greatly clarified several points for me. I'm currently working on my GSoC application. I'm excited to further explore the concepts and techniques related to this project, and I'm actively deepening my understanding of how to effectively use PyTorch, scikit-learn, and TensorFlow for it. I eagerly await your update regarding the application review. Thanks again for your valuable guidance! Best regards,
-
Dear Mentors, I hope you are both doing well. Before diving in further, I’d like to thank you all (@leoheim, @SHIVAMBASIA, @adrianboguszewski); the previous conversations have really helped clarify some of the more vague aspects of the project.

I had a few technical questions around framework compatibility. Specifically, will we be implementing a custom backend or extensions for PyTorch and TensorFlow? For instance, I came across this PyTorch C++ extension tutorial, which might be relevant to how we structure the integration. From previous discussions, I understand that the focus will be on optimizing operations commonly used in widely adopted models. I’d love to understand which ops we plan to start with, and how deep the integration with each framework (especially PyTorch and TensorFlow) is expected to be.

To share a concrete example and check that my understanding aligns with the project’s direction, here’s a basic C++ prototype of a Torch extension that performs a MatMul (with an optional bias):

#include <torch/extension.h>
#include <openvino/openvino.hpp>
#include <cstring>  // std::memcpy
#include <memory>
// Global compiled model, built lazily on first call.
// NOTE: this simple cache ignores later changes to dtype, shapes, or bias usage;
// a real implementation would key the cache on that configuration.
std::shared_ptr<ov::CompiledModel> compiled_model;
ov::Core core;
// Map the Torch scalar type to the corresponding OpenVINO element type
template <typename T>
ov::element::Type infer_ov_type();
template <>
ov::element::Type infer_ov_type<float>() { return ov::element::f32; }
template <>
ov::element::Type infer_ov_type<double>() { return ov::element::f64; }
template <>
ov::element::Type infer_ov_type<at::BFloat16>() { return ov::element::bf16; }
template <typename T>
void initialize_openvino_model(int64_t input_dim, int64_t output_dim, bool use_bias) {
    ov::element::Type ov_dtype = infer_ov_type<T>();
    // Dynamic batch size (first dimension is dynamic)
    ov::PartialShape shape_A{ov::Dimension::dynamic(), input_dim};
    ov::PartialShape shape_B{input_dim, output_dim};
    ov::PartialShape shape_bias{1, output_dim};
    auto input_A = std::make_shared<ov::op::v0::Parameter>(ov_dtype, shape_A);
    auto input_B = std::make_shared<ov::op::v0::Parameter>(ov_dtype, shape_B);
    std::shared_ptr<ov::Node> output_node;
    std::shared_ptr<ov::op::v0::Parameter> bias;
    if (use_bias) {
        bias = std::make_shared<ov::op::v0::Parameter>(ov_dtype, shape_bias);
        auto matmul = std::make_shared<ov::op::v0::MatMul>(input_A, input_B);
        output_node = std::make_shared<ov::op::v1::Add>(matmul, bias);
    } else {
        output_node = std::make_shared<ov::op::v0::MatMul>(input_A, input_B);
    }
    auto model = std::make_shared<ov::Model>(ov::OutputVector{output_node}, ov::ParameterVector{input_A, input_B});
    if (use_bias) {
        model->add_parameters({bias});
    }
    // Compile once for CPU; other devices ("GPU", "NPU") could be selected here
    compiled_model = std::make_shared<ov::CompiledModel>(core.compile_model(model, "CPU"));
}
template <typename T>
torch::Tensor openvino_matmul(torch::Tensor a, torch::Tensor b, c10::optional<torch::Tensor> bias_opt) {
    using DataType = T;
    TORCH_CHECK(a.dim() == 2 && b.dim() == 2, "expected 2-D input tensors");
    TORCH_CHECK(a.size(1) == b.size(0), "inner dimensions of a and b must match");
    // OpenVINO reads raw host pointers, so the tensors must be contiguous
    a = a.contiguous();
    b = b.contiguous();
    const int64_t batch_size = a.size(0);
    const int64_t input_dim = a.size(1);
    const int64_t output_dim = b.size(1);
    const bool use_bias = bias_opt.has_value();
    // Lazily compile the model on first use (see the caching caveat above)
    if (!compiled_model) {
        initialize_openvino_model<DataType>(input_dim, output_dim, use_bias);
    }
    ov::element::Type ov_dtype = infer_ov_type<DataType>();
    // Wrap the Torch buffers as OpenVINO tensors without copying
    ov::Tensor tensor_A(ov_dtype, {static_cast<size_t>(batch_size), static_cast<size_t>(input_dim)}, a.data_ptr<DataType>());
    ov::Tensor tensor_B(ov_dtype, {static_cast<size_t>(input_dim), static_cast<size_t>(output_dim)}, b.data_ptr<DataType>());
    auto infer_request = compiled_model->create_infer_request();
    infer_request.set_input_tensor(0, tensor_A);
    infer_request.set_input_tensor(1, tensor_B);
    torch::Tensor bias;  // declared here so the buffer stays alive until inference completes
    if (use_bias) {
        bias = bias_opt.value().contiguous();
        ov::Tensor tensor_bias(ov_dtype, {1, static_cast<size_t>(output_dim)}, bias.data_ptr<DataType>());
        infer_request.set_input_tensor(2, tensor_bias);
    }
    infer_request.infer();
    // Copy the result back into a fresh Torch tensor
    ov::Tensor output_tensor = infer_request.get_output_tensor(0);
    torch::Tensor output = torch::empty({batch_size, output_dim}, a.options());
    std::memcpy(output.data_ptr<DataType>(), output_tensor.data<DataType>(), output_tensor.get_byte_size());
    return output;
}
// Python bindings
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
    // Only the float32 instantiation is exposed here; f64/bf16 could be added similarly
    m.def("openvino_matmul", &openvino_matmul<float>, "Optimized MatMul with optional bias using OpenVINO (float)");
}

With this setup, we could expose the OpenVINO-optimized operation via a Python binding and use it directly within any nn.Module in PyTorch. This approach seems promising for prototyping operator-level acceleration. Please let me know your thoughts, particularly on whether this direction aligns with your vision for the wrapper. I’m eager to learn more and contribute meaningfully.
-
Hello Shivam and Aishwarye,
I hope you’re both doing well! My name is Leonardo, and I’m very interested in contributing to the OpenVINO AI PC Model Training Kit project as part of GSoC. I’m currently in my second year of Computer Science, and I’m eager to learn and grow in the field of Deep Learning and model optimization. I’m particularly excited about this project because it aligns perfectly with my interest in AI model training and inference optimization. I see this as a great opportunity to learn from both of you and gain a deeper understanding of OpenVINO’s ecosystem while contributing meaningfully.
Recently, I’ve been actively contributing to OpenVINO, working on issues and familiarizing myself with its internals—understanding how model conversion, inference, and optimizations are handled.
Why This Project?
One of the most exciting aspects of this project for me is the idea of bringing training capabilities to OpenVINO, which would significantly lower the entry barrier for developers who want to leverage Intel hardware for both training and inference. I see a major opportunity here in improving the transition from model training to inference, making OpenVINO a more comprehensive ecosystem for ML practitioners.
To further prepare for this project, I would appreciate your guidance: I would love to hear your insights on how I can best prepare myself to contribute effectively. If there are any resources, documentation, or existing research that you recommend, I’d be happy to dive into them. Additionally, if you have any advice on writing the application, I would greatly appreciate it.
Thank you for your time, and I look forward to your response!
Best regards,
Leonardo Heim
@adrianboguszewski @mlukasze Could you please help me connect with the mentors?