Skip to content

Adding rocSOLVER support for LAPACK domain with hip backend #208

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Nov 30, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#===============================================================================
# Copyright 2020-2021 Intel Corporation
# Copyright 2020-2022 Intel Corporation
# Copyright (C) 2022 Heidelberg University, Engineering Mathematics and Computing Lab (EMCL) and Computing Centre (URZ)
#
# Licensed under the Apache License, Version 2.0 (the "License");
Expand Down Expand Up @@ -53,6 +53,7 @@ option(ENABLE_CUSOLVER_BACKEND "" OFF)
option(ENABLE_ROCBLAS_BACKEND "" OFF)
option(ENABLE_CURAND_BACKEND "" OFF)
option(ENABLE_ROCRAND_BACKEND "" OFF)
option(ENABLE_ROCSOLVER_BACKEND "" OFF)
option(ENABLE_NETLIB_BACKEND "" OFF)
set(ONEMKL_SYCL_IMPLEMENTATION "dpc++" CACHE STRING "Name of the SYCL compiler")
set(HIP_TARGETS "" CACHE STRING "Target HIP architectures")
Expand All @@ -77,7 +78,8 @@ if(ENABLE_MKLCPU_BACKEND
endif()
if(ENABLE_MKLCPU_BACKEND
OR ENABLE_MKLGPU_BACKEND
OR ENABLE_CUSOLVER_BACKEND)
OR ENABLE_CUSOLVER_BACKEND
OR ENABLE_ROCSOLVER_BACKEND)
list(APPEND DOMAINS_LIST "lapack")
endif()
if(ENABLE_MKLCPU_BACKEND
Expand All @@ -94,7 +96,7 @@ if(CMAKE_CXX_COMPILER OR NOT ONEMKL_SYCL_IMPLEMENTATION STREQUAL "dpc++")
endif()
else()
if(ENABLE_CUBLAS_BACKEND OR ENABLE_CURAND_BACKEND OR ENABLE_ROCBLAS_BACKEND
OR ENABLE_ROCRAND_BACKEND)
OR ENABLE_ROCRAND_BACKEND OR ENABLE_ROCSOLVER_BACKEND)
set(CMAKE_CXX_COMPILER "clang++")
elseif(ENABLE_MKLGPU_BACKEND)
if(UNIX)
Expand Down Expand Up @@ -169,7 +171,7 @@ if(WIN32 AND ONEMKL_SYCL_IMPLEMENTATION STREQUAL "dpc++")
endif()

# Temporary disable sycl 2020 deprecations warnings for cuBLAS and cuSOLVER
if(ONEMKL_SYCL_IMPLEMENTATION STREQUAL "dpc++" AND (ENABLE_CUBLAS_BACKEND OR ENABLE_CUSOLVER_BACKEND))
if(ONEMKL_SYCL_IMPLEMENTATION STREQUAL "dpc++" AND (ENABLE_CUBLAS_BACKEND OR ENABLE_CUSOLVER_BACKEND OR ENABLE_ROCSOLVER_BACKEND))
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DSYCL2020_DISABLE_DEPRECATION_WARNINGS")
endif()

Expand Down
15 changes: 13 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,10 @@ oneMKL is part of [oneAPI](https://oneapi.io).
<td align="center"><a href="https://rocblas.readthedocs.io/en/rocm-4.5.2/"> AMD rocBLAS</a> for AMD GPU </td>
<td align="center">AMD GPU</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/ROCmSoftwarePlatform/rocSOLVER"> AMD rocSOLVER</a> for AMD GPU </td>
<td align="center">AMD GPU</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/ROCmSoftwarePlatform/rocRAND"> AMD rocRAND</a> for AMD GPU </td>
<td align="center">AMD GPU</td>
Expand Down Expand Up @@ -167,12 +171,12 @@ Supported domains: BLAS, LAPACK, RNG
</tr>
<tr >
<td align="center">AMD GPU</td>
<td align="center">AMD rocBLAS </td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table formatting looks off when viewing the README in your branch: https://github.com/srilekhainkulu99/oneMKL/tree/rocsolver_hip_support

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I'm looking forward to having rocSOLVER support added for LAPACK. =)

I have not tried running your changes, just did a visual code review with multiple comments throughout. Thanks!

Thank you for such a detailed feedback. I've addressed majority of the comments.
Do let us know if you have any further suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table formatting still appears off, please address this. https://github.com/srilekhainkulu99/oneMKL/tree/rocsolver_hip_support

<td align="center">AMD rocBLAS</td>
<td align="center">Dynamic, Static</td>
<td align="center">LLVM*, hipSYCL</td>
</tr>
<tr>
<td rowspan=3 align="center">LAPACK</td>
<td rowspan=4 align="center">LAPACK</td>
<td align="center">x86 CPU</td>
<td rowspan=2 align="center">Intel(R) oneAPI Math Kernel Library</td>
<td align="center">Dynamic, Static</td>
Expand All @@ -189,6 +193,12 @@ Supported domains: BLAS, LAPACK, RNG
<td align="center">Dynamic, Static</td>
<td align="center">LLVM*</td>
</tr>
<tr>
<td align="center">AMD GPU</td>
<td align="center">AMD rocSOLVER</td>
<td align="center">Dynamic, Static</td>
<td align="center">LLVM*</td>
</tr>
<tr>
<td rowspan=4 align="center">RNG</td>
<td align="center">x86 CPU</td>
Expand Down Expand Up @@ -421,6 +431,7 @@ Python | 3.6 or higher | No | *N/A* | *Pre-installed or Installed by user* | [PS
[NVIDIA CUDA SDK](https://developer.nvidia.com/cublas) | 10.2 | No | *N/A* | *Installed by user* |[End User License Agreement](https://docs.nvidia.com/cuda/eula/index.html)
[AMD rocBLAS](https://rocblas.readthedocs.io/en/rocm-4.5.2/) | 4.5 | No | *N/A* | *Installed by user* |[AMD License](https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/LICENSE.md)
[AMD rocRAND](https://github.com/ROCmSoftwarePlatform/rocRAND) | 5.1.0 | No | *N/A* | *Installed by user* |[AMD License](https://github.com/ROCmSoftwarePlatform/rocRAND/blob/develop/LICENSE.txt)
[AMD rocSOLVER](https://github.com/ROCmSoftwarePlatform/rocSOLVER) | 5.0.0 | No | *N/A* | *Installed by user* |[AMD License](https://github.com/ROCmSoftwarePlatform/rocRAND/blob/develop/LICENSE.txt)
[NETLIB LAPACK](https://www.netlib.org/) | 3.7.1 | Yes | conan-community | ~/.conan/data or $CONAN_USER_HOME/.conan/data | [BSD like license](http://www.netlib.org/lapack/LICENSE.txt)
[Sphinx](https://www.sphinx-doc.org/en/master/) | 2.4.4 | Yes | pip | ~/.local/bin (or similar user local directory) | [BSD License](https://github.com/sphinx-doc/sphinx/blob/3.x/LICENSE)

Expand Down
5 changes: 3 additions & 2 deletions cmake/FindCompiler.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,8 @@ if(is_dpcpp)
-fsycl-targets=nvptx64-nvidia-cuda -fsycl-unnamed-lambda)
list(APPEND UNIX_INTERFACE_LINK_OPTIONS
-fsycl-targets=nvptx64-nvidia-cuda)
elseif(ENABLE_ROCBLAS_BACKEND OR ENABLE_ROCRAND_BACKEND)
elseif(ENABLE_ROCBLAS_BACKEND OR ENABLE_ROCRAND_BACKEND
OR ENABLE_ROCSOLVER_BACKEND)
list(APPEND UNIX_INTERFACE_COMPILE_OPTIONS
-fsycl-targets=amdgcn-amd-amdhsa -fsycl-unnamed-lambda
-Xsycl-target-backend --offload-arch=${HIP_TARGETS})
Expand All @@ -47,7 +48,7 @@ if(is_dpcpp)
--offload-arch=${HIP_TARGETS})
endif()
if(ENABLE_CURAND_BACKEND OR ENABLE_CUSOLVER_BACKEND OR ENABLE_ROCBLAS_BACKEND
OR ENABLE_ROCRAND_BACKEND)
OR ENABLE_ROCRAND_BACKEND OR ENABLE_ROCSOLVER_BACKEND)
set_target_properties(ONEMKL::SYCL::SYCL PROPERTIES
INTERFACE_COMPILE_OPTIONS "${UNIX_INTERFACE_COMPILE_OPTIONS}"
INTERFACE_LINK_OPTIONS "${UNIX_INTERFACE_LINK_OPTIONS}"
Expand Down
41 changes: 41 additions & 0 deletions cmake/FindrocSOLVER.cmake
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
#==========================================================================
# Copyright 2022 Intel Corporation
#=========================================================================

if(NOT DEFINED HIP_PATH)
if(NOT DEFINED ENV{HIP_PATH})
set(HIP_PATH "/opt/rocm/hip" CACHE PATH "Path to which HIP has been installed")
else()
set(HIP_PATH $ENV{HIP_PATH} CACHE PATH "Path to which HIP has been installed")
endif()
endif()

set(CMAKE_MODULE_PATH "${HIP_PATH}/cmake" ${CMAKE_MODULE_PATH})
list(APPEND CMAKE_PREFIX_PATH
"${HIP_PATH}/lib/cmake"
"${HIP_PATH}/../lib/cmake"
"${HIP_PATH}/../lib/cmake/rocsolver")

find_package(HIP QUIET)
find_package(rocsolver REQUIRED)

# this is work around to avoid duplication half creation in both hip and SYCL
add_compile_definitions(HIP_NO_HALF)

find_package(Threads REQUIRED)

include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(rocSOLVER
REQUIRED_VARS
HIP_INCLUDE_DIRS
rocsolver_INCLUDE_DIR
rocsolver_LIBRARIES)

if(NOT TARGET ONEMKL::rocSOLVER::rocSOLVER)
add_library(ONEMKL::rocSOLVER::rocSOLVER SHARED IMPORTED)
set_target_properties(ONEMKL::rocSOLVER::rocSOLVER PROPERTIES
IMPORTED_LOCATION "${HIP_PATH}/../rocsolver/lib/librocsolver.so"
INTERFACE_INCLUDE_DIRECTORIES "${rocsolver_INCLUDE_DIR};${HIP_INCLUDE_DIRS};"
INTERFACE_LINK_LIBRARIES "Threads::Threads;hip::host;${rocsolver_LIBRARIES};")
endif()

13 changes: 13 additions & 0 deletions docs/building_the_project.rst
Original file line number Diff line number Diff line change
Expand Up @@ -457,6 +457,19 @@ With:
-DENABLE_ROCRAND_BACKEND=True \
-DTARGET_DOMAINS=rng

To build with the rocSOLVER backend instead simply replace:

.. code-block:: bash\

-DENABLE_ROCBLAS_BACKEND=True \
-DTARGET_DOMAINS=blas
With:

.. code-block:: bash

-DENABLE_ROCSOLVER_BACKEND=True \
-DTARGET_DOMAINS=lapack

**AMD GPU device architectures**

The device architecture can be retrieved via the ``rocminfo`` tool. The architecture will be displayed in the ``Name:`` row.
Expand Down
3 changes: 3 additions & 0 deletions examples/lapack/run_time_dispatching/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,9 @@ endif()
if(ENABLE_CUSOLVER_BACKEND)
list(APPEND DEVICE_FILTERS "cuda:gpu")
endif()
if(ENABLE_ROCSOLVER_BACKEND)
list(APPEND DEVICE_FILTERS "hip:gpu")
endif()

message(STATUS "SYCL_DEVICE_FILTER will be set to the following value(s): [${DEVICE_FILTERS}] for run-time dispatching examples")

Expand Down
15 changes: 12 additions & 3 deletions examples/lapack/run_time_dispatching/getrs_usm.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -121,11 +121,20 @@ void run_getrs_example(const sycl::device& device) {
sycl::malloc_shared<float>(getrf_scratchpad_size * sizeof(float), device, context);
float* getrs_scratchpad =
sycl::malloc_shared<float>(getrs_scratchpad_size * sizeof(float), device, context);
if (!dev_A || !dev_B || !dev_ipiv || !getrf_scratchpad) {
if (!dev_A || !dev_B || !dev_ipiv) {
throw std::runtime_error("Failed to allocate USM memory.");
}
// Skip checking getrs scratchpad memory allocation on cusolver because with cusolver backend getrs does not use scrachpad memory
if (device.is_cpu() || device.get_info<sycl::info::device::vendor_id>() != NVIDIA_ID) {
// Skip checking getrf scratchpad memory allocation on rocsolver because with rocsolver
// backend getrf does not use scrachpad memory
if (device.is_cpu() || device.get_info<sycl::info::device::vendor_id>() != AMD_ID) {
if (!getrf_scratchpad) {
throw std::runtime_error("Failed to allocate USM memory.");
}
}
// Skip checking getrs scratchpad memory allocation on cusolver/rocsolver because with
// cusolver/rocsolver backend getrs does not use scrachpad memory
if (device.is_cpu() || (device.get_info<sycl::info::device::vendor_id>() != NVIDIA_ID &&
device.get_info<sycl::info::device::vendor_id>() != AMD_ID)) {
if (!getrs_scratchpad) {
throw std::runtime_error("Failed to allocate USM memory.");
}
Expand Down
14 changes: 13 additions & 1 deletion include/oneapi/mkl/detail/backend_selector_predicates.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright 2020-2021 Intel Corporation
* Copyright 2020-2022 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -124,6 +124,18 @@ inline void backend_selector_precondition<backend::rocrand>(sycl::queue& queue)
#endif
}

template <>
inline void backend_selector_precondition<backend::rocsolver>(sycl::queue& queue) {
#ifndef ONEMKL_DISABLE_PREDICATES
unsigned int vendor_id =
static_cast<unsigned int>(queue.get_device().get_info<sycl::info::device::vendor_id>());
if (!(queue.get_device().is_gpu() && vendor_id == AMD_ID)) {
throw unsupported_device(
"", "backend_selector<backend::" + backend_map[backend::rocsolver] + ">",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but the spacing doesn't seem to match the spacing above it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like line 135 with the queue get device is still misaligned compared to line 122 above.

Copy link
Contributor

@TejaX-Alaghari TejaX-Alaghari Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran clang-format to fix the indentation of this section which differs from the format of the code with line #122

queue.get_device());
}
#endif
}
} // namespace mkl
} // namespace oneapi

Expand Down
5 changes: 3 additions & 2 deletions include/oneapi/mkl/detail/backends.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright 2020-2021 Intel Corporation
* Copyright 2020-2022 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -30,6 +30,7 @@ enum class backend {
mklcpu,
mklgpu,
cublas,
rocsolver,
cusolver,
curand,
netlib,
Expand All @@ -45,7 +46,7 @@ static backendmap backend_map = {
{ backend::cublas, "cublas" }, { backend::cusolver, "cusolver" },
{ backend::curand, "curand" }, { backend::netlib, "netlib" },
{ backend::rocblas, "rocblas" }, { backend::rocrand, "rocrand" },
{ backend::unsupported, "unsupported" }
{ backend::rocsolver, "rocsolver" }, { backend::unsupported, "unsupported" }
};

} //namespace mkl
Expand Down
8 changes: 7 additions & 1 deletion include/oneapi/mkl/detail/backends_table.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright 2020-2021 Intel Corporation
* Copyright 2020-2022 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -84,6 +84,12 @@ static std::map<domain, std::map<device, std::vector<const char*>>> libraries =
{
#ifdef ENABLE_MKLGPU_BACKEND
LIB_NAME("lapack_mklgpu")
#endif
} },
{ device::amdgpu,
{
#ifdef ENABLE_ROCSOLVER_BACKEND
LIB_NAME("lapack_rocsolver")
#endif
} },
{ device::nvidiagpu,
Expand Down
5 changes: 4 additions & 1 deletion include/oneapi/mkl/lapack.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright 2021 Intel Corporation
* Copyright 2021-2022 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -30,5 +30,8 @@
#ifdef ENABLE_CUSOLVER_BACKEND
#include "oneapi/mkl/lapack/detail/cusolver/lapack_ct.hpp"
#endif
#ifdef ENABLE_ROCSOLVER_BACKEND
#include "oneapi/mkl/lapack/detail/rocsolver/lapack_ct.hpp"
#endif

#include "oneapi/mkl/lapack/detail/lapack_rt.hpp"
50 changes: 50 additions & 0 deletions include/oneapi/mkl/lapack/detail/rocsolver/lapack_ct.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
/***************************************************************************
* Copyright (C) Codeplay Software Limited
* Copyright 2022 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* For your convenience, a copy of the License has been included in this
* repository.
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
**************************************************************************/

#ifndef _DETAIL_ROCSOLVER_LAPACK_CT_HPP_
#define _DETAIL_ROCSOLVER_LAPACK_CT_HPP_

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#else
#include <CL/sycl.hpp>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be guarded similarly as in include/oneapi/mkl/blas/detail/rocblas/blas_ct.hpp ?

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#else
#include <CL/sycl.hpp>
#endif

#endif
#include <complex>
#include <cstdint>

#include "oneapi/mkl/types.hpp"
#include "oneapi/mkl/lapack/types.hpp"
#include "oneapi/mkl/detail/backend_selector.hpp"
#include "oneapi/mkl/lapack/detail/rocsolver/onemkl_lapack_rocsolver.hpp"

namespace oneapi {
namespace mkl {
namespace lapack {

#define LAPACK_BACKEND rocsolver
#include "lapack_ct.hxx"
#undef LAPACK_BACKEND

} // namespace lapack
} // namespace mkl
} // namespace oneapi

#endif //_DETAIL_ROCSOLVER_LAPACK_CT_HPP_
Loading