
Fix warp perspective documentation #5815


Merged: 2 commits merged into NVIDIA:main on Feb 20, 2025

Conversation

5had3z
Contributor

5had3z commented Feb 9, 2025

WarpPerspective requires a list of 3x3 matrices (not a flattened 1D list); this fixes the documentation to reflect that requirement.

Category:

Bug fix (non-breaking change which fixes an issue)

Description:

Documentation doesn't reflect the assertions made in the code:

class WarpPerspective : /* ... */ {
  void RunImpl(Workspace &ws) override {
    // ...
    auto &matrix_input = ws.Input<GPUBackend>(1);
    DALI_ENFORCE(matrix_input.shape() ==
                     uniform_list_shape(matrix_input.num_samples(), TensorShape<2>(3, 3)),
                 make_string("Expected a uniform list of 3x3 matrices. "
                             "Instead got data with shape: ",
                             matrix_input.shape()));
    // ...
  }
  // ...
  ArgValue<float, 2> matrix_arg_{"matrix", spec_};
};
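For reference, the shape check above can be mirrored in NumPy to illustrate what the GPU operator accepts (a sketch with placeholder identity matrices; only the shapes matter here):

```python
import numpy as np

# A batch of 4 homographies, each a 3x3 matrix: this is the layout the
# operator enforces (a uniform list of 3x3 matrices).
batch = [np.eye(3, dtype=np.float32) for _ in range(4)]

# Mirror of the DALI_ENFORCE condition: every sample must be exactly 3x3.
assert all(m.shape == (3, 3) for m in batch)

# A flattened batch (what the old documentation suggested) fails that check:
flat = [m.reshape(-1) for m in batch]
assert all(m.shape == (9,) for m in flat)  # each sample is (9,), not (3, 3)
```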

Additional information:

I'm not familiar with how to fix the automatically generated documentation of "matrix": .addOptionalArg<float>("matrix", ..., std::vector<float>({}), ...) generates "float or list of float or TensorList of float, optional, default = []", which is also incorrect. I suspect this problem is pervasive in the codebase: operator inputs that should be 2D are registered via .add(Optional)Arg with a 1D std::vector. Maybe this is doable with ArgValue<float, 2> somehow; I'm not sure, as I didn't test implementing or building this.

An additional pain point for this operator in general is that size has to be on the CPU, but if I have some other target image I am warping to that is already on the GPU, I can't use target.shape(), because the cpu()->gpu() transfer isn't allowed without an experimental feature enabled.

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@mzient
Contributor

mzient commented Feb 10, 2025

Hello,

1. The problem is indeed pervasive, but there's no easy fix. DALI doesn't have constants with multiple dimensions, so we actually accept a flattened 1D array, similarly to cv2.warpAffine, which takes a 6-element list. Adding first-class ND-array constants is something we might consider, but it's a high-effort, relatively low-payoff feature.

Sorry, I misread this as being about WarpAffine. In the case of WarpPerspective things may be different; I'm not very familiar with the code of this operator.

2. exec_dynamic is no longer experimental.
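The warpAffine flattening convention mentioned above can be sketched in NumPy (row-major order is an assumption here; check the operator docs for the exact layout):

```python
import numpy as np

# An affine warp matrix is conceptually 2x3 (as in cv2.warpAffine)...
affine = np.array([[1.0, 0.0, 10.0],
                   [0.0, 1.0, 20.0]], dtype=np.float32)

# ...but, with no multi-dimensional constants available, it is passed as a
# flattened 6-element list in row-major order:
flat = affine.reshape(-1)
assert flat.shape == (6,)
assert list(flat) == [1.0, 0.0, 10.0, 0.0, 1.0, 20.0]
```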

@5had3z
Contributor Author

5had3z commented Feb 10, 2025

WarpPerspective, things may be different

When I initially developed for the GPU, I used numpy.flatten() on my homographies to make them 1D as per the docs, and got the error message I quoted in the original post: "Expected a uniform list of 3x3 matrices. Instead got data with shape: {9, 9, 9, 9, 9, 9}"

I've actually just tried to check whether the CPU backend has the same behaviour of asserting a 3x3 matrix, but I just got the error:
Assert on "creator_it != registry_.end()" failed: Operator "experimental__WarpPerspective" not registered for cpu.

@klecki
Contributor

klecki commented Feb 14, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [23960901]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [23960901]: BUILD FAILED

@5had3z
Contributor Author

5had3z commented Feb 18, 2025

Since there is no warp_perspective operator for CPU, only the CV-CUDA wrapper, I plan on adding one in my spare time, maybe over the weekend, just based on cv::warpPerspective. We can move this minor doc fix to that PR.

This is also the case for fn.experimental.debayer: only a CUDA op based on NPP is available, so an OpenCV wrapper is needed for a CPU implementation. I also find the [0,0], [0,1] notation a bit awkward, and it differs from NPP and OpenCV, which use an enum.

@mzient
Contributor

mzient commented Feb 19, 2025

Since there is no warp_perspective operator for CPU, only the CV-CUDA wrapper, I plan on adding one in my spare time, maybe over the weekend, just based on cv::warpPerspective. We can move this minor doc fix to that PR.

It's better to keep changes small.

This is also the case for fn.experimental.debayer: only a CUDA op based on NPP is available, so an OpenCV wrapper is needed for a CPU implementation. I also find the [0,0], [0,1] notation a bit awkward, and it differs from NPP and OpenCV, which use an enum.

That notation was a conscious decision: it makes it easier to debayer cropped images, for example, something that, with enums, requires a lot of lookups and/or if-else ladders.

@klecki
Contributor

klecki commented Feb 19, 2025

And sorry for the delay in merging this, but we have a small unrelated issue in the CI resulting in failed builds. When the build is successful, I will merge this.

@dali-automaton
Collaborator

CI MESSAGE: [24182434]: BUILD STARTED

@5had3z
Contributor Author

5had3z commented Feb 19, 2025

No worries, I'll open up a new PR for warpPerspective CPU impl later when I get the free time.

The added benefit of that notation checks out, but I can't see any reference to cropping in the operator itself; I guess there might be some internal use elsewhere. It just gets converted to an enum from what I can see. I'll look into it more deeply when doing an implementation that just wraps cv::cvtColor later. I find a balance of CPU and GPU preprocessing is critical for peak training throughput in my case.

@dali-automaton
Collaborator

CI MESSAGE: [24182434]: BUILD PASSED

@klecki klecki merged commit 29cf6c9 into NVIDIA:main Feb 20, 2025
7 checks passed