Skip to content

frames decoder fixes: avoid overflow, handle multiple frames per packet #5911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 9, 2025

Conversation

jantonguirao
Copy link
Contributor

Category:

Bug fix (non-breaking change which fixes an issue)

Description:

This PR addresses two critical issues in the frames decoder implementation:

  1. Fixes a potential overflow issue in frame pointer arithmetic by using ptrdiff_t for pointer calculations and adding bounds checking
  2. Improves handling of multiple frames per packet in the GPU decoder by properly managing the PTS (Presentation Time Stamp) queue

The changes ensure more robust frame decoding by preventing memory access issues and correctly handling frame ordering when multiple frames are present in a single packet.

Additional information:

Affected modules and functionalities:

  • Modified dali/operators/video/frames_decoder_base.cc:
    • Improved frame pointer arithmetic safety
    • Enhanced frame copying logic with better pointer management
  • Modified dali/operators/video/frames_decoder_gpu.cc:
    • Fixed PTS queue management
    • Added more detailed logging for debugging
    • Improved frame buffer handling

Key points relevant for the review:

  • The pointer arithmetic changes in frames_decoder_base.cc are critical for preventing potential buffer overflows
  • The PTS queue management changes in frames_decoder_gpu.cc ensure correct frame ordering
  • Added assertions and improved logging for better debugging capabilities

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28009735]: BUILD STARTED

@jantonguirao jantonguirao force-pushed the video_decoder_crash_fix branch from 6721d99 to 5de3818 Compare May 6, 2025 12:24
@JanuszL JanuszL self-assigned this May 6, 2025
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28009735]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28017933]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28017933]: BUILD FAILED

for (auto &[frame_id, i] : frame_ids) {
auto out_frame_start = data + ptrdiff_t(i) * FrameSize();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto out_frame_start = data + ptrdiff_t(i) * FrameSize();
uint8_t *out_frame_start = data + ptrdiff_t(i) * FrameSize();

...since we assign it to explicitly typed last_out_frame_start...

@@ -603,20 +603,21 @@ void FramesDecoderBase::DecodeFramesImpl(uint8_t *data,
DALI_ENFORCE(constant_frame != nullptr || boundary_type != boundary::BoundaryType::CONSTANT,
make_string("Constant frame must be provided if boundary type is CONSTANT"));

int last_out_frame_idx = -1;
uint8_t* last_out_frame_start = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
uint8_t* last_out_frame_start = nullptr;
uint8_t *last_out_frame_start = nullptr;

// Take pts of the currently decoded frame
int current_pts = piped_pts_.front();
piped_pts_.pop();
if (current_pts_ == -1) { // if != -1, it's not the first frame from the packet
Copy link
Contributor

@mzient mzient May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fun fact: as far as I know, presentation timestamp (PTS) is quite arbitrary and can be negative, so it'd be better to use a more obviously incorrect value (AV_NOPTS_VALUE) or alternatively use optional<int64_t>

@@ -198,6 +198,7 @@ class DLL_PUBLIC FramesDecoderGpu : public FramesDecoderBase {
std::vector<BufferedFrame> frame_buffer_;

std::queue<int> piped_pts_;
int current_pts_ = -1;
Copy link
Contributor

@mzient mzient May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int current_pts_ = -1;
int64_t current_pts_ = AV_NOPTS_VALUE;

This is - and always was - too small. Use int64_t.

@@ -198,6 +198,7 @@ class DLL_PUBLIC FramesDecoderGpu : public FramesDecoderBase {
std::vector<BufferedFrame> frame_buffer_;

std::queue<int> piped_pts_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::queue<int> piped_pts_;
std::queue<int64_t> piped_pts_;

Signed-off-by: Joaquin Anton Guirao <[email protected]>
Signed-off-by: Joaquin Anton Guirao <[email protected]>
@jantonguirao jantonguirao force-pushed the video_decoder_crash_fix branch from 7462101 to 0437dc1 Compare May 8, 2025 12:10
Signed-off-by: Joaquin Anton Guirao <[email protected]>
@jantonguirao
Copy link
Contributor Author

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28145200]: BUILD STARTED

// if != AV_NOPTS_VALUE, it's not the first frame from the packet
// first frame from this packet, pop it from the queue and remember it.
// If there are no piped pts, use the last pts available (unexpected extra frame)
if (!piped_pts_.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what happens if we have 2 packets in the processing, and the first will produce two frames and the last only one. We will affiliate the first frame with first packet, and the second and third with the second packet.

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28145200]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [28145200]: BUILD PASSED

@jantonguirao jantonguirao merged commit 5e46b46 into NVIDIA:main May 9, 2025
7 checks passed
mdabek-nvidia pushed a commit to mdabek-nvidia/DALI that referenced this pull request Jun 7, 2025
…et (NVIDIA#5911)

Fixes a potential overflow issue in frame pointer arithmetic by using ptrdiff_t for pointer calculations and adding bounds checking
Improves handling of multiple frames per packet in the GPU decoder by properly managing the PTS (Presentation Time Stamp) queue

Signed-off-by: Joaquin Anton Guirao <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants