Skip to content

vk: Enable use of a passthrough DMA layer if supported #9613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jan 24, 2021

Conversation

kd-11
Copy link
Contributor

@kd-11 kd-11 commented Jan 16, 2021

This PR allows the GPU to read and write directly into host allocated memory which can give some performance improvements in some cases. It also includes misc fixes in other areas to tighten up the supporting implementation.
Summary

  • Redirects all texture upload routines to use the pre-existing DMA emulation mechanism
  • Solves a RAW hazard when extending/merging DMA sections by just discarding contents altogether
  • Allows special DMA block that is passthrough to the physical GPU
  • Fixes a synchronization leak in custom AMD GPU event object
  • Adds a hotpath for utils::memory_protect that I found when profiling. There is a decent performance uplift from just assuming the whole range is likely contiguous and falling back to chunked protection in case of failure.

TODO

  • Unbreak OpenGL, needs further testing.
  • Poor performance in some cases when using RADV
  • Investigate NVIDIA problems and exclude if no solution is found.
  • Investigate data corruption and crashing when using DMA layer (old bug)
  • GT5 texture mangling

@hvhbot
Copy link

hvhbot commented Jan 16, 2021

GT5 crashes with out of video memory error
RPCS3.log

@Yahfz
Copy link
Contributor

Yahfz commented Jan 16, 2021

GT5 crashes with out of video memory error
RPCS3.log

That's an nvidia issue and he's already aware. Thanks for reporting it though.

@kd-11
Copy link
Contributor Author

kd-11 commented Jan 16, 2021

To clarify: there is a problem with nvidia drivers that I'll fix as soon as my PC is back to working order. I'll notify here when its safe to test on nvidia again.

@kd-11 kd-11 marked this pull request as draft January 17, 2021 10:17
@kd-11 kd-11 added the In Progress This issue is actively being investigated at the moment. label Jan 18, 2021
@Jacoby1218
Copy link
Contributor

F1 2014 crashes when exiting the garage.
RPCS3 (4).zip

@MSuih
Copy link
Member

MSuih commented Jan 19, 2021

This pull request is known to have problems, wait until kd asks for more testing

@kd-11
Copy link
Contributor Author

kd-11 commented Jan 19, 2021

The PR should be safe to test again (vulkan only). OpenGL side is still under development.

@proganime1200
Copy link

how do i enable this?

@AniLeo
Copy link
Member

AniLeo commented Jan 20, 2021

It's enabled by default, there's no setting

@proganime1200
Copy link

It's enabled by default, there's no setting

Thanks ! Trying to test this ill post my experience soon

@hvhbot
Copy link

hvhbot commented Jan 20, 2021

image
Some textures are incorrect in GT5

@proganime1200
Copy link

proganime1200 commented Jan 20, 2021

On my i5 3470 gtx 750 ti it boost my fps from 30 fps on persona 5 attic room to 35 fps with windows 10 64 bit 8gb ram latest nvidia driver

@hvhbot
Copy link

hvhbot commented Jan 20, 2021

Also this is not present on master and reproducable, and 3d graphics work fine
RSX capture: BCES00569_20210120174245_capture.zip

@kd-11
Copy link
Contributor Author

kd-11 commented Jan 23, 2021

Poor performance was because of very bad settings I was using, turned out to be a false alarm. I believe that just leaves GT5 textures issue.

@kd-11
Copy link
Contributor Author

kd-11 commented Jan 23, 2021

@hvhbot I need your log file to replicate settings.

@hvhbot
Copy link

hvhbot commented Jan 23, 2021

RPCS3.log.gz
Retested, issue is still here (top left and bottom left)
image

kd-11 added 4 commits January 23, 2021 14:42
- It is not possible to emulate passthrough memory cleanly, and we don't need to
  A stupid race condition appears when trying to synchronize DMA blocks with memory inheritance.
  Since the usage pattern is to acquire a range and then load or write+flush, this new data is going to be..
  overwritten by the commandbuffer execution sequence later. Acquiring a scratch buffer to hold CPU content during the transition is not worth the effort..
  as the data will be destroyed anyway during the transfer process immediately afterwards.

  Fixes data corruption when moving data around using the emulated DMA passthrough
@kd-11 kd-11 removed the In Progress This issue is actively being investigated at the moment. label Jan 23, 2021
@kd-11
Copy link
Contributor Author

kd-11 commented Jan 23, 2021

@hvhbot Retest with new build once it compiles.
Turns out GT engine likes to constantly map and unmap pages which is causing the problem. I have made it so that memory map events also unmap data from the GPU so that we don't have the GPU reading from the wrong pages.
EDIT:
Nvm its still broken on NVIDIA. I'll add a toggle in GUI to disable it in case of corruption.

@kd-11 kd-11 changed the title [WIP] vk: Enable use of a passthrough DMA layer if supported vk: Enable use of a passthrough DMA layer if supported Jan 23, 2021
@kd-11
Copy link
Contributor Author

kd-11 commented Jan 23, 2021

Passthrough of cell memory has been disabled completely on NVIDIA and will use the slower copy mechanism. This is due to a driver quirk, nothing we can do about it. The driver seems to be caching page mapping until a block is erased, so unmapping a page and mapping it somewhere else doesn't work even if we create a new allocation. Since the first allocation still exists and cannot be removed until the draw calls are completed, it still uses the old mapping. This technically makes sense as an optimization but it really breaks games that do a lot of map/unmap/remap like GT series.

@kd-11 kd-11 marked this pull request as ready for review January 23, 2021 13:35
@Asinin3
Copy link
Contributor

Asinin3 commented Jan 23, 2021

Before:
rpcs3_h8ILhMWz1y
After:
rpcs3-dma-fast_kc35TXZC9Z

kd-11 added 9 commits January 24, 2021 12:01
Some spec violations fixes
Make the option dynamic
- Fix range chaining.
- Add validation checks that no overlaps exist.
- Just use the reversed type instead.
  The new uploader backend combines swizzle+swap so there is no need for tricks anymore
- The driver seems to cache page mapping as long as allocation has not been removed
- This is undesirable as we cannot stop the emulator to remove stale allocations every time a page is unmapped
@DefaltBR
Copy link

DefaltBR commented Feb 7, 2021

The Last of Us (BCUS98174) now has artifacts/flickering textures (including HUD), objects and even landscape after this PR... Confirmed using master 0.0.14-11605 (which is the one before this PR got merged).

Here's the log for the 11605, before this PR: RPCS3.log

And here's the log for the latest 11698 build: RPCS3.log

@dante3732
Copy link

Regression: Rainbow Flashing Textures on Killzone 3 on AMD #9791

@MSuih
Copy link
Member

MSuih commented Feb 16, 2021

Tag #9613 in the issue and note will appear in this pull request. There's one such example just above your message. It's annoying for the people who commented here to get notifications about new messages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.