Update to Linux 6.8 drivers #344

Open: @dumbbell wants to merge 2,165 commits into base: master

dumbbell
Member

@dumbbell dumbbell commented Mar 3, 2025

This is the backport of the DRM drivers from Linux 6.8.

Progress:

Changes in Linux 6.8

You can read this Phoronix article to learn about the changes in the DRM drivers in Linux 6.8:
https://www.phoronix.com/news/Linux-6.8-DRM

Patches to linuxkpi

This update depends on the following patches to linuxkpi in FreeBSD.

These patches are maintained in the following repository and branch:
https://github.com/dumbbell/freebsd-src/tree/drm-related-linuxkpi-changes

Patches were submitted for review:

Firmware updates

There is an associated firmware update:

How to test

You need to run a recent FreeBSD 15-CURRENT to test it.

Here are some instructions:

  1. You need to check out the FreeBSD src branch I mentioned, drm-related-linuxkpi-changes, and compile a kernel from that branch:

    git clone -b drm-related-linuxkpi-changes https://github.com/dumbbell/freebsd-src.git
    cd freebsd-src
    make -j8 buildkernel DEBUG_FLAGS=-g
    
    # This installs the kernel under another name, `kernel.drm`. Thus, you keep the default kernel
    # in case of trouble.
    sudo make installkernel DEBUG_FLAGS=-g INSTKERNNAME=kernel.drm
  2. You need to check out the branch referenced in this pull request and compile it:

    git clone -b update-to-linux-6.8 https://github.com/dumbbell/drm-kmod.git
    cd drm-kmod
    make -j8 DEBUG_FLAGS=-g SYSDIR=/path/to/freebsd-src-from-step1/sys
    sudo make install DEBUG_FLAGS=-g SYSDIR=/path/to/freebsd-src-from-step1/sys KMODDIR=/boot/kernel.drm
    
  3. You need to check out the associated drm-kmod-firmware update and compile the firmware (yes, this is the same firmware update as for the Linux 6.7 update):

    git clone -b drm-6.7 https://github.com/dumbbell/drm-kmod-firmware.git
    cd drm-kmod-firmware
    make -j8 DEBUG_FLAGS=-g SYSDIR=/path/to/freebsd-src-from-step1/sys
    sudo make install DEBUG_FLAGS=-g SYSDIR=/path/to/freebsd-src-from-step1/sys KMODDIR=/boot/kernel.drm
    
  4. Load the relevant driver(s) as you usually do.

@dumbbell
Member Author

dumbbell commented Mar 4, 2025

As of this writing, the amdgpu looks fine, but there is a regression with the i915 drivers (the screen is blank after loading the driver, but the computer "works").

@lutzbichler

As of this writing, the amdgpu looks fine, but there is a regression with the i915 drivers (the screen is blank after loading the driver, but the computer "works").

The blank screen with i915 goes away for me after changing DIV_ROUND_DOWN_ULL(x, n) to ((unsigned long long)(x) / (n)).
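
For readers following along, the intent of the macro can be sketched in plain C. This is a minimal illustration built only from the replacement expression quoted above. The helper `whole_units()` is a hypothetical name introduced here for the example, and the thread does not quote the previous LinuxKPI definition, so it is not reproduced.

```c
#include <stdint.h>

/*
 * Replacement expression quoted in this thread: widen the dividend to
 * unsigned long long, then do a plain truncating division.
 */
#define DIV_ROUND_DOWN_ULL(x, n) ((unsigned long long)(x) / (n))

/*
 * Hypothetical helper, for illustration only: the number of whole
 * `unit`-sized pieces that fit in `total`.
 */
static unsigned long long
whole_units(uint64_t total, uint32_t unit)
{
    return (DIV_ROUND_DOWN_ULL(total, unit));
}
```

Under this definition, `DIV_ROUND_DOWN_ULL(10, 3)` evaluates to 3, that is, the quotient rounded down, computed in 64-bit unsigned arithmetic.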

@emaste
Member

emaste commented Mar 11, 2025

changing DIV_ROUND_DOWN_ULL(x, n) to ((unsigned long long)(x) / (n))

Hrm, indeed the current DIV_ROUND_DOWN is wrong. Do you want to submit a pull request against freebsd-src?

Fixes: c4e0746e7d5bd ("LinuxKPI: Add helper macros IS_ALIGNED and DIV_ROUND_DOWN_ULL.")

emaste added a commit to emaste/freebsd that referenced this pull request Mar 11, 2025
From freebsd/drm-kmod#344 (comment)
Fixes: c4e0746 ("LinuxKPI: Add helper macros IS_ALIGNED and DIV_ROUND_DOWN_ULL.")
@dumbbell
Member Author

The blank screen with i915 goes away for me after changing DIV_ROUND_DOWN_ULL(x, n) to ((unsigned long long)(x) / (n))

Indeed, looking at the two newly used macros was my next step on Saturday but you beat me to it.

Do you want to submit a pull request against freebsd-src?

No objections from me!

@emaste
Member

emaste commented Mar 12, 2025

I referenced this pull request from the change in my WIP testing tree. @lutzbichler if you submit a pull request I'll land that (so that it has proper author attribution); otherwise I'll edit the commit message to add a Reported-by.

emaste added a commit to emaste/freebsd that referenced this pull request Mar 12, 2025
From freebsd/drm-kmod#344 (comment)
Fixes: c4e0746 ("LinuxKPI: Add helper macros IS_ALIGNED and DIV_ROUND_DOWN_ULL.")
@emaste
Member

emaste commented Mar 12, 2025

With this applied to my work tree I get a panic, reproduced below. Looking.

panic: uma: item 0xfffff800012b0980 did not belong to zone malloc-32
cpuid = 16
time = 1741793462
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008fea7aa0
vpanic() at vpanic+0x136/frame 0xfffffe008fea7bd0
panic() at panic+0x43/frame 0xfffffe008fea7c30
item_ctor() at item_ctor+0x18d/frame 0xfffffe008fea7c80
malloc() at malloc+0x7d/frame 0xfffffe008fea7cc0
vdev_geom_io_start() at vdev_geom_io_start+0x24d/frame 0xfffffe008fea7cf0
zio_vdev_io_start() at zio_vdev_io_start+0x45e/frame 0xfffffe008fea7d40
zio_nowait() at zio_nowait+0x112/frame 0xfffffe008fea7d80
vdev_queue_io_done() at vdev_queue_io_done+0x228/frame 0xfffffe008fea7dd0
zio_vdev_io_done() at zio_vdev_io_done+0xc1/frame 0xfffffe008fea7e10
zio_execute() at zio_execute+0x7e/frame 0xfffffe008fea7e40
taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame 0xfffffe008fea7ec0
taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe008fea7ef0
fork_exit() at fork_exit+0x87/frame 0xfffffe008fea7f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008fea7f30
--- trap 0xf588abe6, rip = 0x9c4ba32861d75426, rsp = 0xdbbf81d47935a452, rbp = 0x7cbe89bd7ff4d4 ---
Uptime: 36s

@dumbbell
Member Author

dumbbell commented Mar 12, 2025

I’ve been running with that fix in my tree since last night and everything is stable so far. Note that I only load the i915 driver; I don’t actually use it most of the time. My external monitors are connected to the AMD GPU, if that makes a difference.

I also use ZFS on root FTR.

@emaste
Member

emaste commented Mar 12, 2025

For better or worse this is trivially reproducible for me -- panic: uma: item <ptr> did not belong to zone <zone> for zones malloc-32, zone 128 Bucket, malloc-2048

@lutzbichler

changing DIV_ROUND_DOWN_ULL(x, n) to ((unsigned long long)(x) / (n))

Hrm, indeed the current DIV_ROUND_DOWN is wrong. Do you want to submit a pull request against freebsd-src?

Fixes: c4e0746e7d5bd ("LinuxKPI: Add helper macros IS_ALIGNED and DIV_ROUND_DOWN_ULL.")

The attempt is here: freebsd/freebsd-src#1612
Not sure I did it right, as I have never done this before.

@emaste
Member

emaste commented Mar 12, 2025

The attempt is here: freebsd/freebsd-src#1612
Not sure I did it right, as I have never done this before.

Was just fine and I have landed it already. Thank you for tracking it down!

@emaste
Member

emaste commented Mar 12, 2025

@dumbbell are you building w/ INVARIANTS?

@emaste
Member

emaste commented Mar 12, 2025

I observed the panic on Meteor Lake (8086:7dd5 f111:0009). Same image seems to be functional on Tiger Lake (8086:9a49 f111:0001).

@dumbbell
Member Author

@dumbbell are you building w/ INVARIANTS?

Yes, I’m using a GENERIC kernel.

@aokblast

Hello, I tried the kernel and drm from your branch on Meteor Lake (the PCI ID is the same as @emaste's) by following your instructions. After logging into X, the screen looks weird, as the following picture shows:

[image: IMG_9742]

Also, in VT mode, I still see the problem I had with 6.6.

@emaste
Member

emaste commented Mar 25, 2025

@aokblast check the patch referenced in #332 (comment), I had similar corruption when testing 6.7 solved by that change

@emaste
Member

emaste commented Mar 25, 2025

@dumbbell can you add to the instructions the steps for building and installing the firmware?

On my test Raptor Lake Dell kldload gets stuck (waiting in linux_schedule_timeout() from intel_dp_wait_source_oui()).

vgapci0@pci0:0:2:0:	class=0x030000 rev=0x04 hdr=0x00 vendor=0x8086 device=0xa721 subvendor=0x1028 subdevice=0x0c1d
    vendor     = 'Intel Corporation'
    device     = 'Raptor Lake-P [UHD Graphics]'
    class      = display
    subclass   = VGA

@dumbbell
Member Author

@dumbbell can you add to the instructions the steps for building and installing the firmware?

Good idea, I updated the instructions for both the 6.7 and 6.8 updates.

@dumbbell
Member Author

On my test Raptor Lake Dell kldload gets stuck (waiting in linux_schedule_timeout() from intel_dp_wait_source_oui()).

vgapci0@pci0:0:2:0:	class=0x030000 rev=0x04 hdr=0x00 vendor=0x8086 device=0xa721 subvendor=0x1028 subdevice=0x0c1d
    vendor     = 'Intel Corporation'
    device     = 'Raptor Lake-P [UHD Graphics]'
    class      = display
    subclass   = VGA

@emaste: Is this a new behaviour compared to 6.7 or before?

Nicholas Kazlauskas and others added 8 commits March 29, 2025 17:48
[Why]
DMCUB can be in idle when we attempt to interface with the HW through
the GPINT mailbox resulting in a system hang.

[How]
Add dc_wake_and_execute_gpint() to wrap the wake, execute, sleep
sequence.

If the GPINT executes successfully then DMCUB will be put back into
sleep after the optional response is returned.

It functions similar to the inbox command interface.

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected]
Reviewed-by: Hansen Dsouza <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[WHY?]
Part of the dc_state interface that deals with adding streams and planes should
remain public, while others that deal with internal status' and subvp should be
private to DC.

[HOW?]
Move and rename the public functions to dc_state.h and private functions to
dc_state_priv.h. Also add some additional functions for extracting subvp meta
data from the state.

Reviewed-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[WHY?]
Phantom streams and planes were previously not referenced explicitly on creation.

[HOW?]
To reduce memory management complexity, add an additional phantom streams and planes
reference into dc_state, and move mall_stream_config to stream_status inside
the state to make it safe to modify in shallow copies. Also consolidates any logic
that is affected by this change to dc_state.

Reviewed-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[WHY&HOW]
Need to provide valid pointer to dc_state when getting subvp pipe type.

Reviewed-by: Alvin Lee <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[WHY&HOW]
After refactoring dc_state, it is always constructed at the time of its
creation. Construction can only happen after dc resources are initialized, so
move creation to be after this.

Reviewed-by: George Shen <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[WHY&HOW]
dml2_context should be deep copied from src to dst dc_state.

Reviewed-by: George Shen <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[WHY]
Previous fix for multiple displays downstream of DP2 MST hub caused regression

[HOW]
Match sink IDs instead of sink struct addresses

Reviewed-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Michael Strauss <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
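
The [HOW] above can be illustrated with a small sketch. It assumes a hypothetical `struct sink` with a `sink_id` field and two hypothetical helpers; none of these names are taken from the amdgpu display code, and the actual change may look different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a display sink; not the real DC structure. */
struct sink {
    unsigned int sink_id;   /* identifier of the physical sink */
};

/* Address comparison: true only when both pointers name the same object. */
static bool
same_sink_by_address(const struct sink *a, const struct sink *b)
{
    return (a == b);
}

/* ID comparison: also true for distinct objects describing the same sink. */
static bool
same_sink_by_id(const struct sink *a, const struct sink *b)
{
    return (a != NULL && b != NULL && a->sink_id == b->sink_id);
}
```

In this sketch, two separately allocated objects that carry the same `sink_id` compare equal by ID but not by address, which is the distinction the commit message relies on.
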
[Description]
There is a corner case where the symclk otg flag is cleared
when disabling the phantom pipe for subvp (because the phantom
and main pipe share the same link). This is undesired because
we need to maintain the correct symclk otg flag state for
the main pipe.

For now, clear the flag only for HDMI signal type, since
it's only set for HDMI signal type (phantom is virtual). The
ideal solution is to not clear it if the stream is phantom but
currently there's a bug that doesn't allow us to do this. Once
this issue is fixed the proper fix can be implemented.

Reviewed-by: Samson Tam <[email protected]>
Acked-by: Wayne Lin <[email protected]>
Signed-off-by: Alvin Lee <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
matt-auld and others added 12 commits March 29, 2025 17:49
Likely not a big deal for real users, but for consistency we should
respect the min_page_size here. Main issue is that bias allocations
turns into normal range allocation if the range and size matches
exactly, and in the next patch we want to add some unit tests for this
part of the api.

Signed-off-by: Matthew Auld <[email protected]>
Cc: Arunpravin Paneer Selvam <[email protected]>
Cc: Christian König <[email protected]>
Reviewed-by: Arunpravin Paneer Selvam <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Christian König <[email protected]>
[WHY]
Some eDP panels' ext caps don't write initial values. The value of
dpcd_addr (0x317) can be random and the backlight control interface
will be incorrect.

[HOW]
Add new panel patches to remove sink ext caps.

Cc: Mario Limonciello <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: [email protected] # 6.5.x
Cc: Tsung-hua Lin <[email protected]>
Cc: Chris Chi <[email protected]>
Reviewed-by: Wayne Lin <[email protected]>
Acked-by: Alex Hung <[email protected]>
Signed-off-by: Ryan Lin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Error in mmu_interval_notifier_insert() can leave a NULL
notifier.mm pointer. Catch that and return early.

Fixes: ed29c2691188 ("drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.")
Cc: <[email protected]> # v5.13+
[tursulin: Added Fixes and cc stable.]
Cc: Andi Shyti <[email protected]>
Cc: Shawn Lee <[email protected]>
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Tvrtko Ursulin <[email protected]>
(cherry picked from commit db7bbd13f08774cde0332c705f042e327fe21e73)
Signed-off-by: Joonas Lahtinen <[email protected]>
If drm_kms_helper_poll=n the output poll work will only get scheduled
from drm_helper_probe_single_connector_modes() to handle a delayed
hotplug event. Since polling is disabled the work in this case should
just call drm_kms_helper_hotplug_event() w/o detecting the state of
connectors and rescheduling the work.

After commit d33a54e3991d after a delayed hotplug event above the
connectors did get re-detected in the poll work and the work got
re-scheduled periodically (since poll_running is also false if
drm_kms_helper_poll=n), in effect ignoring the drm_kms_helper_poll=n
kernel param.

Fix the above by calling only drm_kms_helper_hotplug_event() for a
delayed hotplug event if drm_kms_helper_poll=n, as was done
before d33a54e3991d.

Cc: Dmitry Baryshkov <[email protected]>
Reported-by: Ville Syrjälä <[email protected]>
Fixes: d33a54e3991d ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reviewed-by: Dmitry Baryshkov <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
The icl+ power well code currently assumes that every AUX power
well maps to an encoder which is using said power well. That is
by no means guaranteed as we:
- only register encoders for ports declared in the VBT
- combo PHY HDMI-only encoders no longer get an AUX CH since
  commit 9856308c94ca ("drm/i915: Only populate aux_ch if really needed")

However we have places such as intel_power_domains_sanitize_state()
that blindly traverse all the possible power wells. So these bits
of code may very well encounter an aux power well with no associated
encoder.

In this particular case the BIOS seems to have left one AUX power
well enabled even though we're dealing with a HDMI only encoder
on a combo PHY. We then proceed to turn off said power well and
explode when we can't find a matching encoder. As a short term fix
we should be able to just skip the PHY related parts of the power
well programming since we know this situation can only happen with
combo PHYs.

Another option might be to go back to always picking an AUX CH for
all encoders. However I'm a bit wary about that since we might in
theory end up conflicting with the VBT AUX CH assignment. Also
that wouldn't help with encoders not declared in the VBT, should
we ever need to poke the corresponding power wells.

Longer term we need to figure out what the actual relationship
is between the PHY vs. AUX CH vs. AUX power well. Currently this
is entirely unclear.

Cc: [email protected]
Fixes: 9856308c94ca ("drm/i915: Only populate aux_ch if really needed")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10184
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Imre Deak <[email protected]>
(cherry picked from commit 6a8c66bf0e565c34ad0a18f820e0bb17951f7f91)
Signed-off-by: Joonas Lahtinen <[email protected]>
The DSC HW state of DP connectors is read out during driver loading and
system resume in intel_modeset_update_connector_atomic_state(). This
function is called for all connectors though and so the state of DSI
connectors will also get updated incorrectly, triggering a WARN there
wrt. the DSC decompression AUX device.

Fix the above by moving the DSC state readout to a new DP connector
specific sync_state() hook. This is anyway the logical place to update
the connector object's state vs. the connector's atomic state.

Fixes: b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
Reported-and-tested-by: Drew Davenport <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]
Reviewed-by: Ankit Nautiyal <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit a62e145981500996ea76af3d740ce0c0d74c5be0)
Signed-off-by: Joonas Lahtinen <[email protected]>
Move psr_init_dpcd() from init-connector to connector-detect
function. The dpcd probe for checking panel replay capability
for external dp connector is causing delay during boot which can
be optimized by moving dpcd probe to connector specific detect().

v1: Initial version.
v2: Add details in commit description. [Jani]

Suggested-by: Ville Syrjälä <[email protected]>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10284
Signed-off-by: Animesh Manna <[email protected]>
Fixes: cceeaa312d39 ("drm/i915/panelreplay: Enable panel replay dpcd initialization for DP")
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 1cca19bf296fae0636a637b48d195ac6b4d430c9)
Signed-off-by: Joonas Lahtinen <[email protected]>
Add an if condition for gfx activity because the scaling has been changed after smu fw version 5d4600.
And remove a warning log.

Signed-off-by: Li Ma <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.7.x
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:6683 amdgpu_dm_connector_funcs_force()
warn: variable dereferenced before check 'dc_link' (see line 6663)

Fixes: 967176179215 ("drm/amd/display: fix null-pointer dereference on edid reading")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Melissa Wen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Need to check the offset bits for values greater than 255.

v2: also update amdgpu_dm_connector values.

Suggested-by: Mano Ségransan <[email protected]>
Tested-by: Mano Ségransan <[email protected]>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3203
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Fix the pwm_mode value error which is used for
pwm1_enable setting

Signed-off-by: Ma Jun <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Differences were reviewed using e.g.:

    diff -Nau -pX scripts/diffignore \
      drm-kmod/drivers/gpu/drm/ \
      linux/drivers/gpu/drm/

    diff -Naur -pX scripts/diffignore \
      drm-kmod/drivers/gpu/drm/amd/amdgpu/ \
      linux/drivers/gpu/drm/amd/amdgpu/

    diff -Naur -pX scripts/diffignore \
      drm-kmod/drivers/gpu/drm/i915/ \
      linux/drivers/gpu/drm/i915/

    diff -Naur -pX scripts/diffignore \
      drm-kmod/include/drm/ \
      linux/include/drm/
@dumbbell force-pushed the update-to-linux-6.8 branch from 18b3846 to b3074b8 on March 29, 2025 16:54
@aokblast

aokblast commented Apr 1, 2025

@aokblast check the patch referenced in #332 (comment), I had similar corruption when testing 6.7 solved by that change

Tried it. The screen is torn without firmware. With firmware, the screen goes blank.

@slw

slw commented Apr 1, 2025

I am trying this on a ThinkBook 14 G6+ IM

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x08 hdr=0x00 vendor=0x8086 device=0x7d55 subvendor=0x17aa subdevice=0x384c
    vendor     = 'Intel Corporation'
    device     = 'Meteor Lake-P [Intel Arc Graphics]'
    class      = display
    subclass   = VGA
    cap 09[40] = vendor (length 12) Intel cap 0 version 1
    cap 10[70] = PCI-Express 2 root endpoint max data 128(128) FLR RO NS
                 max read 128
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 05[ac] = MSI supports 1 message, 64 bit, vector masks enabled with 1 message
    cap 01[d0] = powerspec 3  supports D0 D3  current D0
    ecap 0000[100] = unknown 1
    ecap 001b[110] = Process Address Space ID 1
    ecap 000f[200] = ATS 1
    ecap 0015[420] = Resizable BAR 1
    ecap 0010[320] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
                     0 VFs configured out of 7 supported
                     First VF RID Offset 0x0001, VF RID Stride 0x0001
                     VF Device ID 0x7d55
                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    ecap 0018[400] = LTR 1

After trying kldload i915kms, I got this dmesg output:

[1640] iic0: <I2C generic I/O> on iicbus0
[1640] iic1: <I2C generic I/O> on iicbus1
[1640] iic2: <I2C generic I/O> on iicbus2
[1640] iic3: <I2C generic I/O> on iicbus3
[1640] iic4: <I2C generic I/O> on iicbus4
[1641] <6>[drm] Got Intel graphics stolen memory base 0x0, size 0x0
[1641] drmn0: <drmn> on vgapci0
[1641] vgapci0: child drmn0 requested pci_enable_io
[1641] vgapci0: child drmn0 requested pci_enable_io
[1641] drmn0: [drm] GT0: Incompatible option enable_guc=-1 - undocumented flag
[1641] drmn0: [drm] GT1: Incompatible option enable_guc=-1 - undocumented flag
[1641] i915/mtl_dmc.bin: could not load binary firmware /boot/firmware/i915/mtl_dmc.bin either
[1641] mtl_dmc.bin: could not load binary firmware /boot/firmware/mtl_dmc.bin either
[1641] i915_mtl_dmc.bin: could not load binary firmware /boot/firmware/i915_mtl_dmc.bin either
[1641] drmn0: successfully loaded firmware image 'i915/mtl_dmc.bin'
[1641] drmn0: [drm] Finished loading DMC firmware i915/mtl_dmc.bin (v2.23)
[1641] lkpi_iic0: <LinuxKPI I2C> on drmn0
[1641] iicbus5: <Philips I2C bus> on lkpi_iic0
[1641] iic5: <I2C generic I/O> on iicbus5
[1641] lkpi_iic1: <LinuxKPI I2C> on drmn0
[1641] iicbus6: <Philips I2C bus> on lkpi_iic1
[1641] iic6: <I2C generic I/O> on iicbus6
[1641] lkpi_iic2: <LinuxKPI I2C> on drmn0
[1641] iicbus7: <Philips I2C bus> on lkpi_iic2
[1641] iic7: <I2C generic I/O> on iicbus7
[1641] lkpi_iic3: <LinuxKPI I2C> on drmn0
[1641] iicbus8: <Philips I2C bus> on lkpi_iic3
[1641] iic8: <I2C generic I/O> on iicbus8
[1641] lkpi_iic4: <LinuxKPI I2C> on drmn0
[1641] iicbus9: <Philips I2C bus> on lkpi_iic4
[1641] iic9: <I2C generic I/O> on iicbus9
[1641] lkpi_iic5: <LinuxKPI I2C> on drmn0
[1641] iicbus10: <Philips I2C bus> on lkpi_iic5
[1641] iic10: <I2C generic I/O> on iicbus10
[1641] lkpi_iic6: <LinuxKPI I2C> on drmn0
[1641] iicbus11: <Philips I2C bus> on lkpi_iic6
[1641] iic11: <I2C generic I/O> on iicbus11
[1641] lkpi_iic7: <LinuxKPI I2C> on drmn0
[1641] iicbus12: <Philips I2C bus> on lkpi_iic7
[1641] iic12: <I2C generic I/O> on iicbus12
[1641] lkpi_iic8: <LinuxKPI I2C> on drmn0
[1641] iicbus13: <Philips I2C bus> on lkpi_iic8
[1641] iic13: <I2C generic I/O> on iicbus13

and kldload is stuck at:
load: 0.01 cmd: kldload 5359 [sched] 8.45r 0.00u 0.67s 2% 2372k
mi_switch+0x172 sleepq_switch+0x109 sleepq_timedwait+0x4b linux_add_to_sleepqueue+0x92 linux_schedule_timeout+0x7b intel_dp_wait_source_oui+0xea intel_dp_aux_init_backlight_funcs+0xb6 intel_backlight_init_funcs+0x9b intel_panel_init+0x24 intel_dp_init_connector+0xced intel_ddi_init_dp_connector+0x93 intel_ddi_init+0xac4 intel_bios_for_each_encoder+0x35 intel_setup_outputs+0x216 intel_display_driver_probe_nogem+0x24c i915_driver_probe+0x4e8 linux_pci_attach_device+0x430 device_attach+0x45b

@olevole

olevole commented Apr 3, 2025

no luck: blank screen or crash. kldload stuck:

root@oybsd:~ # kldload snp
load: 0.08  cmd: kldload 5411 [kldbusy] 2.37r 0.00u 0.00s 0% 2304k
load: 0.07  cmd: kldload 5411 [kldbusy] 4.71r 0.00u 0.00s 0% 2304k
load: 0.07  cmd: kldload 5411 [kldbusy] 6.09r 0.00u 0.00s 0% 2304k
load: 0.07  cmd: kldload 5411 [kldbusy] 6.80r 0.00u 0.00s 0% 2304k
^C

in dmesg:

...
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
drmn0: [drm] GT0: Resetting chip for stopped heartbeat on bcs'0
drmn0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: GUC: submission enabled
drmn0: [drm] GT0: GUC: SLPC enabled
drmn0: [drm] GPU HANG: ecode 12:0:00000000
...

Config:

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x08 hdr=0x00 vendor=0x8086 device=0x7d55 subvendor=0x17aa subdevice=0x3f96
    vendor     = 'Intel Corporation'
    device     = 'Meteor Lake-P [Intel Arc Graphics]'
    class      = display
    subclass   = VGA
CPU microcode: updated from 0x1c to 0x20
CPU: Intel(R) Core(TM) Ultra 7 155H (2995.20-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0xa06a4  Family=0x6  Model=0xaa  Stepping=4
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x239c27eb<FSGSBASE,TSCADJ,BMI1,AVX2,FDPEXC,SMEP,BMI2,ERMS,INVPCID,NFPUSG,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,PROCTRACE,SHA>
  Structured Extended Features2=0x994007bc<UMIP,PKU,OSPKE,WAITPKG,GFNI,VAES,VPCLMULQDQ,RDPID,MOVDIRI,MOVDIR64B>
  Structured Extended Features3=0xfc18c410<FSRM,MD_CLEAR,IBT,IBPB,STIBP,L1DFL,ARCH_CAP,CORE_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0xd89fd6b<RDCL_NO,IBRS_ALL,SKIP_L1DFL_VME,MDS_NO,TAA_NO>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 17179869184 (16384 MB)
avail memory = 15853375488 (15118 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <LENOVO CB-01   >
WARNING: L3 data cache covers more APIC IDs than a package (6 > 3)
FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
FreeBSD/SMP: Non-uniform topology
..

( CPU Hyper-threading: off )

@olevole

olevole commented Apr 3, 2025

hw.i915kms.enable_guc=0

helps, but there is no acceleration

@emaste
Member

emaste commented Apr 10, 2025

@emaste: Is this a new behaviour compared to 6.7 or before?

I believe I saw this with 6.7 on this machine as well, but will have to double-check.

@emaste
Member

emaste commented Apr 10, 2025

I haven't been able to reproduce the panic again. I do have an interesting update on the corruption (on Raptor Lake-P [UHD Graphics]). I ran startx and observed the corruption as @aokblast reported, then left the machine idle until the screen blanked. After cursor movement and unblanking, the display was fine. Then, when watching a video in Firefox, switching window focus would lead to a different kind of corruption on each switch.

@emaste
Member

emaste commented Apr 10, 2025

> I haven't been able to reproduce the panic again.

I think I may have confused myself about the issues that appeared on various machines. The Dell laptop (Raptor Lake) has the video corruption and no panic. The Framework Core Ultra (Meteor Lake) is the one that panicked, and still does.

@emaste
Member

emaste commented Apr 11, 2025

I'm now running this on my daily driver, with

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a49 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'TigerLake-LP GT2 [Iris Xe Graphics]'
    class      = display
    subclass   = VGA

I saw corruption like @aokblast reported when first starting X, but it stopped after switching to vty0 and back and it has been "fine" since.

@slw

slw commented Apr 11, 2025

With hw.i915kms.enable_guc=0 set, kldload i915kms panics. core.txt attached
core.txt

@emaste
Member

emaste commented Apr 11, 2025

@slw I saw that panic (address %p(%p) has not been allocated) on my Meteor Lake as well, immediately upon load. Sometimes it is that panic, sometimes the "did not belong to zone" one.

@slw

slw commented Apr 11, 2025

> @slw I saw that panic (address %p(%p) has not been allocated) on my Meteor Lake as well, immediately upon load. Sometimes it is that panic, sometimes the "did not belong to zone" one.

Yes, this is Meteor Lake too:

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x08 hdr=0x00 vendor=0x8086 device=0x7d55 subvendor=0x17aa subdevice=0x384c
    vendor     = 'Intel Corporation'
    device     = 'Meteor Lake-P [Intel Arc Graphics]'
    class      = display
    subclass   = VGA

@evadot
Contributor

evadot commented Apr 11, 2025

> hw.i915kms.enable_guc=0 and kldload i915kms got a panic. core.txt attached

I'm pretty sure that GuC is mandatory for this GPU.

@slw

slw commented Apr 11, 2025

> I'm pretty sure that GuC is mandatory for this GPU.

Core with enable_guc=3 attached:
core.3.txt

@benjsc

benjsc commented Apr 14, 2025

Noting that all patches are now struck out as complete, does this mean https://github.com/freebsd/freebsd-src main can now be used as opposed to the drm-related-linuxkpi-changes branch of dumbbell's repo?

@dumbbell
Member Author

> Noting that all patches are now struck out as complete, does this mean https://github.com/freebsd/freebsd-src main can now be used as opposed to the drm-related-linuxkpi-changes branch of dumbbell's repo?

No, because freebsd-src patches for DRM in Linux 6.7 are still being reviewed and improved (see #332). This pull request depends on them too.

@benjsc

benjsc commented Apr 16, 2025

With 6.6 crashing, sometimes multiple times daily, due to #333 and hence failing the WIFE factor, I decided to give 6.8 a try; results so far are pretty good. This box is a daily media server, inet router and mythtv frontend using vaapi and opengl.
I've had one crash, but that was with Xorg on continuous restart whilst I was rebuilding mythtv.

I'm running:

FreeBSD soyo.clearchain.com 15.0-CURRENT FreeBSD 15.0-CURRENT #1 drm-related-linuxkpi-changes-n276426-95656f357432: Mon Apr 14 22:24:59 ACST 2025     [email protected]:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

/root/work/freebsd/drm-kmod being:
commit b3074b83b5e121ca7da7c249db0d44d7b46a946d (HEAD -> update-to-linux-6.8, dumbbell/update-to-linux-6.8)

And firmware being:
commit 7b3039ebb986b79206ef55a3e7d876043c548a90 (HEAD -> drm-6.7, origin/drm-6.7)

on Alder Lake hardware:

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x46d1 subvendor=0x0000 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'Alder Lake-N [UHD Graphics]'
    class      = display
    subclass   = VGA

with:
hw.i915kms.enable_guc: 2

drmn0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.36.0
drmn0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
drmn0: [drm] GT0: HuC: authenticated for all workloads
drmn0: [drm] GT0: GUC: submission disabled
drmn0: [drm] GT0: GUC: SLPC disabled
[drm] Initialized i915 1.6.0 20230929 for drmn0 on minor 0

I do get the corrupted color space issue that emaste had in #332 (comment), but not at boot. The VT is clean; when Xorg starts, the colorspace becomes corrupted. After switching to the VT, the colorspace is still broken; after switching back to Xorg, it is initially corrupt and then restores correctly.

Crash was:

image
image
image

So far seems very stable - nice work!
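
For reference, the `hw.i915kms.enable_guc` values mentioned in this thread (0, 2 and 3) can be read as a bitmask, assuming the FreeBSD port keeps the upstream Linux i915 semantics for this parameter (-1 = auto, 0 = disabled, bit 0 = GuC submission, bit 1 = HuC load). A minimal sketch under that assumption; `decode_enable_guc` is a hypothetical helper and not part of drm-kmod:

```shell
#!/bin/sh
# Hypothetical helper (not part of drm-kmod): decode an enable_guc value.
# Assumes upstream Linux i915 semantics: -1 = auto, 0 = disabled,
# bit 0 (value 1) = GuC submission, bit 1 (value 2) = HuC load.
decode_enable_guc() {
    v=$1
    if [ "$v" -lt 0 ]; then
        echo "auto"
        return 0
    fi
    if [ "$v" -eq 0 ]; then
        echo "disabled"
        return 0
    fi
    out=""
    if [ $((v & 1)) -ne 0 ]; then
        out="GuC submission"
    fi
    if [ $((v & 2)) -ne 0 ]; then
        # Append to any earlier entry, separated by " + ".
        out="${out:+$out + }HuC load"
    fi
    echo "$out"
}

# On a running system (assumes the sysctl is present, as in the output above):
#   decode_enable_guc "$(sysctl -n hw.i915kms.enable_guc)"
```

Read this way, the value 2 reported above is consistent with the log lines showing HuC authenticated and GuC submission disabled.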
