ThinkPad T480 with eGPU on Linux
tl;dr
The internal dGPU needs to be disabled, through an ACPI table overlay.

For some reason I want to use an eGPU with my laptop. The eGPU is an RTX 2080 Ti, connected to my computer through a Thunderbolt 3 eGPU enclosure. There were a few issues preventing me from using the eGPU smoothly.
Origin
The dedicated GPU on the T480 (an MX150) is not powerful enough to drive a modern AAA game or a small LLM. I need an upgrade without replacing the existing hardware.
I admit building a new custom PC would be a better solution, but the budget was tight.
Major hardware and OS
- ThinkPad T480 with i7-8550U and MX150
- Cheap eGPU enclosure board (no power supply; costs around $100)
- NVIDIA RTX 2080TI
- Arch Linux
The problem
With Windows LTSC 2019, everything works fine. But under Linux, GDM freezes shortly after I connect the eGPU enclosure. nvidia-smi hangs, and the kernel log reads "RmInitAdapter failed! (0xXX:0xXX:XXXX)" (the Xs are hex digits).
If I reboot with the eGPU already attached, GDM and nvidia-smi work fine, but the eGPU is somehow unusable.
I thought it was a Thunderbolt issue and applied some quirks, but the problem persisted.
I tried the open-source kernel modules, and they work, at least without killing GDM. But the warnings in the kernel log are annoying: the MX150 is not supported by the open-source modules. The MX150 also keeps draining power, and I'm not comfortable with that.
So, in a nutshell: I tried enabling early KMS, which was not a solution. I tried different graphics drivers, and each one just replaces the problem with a new one. Some people on the forums say a BIOS update could fully fix the problem, but my machine is already running the latest BIOS.
Anyway, I would like to disable the internal MX150. I know there must be a way, since the Bumblebee project can fully shut down the GPU with some hacks. However, the Bumblebee drivers are legacy now, so I need another solution.
To turn off the internal dGPU
Luckily I came across an article by Major Hayden describing a way to disable the NVIDIA GPU on a T490 through ACPI modding. I figured it would be similar on a T480.
Following the instructions, I dumped and decompiled the ACPI tables and found the values to change. A compact record is below, but I recommend reading Major Hayden's article; it is quite a journey.
# install tools
sudo pacman -S acpica
# dump ACPI tables
sudo acpidump -b
# decompile the tables and move raw data elsewhere
iasl -d *.dat
mkdir raw && mv *.dat raw/
# try to find a string containing "GPU"
for f in *.dsl; do grep -qi "GPU" "$f" && echo "$f"; done
# check the surrounding code context by hand
# and apply the mods
Running grep on the .dsl files in the directory shows some mentions in ssdt15.dsl:
Method (GC6O, 0, Serialized)
{
    LKD1 = Zero
    \_SB.PCI0.LPCB.GEVT = Zero
    \_SB.PCI0.LPCB.TXDS = Zero
    While ((\_SB.PCI0.LPCB.FBEN != Zero))
    {
        Sleep (One)
    }

    \_SB.PCI0.LPCB.GEVT = One
    While ((LKS1 < 0x07))
    {
        Sleep (One)
    }

    LREN = LTRE /* \_SB_.PCI0.RP01.LTRE */
    CEDR = One
    \_SB.PCI0.LPCB.EC.GPUT = Zero
    Sleep (0x64)
}
More information about DSDT/ASL and the ACPI source language can be found on the Arch Wiki: DSDT.
I can tell there is something here. Anyway, I'll take a leap.
Continuing with Major Hayden's instructions, I located the related HGON () calls, surrounded them with conditional blocks, and updated the version string. Here is my patch.
-DefinitionBlock ("", "SSDT", 2, "LENOVO", "SgPch", 0x00001000)
+DefinitionBlock ("", "SSDT", 2, "LENOVO", "SgPch", 0x00001001)
{
External (_PR_.PR00._PSS, MethodObj) // 0 Arguments
External (_SB_.PCI0, DeviceObj)
PCMR = 0x07
PWRS = Zero
Sleep (0x10)
- \_SB.PCI0.HGON ()
+ // Set kernel param `acpi_osi='T480-Hybrid-Graphics'` to enable dGPU
+ If (\_OSI ("T480-Hybrid-Graphics"))
+ {
+ \_SB.PCI0.HGON ()
+ }
+ Else
+ {
+ \_SB.PCI0.HGOF ()
+ }
_STA = One
DGIZ = One
}
Method (_ON, 0, Serialized) // _ON_: Power On
{
- \_SB.PCI0.HGON ()
+ // Set kernel param `acpi_osi='T480-Hybrid-Graphics'` to enable dGPU
+ If (\_OSI ("T480-Hybrid-Graphics"))
+ {
+ \_SB.PCI0.HGON ()
+ }
+ Else
+ {
+ \_SB.PCI0.HGOF ()
+ }
Return (Zero)
}
Now we need to compile the modified table and create a CPIO image so the kernel applies it at boot.
# compile the dsl, then get aml and hex
iasl -tc ssdt15.dsl
# create the cpio image structure
mkdir -p kernel/firmware/acpi
cp ssdt15.aml kernel/firmware/acpi/
find kernel | cpio --create -H newc > ssdt_override.img
I boot with an EFI stub, so I need to update the Unicode string (the kernel command line) stored in the boot entry. Remember to change the placeholders.
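As a rough sketch (not my exact command; everything in angle brackets is a placeholder for your setup), recreating the EFI stub entry with efibootmgr looks like this. The override image must appear in the initrd list before the real initramfs:

```shell
# Hypothetical example -- adjust disk, partition, UUID, and paths.
efibootmgr --create --disk /dev/<esp-disk> --part <esp-part> \
  --label "Arch Linux" --loader /vmlinuz-linux \
  --unicode 'root=UUID=<root-uuid> rw initrd=\ssdt_override.img initrd=\initramfs-linux.img'
```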
Then reboot the computer and check the logs:
# dmesg | egrep -i "ssdt|dsdt"
[ 0.021941] ACPI: Table Upgrade: override [SSDT-LENOVO- SgPch]
[ 0.021944] ACPI: SSDT 0x000000005B5B3000 Physical table override, new table: 0x00000000599F3000
[ 0.021948] ACPI: SSDT 0x00000000599F3000 00176A (v02 LENOVO SgPch 00001001 INTL 20221020)
and check the PCIe devices with lspci.
Success! I can no longer see the MX150 attached.
Now you can add acpi_osi='T480-Hybrid-Graphics' to your kernel command line whenever you want to use your NVIDIA card.
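If you boot through GRUB rather than a bare EFI stub, the same parameter can go in /etc/default/grub. This is only a sketch (quoting of the OSI string may need adjusting for your bootloader), and you must regenerate grub.cfg afterwards:

```shell
# /etc/default/grub (excerpt): append the OSI string to enable the dGPU
GRUB_CMDLINE_LINUX_DEFAULT="quiet acpi_osi='T480-Hybrid-Graphics'"

# then regenerate the config
grub-mkconfig -o /boot/grub/grub.cfg
```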
Again, credits to Major Hayden 🥳.
What about the eGPU?
Oh yes, this is supposed to be an article about an eGPU on a T480. After disabling the internal dGPU, a.k.a. the MX150, the proprietary driver works as well.
To summarize, all this mess could have been avoided if NVIDIA had good Linux driver support (on Windows LTSC 2019 it's fine, after all). Disabling the MX150 fully works around the problem (and has the potential to save more battery). Apart from that, this is just plug-and-play.
Aren't there any issues with the eGPU?
PRIME
No issue.
Deep learning
Works great.
Hot-unplug
This is on GDM/GNOME/Wayland/Arch Linux.
Hot-unplug still causes issues: after unplugging, the computer won't shut down properly.
I found a near-perfect solution: add a custom udev rule file to tag the eGPU as ignored by mutter. PRIME remains usable, but you'll lose output on the eGPU.
# card0: UHD620
# card1: NVIDIA MX150 if available, otherwise egpu
# cardN: eGPUs
ENV{DEVNAME}=="/dev/dri/card0", TAG+="mutter-device-preferred-primary"
# Use this line to let mutter ignore NVIDIA devices
SUBSYSTEM=="drm", DRIVERS=="nvidia", TAG+="mutter-device-ignore"
Another solution: when you want to detach the eGPU, stop every program using any resource on it, including GDM/gnome-shell, then unload the driver modules, and finally unplug the eGPU.
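That procedure can be sketched roughly as below. This is an assumption-laden outline, not a tested script: the module names are the usual proprietary-driver set, and your display manager may differ.

```shell
# Stop everything holding the eGPU (this kills your graphical session!)
sudo systemctl stop gdm
# Kill any leftover processes still holding the device nodes
sudo fuser -k /dev/nvidia*
# Unload the proprietary driver modules in dependency order
sudo modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia
# Now it should be safe to unplug the enclosure
```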
eGPU output
Well, on GNOME it's plug-and-play, with one performance caveat: you need to add a custom udev rule to run the session on the eGPU, and a re-login is required. Otherwise, the output on the eGPU will have a very low framerate and feel laggy. This is a known issue.
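For reference, a hypothetical version of that rule, mirroring the mutter tag used earlier: it marks the eGPU's DRM node as the preferred primary GPU. The card number is an assumption and depends on your setup; drop the rule in something like /etc/udev/rules.d/61-mutter-primary-gpu.rules, then log out and back in.

```
# e.g. card2 is the eGPU; tag it as mutter's primary GPU
ENV{DEVNAME}=="/dev/dri/card2", TAG+="mutter-device-preferred-primary"
```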