#1 Yesterday 17:01:09

reloadedd
Member
Registered: 2023-07-06
Posts: 16

Freeze after login whenever upgrading nvidia/kernel

Hi everyone.

This issue is exactly this one, posted by me, 8 months ago: https://bbs.archlinux.org/viewtopic.php?id=309115. TLDR: Whenever I upgrade the kernel and/or nvidia driver, at next reboot after logging in (i.e. typing user & password and hitting ENTER) => the system completely freezes. Sometimes SysRq reboot works, sometimes it doesn't, therefore I have to hold the power button of the laptop to manually power it off.

A journalctl of a failed attempt (on this boot the whole system froze and I was able to reboot using SysRq reboot) -- also, in this attempt I've tried the latest kernel (7.0.10-arch1-1) and the latest nvidia-open-dkms (610.43.02) versions: https://paste.c-net.org/QuickerNicest.

The setup with which I was stuck for the past 8 months was nvidia-dkms (580.82.09-1) and linux kernel (6.16.0-arch2-1). For the last few days I've tried different combinations of kernel & nvidia driver versions. I've managed to upgrade my nvidia driver version, but not the kernel. I currently have the latest nvidia-580xx-dkms version installed:

➜  ~ pacman -Qi nvidia-580xx-
…ia-580xx-dkms  (580.159.04-1)  …ia-580xx-settings  (580.159.04-1)  …ia-580xx-utils  (580.159.04-1)
# This is fish shell, I've pressed TAB in the command above for the shell to perform an autocomplete that shows the packages and their versions, not actually executed the command.

How did I end up installing this package? By following the ArchWiki.

~ lspci -k -d ::03xx
# ... SKIPPING THE INTEGRATED GPU ...
0000:01:00.0 VGA compatible controller: NVIDIA Corporation AD107M [GeForce RTX 4060 Max-Q / Mobile] (rev a1)
       Subsystem: ASUSTeK Computer Inc. Device 20bd
       Kernel driver in use: nvidia
       Kernel modules: nouveau, nvidia_drm, nvidia

My GPU is AD107M, part of the NV190 family (Ada Lovelace). According to the table in the Wiki, I need to install either nvidia-open or nvidia-580xx-dkms (I've also tried nvidia-open). There is also a note on the Wiki:

NVIDIA's GSP firmware is known to cause issues, from suboptimal power management of Turing GPUs to complete failure on some laptops containing Ampere GPUs. If affected, use the proprietary driver (e.g. nvidia-580xx-dkmsAUR) with the module parameter NVreg_EnableGpuFirmware=0 instead.

I've also set the kernel parameter NVreg_EnableGpuFirmware=0, but to no avail: nothing changed. I've also tried different kernels with the current nvidia driver version using the downgrade tool:
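For what it's worth, NVreg_EnableGpuFirmware is a parameter of the nvidia module rather than a generic kernel parameter, so on the kernel command line it only takes effect with the module prefix (nvidia.NVreg_EnableGpuFirmware=0); the other common route is an options line in a modprobe.d file. A minimal sketch for checking which value the loaded driver actually uses, assuming the proprietary module exposes its settings in /proc/driver/nvidia/params. The helper name gsp_param and the file name in the comment are made up for illustration:

```shell
#!/bin/sh
# Print the EnableGpuFirmware value the loaded nvidia module reports.
# The params file path can be overridden (hypothetical helper).
gsp_param() {
    params_file="${1:-/proc/driver/nvidia/params}"
    awk -F': *' '$1 == "EnableGpuFirmware" { print $2 }' "$params_file"
}

# Persistent setting as a modprobe.d line (file name is an assumption):
#   /etc/modprobe.d/nvidia-gsp.conf
#   options nvidia NVreg_EnableGpuFirmware=0
# Kernel command line alternative (note the module prefix):
#   nvidia.NVreg_EnableGpuFirmware=0

# Only query the live system when the driver is loaded.
if [ -r /proc/driver/nvidia/params ]; then
    echo "EnableGpuFirmware: $(gsp_param)"
fi
```

If this prints 0, the parameter reached the module. If it prints anything else, the setting was not picked up; when the nvidia modules are loaded from the initramfs, the initramfs probably has to be regenerated after changing the modprobe.d file.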

-  631)  linux    7.0.10.arch1   1  /var/cache/pacman/pkg  2026-05-23 
-  608)  linux    6.18.9.arch1   2  /var/cache/pacman/pkg  2026-02-09
-  594)  linux    6.17.9.arch1   1  /var/cache/pacman/pkg  2025-11-2
-  581)  linux    6.16.10.arch1  1  /var/cache/pacman/pkg  2025-10-02

Those are the kernel versions that I've tested, and all of them resulted in the same freeze after login. It seems that I'm stuck with 6.16.0-arch2-1 for now (... and have been for almost a year). It's also surprising to me that it froze with kernel version 6.16.10 as well, since that's still 6.16.x.
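Since 6.16.0 works and 6.16.10 does not, the point releases in between are the obvious place to narrow this down. A rough sketch, assuming the Arch Linux Archive naming scheme (packages/<first letter>/<name>/<name>-<version>-x86_64.pkg.tar.zst). The helper name archive_url and the version in the usage comment are illustrative, so check the archive listing for the builds that really exist:

```shell
#!/bin/sh
# Build the Arch Linux Archive URL for a package build (hypothetical helper).
# $1 = package name, $2 = full version including pkgrel, e.g. 6.16.5.arch1-1
archive_url() {
    pkg="$1"
    ver="$2"
    first=$(printf '%s' "$pkg" | cut -c1)
    printf 'https://archive.archlinux.org/packages/%s/%s/%s-%s-x86_64.pkg.tar.zst\n' \
        "$first" "$pkg" "$pkg" "$ver"
}

# Example (the version is an assumption, not verified against the archive).
# The dkms module can only be built if the matching headers are installed too:
#   v=6.16.5.arch1-1
#   sudo pacman -U "$(archive_url linux "$v")" "$(archive_url linux-headers "$v")"
```

Testing the version halfway between the last good and the first bad kernel, then repeating on the remaining half, should find the first failing release in three or four reboots.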

Current system specs & versions:

➜ ~ neofetch
OS: Arch Linux x86_64
DE: Plasma 6.6.5
WM: kwin
CPU: 13th Gen Intel i9-13980HX (32) @ 5.400GHz
GPU: Intel Raptor Lake-S UHD Graphics
GPU: NVIDIA GeForce RTX 4060 Max-Q / Mobile

➜  ~ nvidia-smi
Tue Jun  2 19:58:00 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.159.04             Driver Version: 580.159.04     CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+

# Laptop model
ASUSTeK COMPUTER INC. ROG Strix G614JV_G614JV/G614JV, BIOS G614JV.333 10/01/2025

Offline

#2 Yesterday 20:01:11

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 25,204

Re: Freeze after login whenever upgrading nvidia/kernel

It "looks" like things are mostly fine, nothing really screaming nvidia issue here. What happens if you simplify the running services? e.g. try disabling supergfxd,asusd,laptop-mode-tools and friends. The old microcode message could potentially be contributing, you might want to set up https://wiki.archlinux.org/title/Microcode and/or update your UEFI to make sure you have the newer revisions.
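A quick way to test this without uninstalling anything is to stop and disable the daemons for one boot. A hedged sketch: the unit names below (supergfxd, asusd, laptop-mode) are assumptions based on the package names, and the function only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Stop and disable a set of candidate services (unit names are assumptions).
# With DRY_RUN=1 (the default) the commands are only printed.
disable_units() {
    for unit in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "systemctl disable --now $unit"
        else
            systemctl disable --now "$unit"
        fi
    done
}

disable_units supergfxd.service asusd.service laptop-mode.service
```

Running it once as is shows what would happen; running it as root with DRY_RUN=0 applies the change, and "systemctl enable --now" on the same units reverts it.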

Offline

#3 Yesterday 20:47:57

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,718

Re: Freeze after login whenever upgrading nvidia/kernel

Whenever I upgrade the kernel and/or nvidia driver, at next reboot after logging in (i.e. typing user & password and hitting ENTER) => the system completely freezes. Sometimes SysRq reboot works

And the problem is completely gone w/ a subsequent reboot?
You *have* tested only changing the nvidia driver, w/o altering the kernel (whether updating the driver or switching between nvidia-open and 580xx-dkms)?

May 31 10:02:21 Mithras kernel: nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes

There're no outputs attached to the nvidia GPU.

The log has a login on TTY3 (after fatfingering the password ;) ) but there seems no attempt to start a session from SDDM?
You'd typically log into plasma/wayland? Is this also a problem w/
- plasma/X11
- weston
- openbox
?

May 31 10:02:21 Mithras kernel:  nvme0n1: p1 p2 p3 p4 p5
May 31 10:02:21 Mithras kernel:  nvme1n1: p1 p2 p3 p4

Sanity check: How many parallel windows installations are there?

Offline

#4 Today 21:59:37

reloadedd
Member
Registered: 2023-07-06
Posts: 16

Re: Freeze after login whenever upgrading nvidia/kernel

V1del wrote:

What happens if you simplify the running services? e.g. try disabling supergfxd,asusd,laptop-mode-tools and friends.

Nothing, apparently. This is the list of everything that I've uninstalled (pacman -Rus ...) -- did I miss any friend?:

Name            : supergfxctl
Version         : 5.2.7-2
Description     : A utility for Linux graphics switching on Intel/AMD iGPU + nVidia dGPU laptops
Architecture    : x86_64
URL             : https://gitlab.com/asus-linux/supergfxctl
Licenses        : MPL-2.0
Groups          : None
Provides        : supergfxctl
Depends On      : gcc-libs  systemd  lsof
Optional Deps   : None
Required By     : asusctltray-git
Optional For    : asusctl
Conflicts With  : supergfxctl-git  optimus-manager
Replaces        : None
Installed Size  : 5.81 MiB
Packager        : Garuda Builder <team@garudalinux.org>
Build Date      : Wed 18 Jun 2025 11:46:28 AM EEST
Install Date    : Sun 24 May 2026 11:24:37 PM EEST
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : SHA-256 Sum  Signature

Name            : asusctltray-git
Version         : r28.c8ef0ba-1
Description     : Simple tray profile switcher for asusctl
Architecture    : any
URL             : https://github.com/Baldomo/asusctltray
Licenses        : MIT
Groups          : None
Provides        : asusctltray
Depends On      : python3  supergfxctl  asusctl  dbus  dbus-python
Optional Deps   : None
Required By     : None
Optional For    : asusctl
Conflicts With  : asusctltray
Replaces        : None
Installed Size  : 28.54 KiB
Packager        : Unknown Packager
Build Date      : Wed 08 Jan 2025 09:06:12 AM EET
Install Date    : Wed 08 Jan 2025 09:06:14 AM EET
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : None

Name            : rog-control-center
Version         : 6.3.7-1
Description     : App to control asusctl
Architecture    : x86_64
URL             : https://asus-linux.org
Licenses        : MPL-2.0
Groups          : None
Provides        : None
Depends On      : asusctl  fontconfig  freetype2  glibc  hicolor-icon-theme  libayatana-appindicator  libgcc  libinput  libxkbcommon  mesa  seatd  systemd-libs
Optional Deps   : None
Required By     : None
Optional For    : asusctl
Conflicts With  : None
Replaces        : None
Installed Size  : 20.39 MiB
Packager        : Garuda Builder <team@garudalinux.org>
Build Date      : Thu 16 Apr 2026 03:47:38 AM EEST
Install Date    : Thu 16 Apr 2026 08:07:16 PM EEST
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : SHA-256 Sum  Signature

Name            : asusctl
Version         : 6.3.7-1
Description     : A control daemon, CLI tools, and a collection of crates for interacting with ASUS ROG laptops
Architecture    : x86_64
URL             : https://asus-linux.org
Licenses        : MPL-2.0
Groups          : None
Provides        : None
Depends On      : glibc  libgcc  libusb  systemd  systemd-libs
Optional Deps   : acpi_call: fan control [installed]
                  asusctltray: tray profile switcher
                  rog-control-center: app to control asusctl
                  supergfxctl: hybrid GPU control
Required By     : None
Optional For    : None
Conflicts With  : gnome-shell-extension-asusctl-gnome
Replaces        : None
Installed Size  : 23.13 MiB
Packager        : Garuda Builder <team@garudalinux.org>
Build Date      : Thu 16 Apr 2026 03:47:38 AM EEST
Install Date    : Thu 16 Apr 2026 08:07:14 PM EEST
Install Reason  : Explicitly installed
Install Script  : Yes
Validated By    : SHA-256 Sum  Signature

V1del wrote:

The old microcode message could potentially be contributing [...]

Installed intel-ucode (I had to manually remove an old /boot/intel-ucode.img, maybe some leftover, to install it), and now, looking at the journal logs, it seems that the new microcode (0x00000133) is being loaded. The BIOS version (333) is the latest; I checked the Asus website for my laptop.
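To confirm which revision is active after boot, the running value can be read back directly. A small sketch, assuming an x86 /proc/cpuinfo with a microcode field; the helper name is hypothetical:

```shell
#!/bin/sh
# Print the microcode revision of the first CPU listed (hypothetical helper).
# The cpuinfo path can be overridden.
microcode_rev() {
    awk -F': *' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    microcode_rev
fi

# Kernel messages about microcode loading for the current boot:
#   journalctl -k -b | grep -i microcode
```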

seth wrote:

And the problem is completely gone w/ a subsequent reboot?

Nope, I actually tried it just now while testing another kernel and the changes I've mentioned above. I remember now, back in the day, trying different things and then coming back the next day, having forgotten to fix it the night before, and the system froze after logging in. It persists across reboots, the only way to fix it is to revert to a state it previously worked.

seth wrote:

You *have* tested only changing the nvidia driver, w/o altering the kernel (whether updating the driver or switching between nvidia-open and 580xx-dkms)?

I've tested all 3 packages (nvidia-open, 580xx-dkms, nvidia-open-dkms) with different kernels. 580xx-dkms is working for the current version of kernel (6.16.0-arch2-1). Scenarios tested:
- nvidia-open/nvidia-open-dkms tested with the current kernel that's working with 580xx-dkms
- All three with a newer kernel

So, even though 580xx-dkms is working with the current kernel, when I upgrade to 6.16.10-arch1-1 or 7.0.10-arch1-1 (this is what I've tested today), it freezes like before after login.

seth wrote:

There're no outputs attached to the nvidia GPU.

In that log, I used only my laptop screen (no external display connected); from what I know, the HDMI port is wired to the nvidia GPU. I have an external 4K TV (LG C3) that goes into screensaver mode while I'm trying things and rebooting, and it blinds me at full brightness, which is why I don't connect it when debugging. I've also tried with it connected and it freezes as well.

seth wrote:

You'd typically log into plasma/wayland?

That's right, that's the setup.  Together with SDDM.

seth wrote:

but there seems no attempt to start a session from SDDM?

I don't know how to read the log that well, but there should be one, because I entered my credentials and pressed Enter. Then it froze and I rebooted with SysRq reboot. Regarding TTY3 (with the wrong password attempt, oops), that was probably me checking the kernel version or something in another terminal (CTRL + ALT + F3). Also, I once tried starting Plasma from such a terminal via the startplasma-wayland command, and it froze as well.
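One way to check whether a session start shows up in a given boot is to filter the journal for the display manager and the compositor. A rough sketch: the pattern list (sddm, kwin_wayland, startplasma, "session opened") is an assumption about what a Plasma/Wayland login usually logs, and the helper name is made up:

```shell
#!/bin/sh
# Keep only journal lines that hint at a graphical session start.
# Reads the log on stdin; the pattern list is a guess, extend as needed.
session_lines() {
    grep -iE 'sddm|kwin_wayland|startplasma|session opened' || true
}

# Previous boot (the frozen one), if the journal is persistent:
#   journalctl -b -1 --no-pager | session_lines
```

If nothing from the greeter's login attempt appears here, the freeze most likely hit before SDDM handed over to the session.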

seth wrote:

Sanity check: How many parallel windows installations are there?

Just one. 2 SSDs: one (1 TB) for Windows, and one (2 TB) split between the whole Linux install and a partition of around 700 GB for Windows (for storing things).

➜  sudo fdisk -l
Disk /dev/nvme1n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: SHPP41-2000GM                           
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: C42031DD-51F5-42FF-B8B7-C472A5EFDF76

Device              Start        End    Sectors  Size Type
/dev/nvme1n1p1         34      32767      32734   16M Microsoft reserved
/dev/nvme1n1p2      32768 1484832767 1484800000  708G Microsoft basic data
/dev/nvme1n1p3 1484832768 1487978495    3145728  1.5G EFI System
/dev/nvme1n1p4 1487978496 1521532927   33554432   16G Linux swap
/dev/nvme1n1p5 1521532928 3907028991 2385496064  1.1T Linux filesystem


Disk /dev/nvme0n1: 953.87 GiB, 1024209543168 bytes, 2000409264 sectors
Disk model: WD PC SN560 SDDPNQE-1T00-1002           
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 65F8C023-0C77-4CE3-B2EF-05244D9C2AC6

Device              Start        End    Sectors   Size Type
/dev/nvme0n1p1       2048     206847     204800   100M EFI System
/dev/nvme0n1p2     206848     239615      32768    16M Microsoft reserved
/dev/nvme0n1p3     239616 1998432255 1998192640 952.8G Microsoft basic data
/dev/nvme0n1p4 1998432256 2000406527    1974272   964M Windows recovery environment


Disk /dev/mapper/root: 1.11 TiB, 1221357207552 bytes, 2385463296 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Here are two new journalctls from today after applying the changes:
- Upgrade to kernel  6.16.10-arch1-1 with the same 580xx-dkms nvidia driver, which resulted in the same system freeze and rebooted with SysRq: https://paste.c-net.org/BulldogTorches
- Successful login, after downgrading back to 6.16.0-arch2-1; this is the current boot on which I'm writing this post: https://paste.c-net.org/TelegramTwinge

Offline

#5 Today 22:35:48

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,718

Re: Freeze after login whenever upgrading nvidia/kernel

the only way to fix it is to revert to a state it previously worked.

So basically you've this problem w/ any kernel after 6.16.0 ?

Is "nvidia-modeset.hdmi_deepcolor=1" a non-working, stale effort to mitigate the problem?

Just one.

Was a trick question, please see the 3rd link below. Mandatory.
Disable it (it's NOT the BIOS setting!) and reboot windows and linux twice for voodoo reasons.

https://paste.c-net.org/BulldogTorches doesn't reboot w/ sysrq and doesn't have an attempt to login from SDDM either.
https://paste.c-net.org/TelegramTwinge shows you logging in into plasma/wayland - for some reason you're also starting picom.

Jun 04 00:18:56 Mithras intel-undervolt[1615]: CPU (0): -139.65 mV
Jun 04 00:18:56 Mithras intel-undervolt[1615]: GPU (1): -49.80 mV
Jun 04 00:18:56 Mithras intel-undervolt[1615]: CPU Cache (2): -125.00 mV
Jun 04 00:18:56 Mithras intel-undervolt[1615]: System Agent (3): -49.80 mV
Jun 04 00:18:56 Mithras intel-undervolt[1615]: Analog I/O (4): -0.00 mV

Disable that, too.
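To see at a glance which offsets are still applied, the values can be filtered for anything that is not zero. A minimal sketch that reads lines in the format shown above on stdin. The helper name is hypothetical, and the intel-undervolt.service unit name and the "intel-undervolt read" subcommand are assumptions about the intel-undervolt package:

```shell
#!/bin/sh
# Print only the planes whose voltage offset is not zero.
# Expects lines ending in "<value> mV", e.g. "CPU (0): -139.65 mV".
nonzero_offsets() {
    awk '$NF == "mV" && ($(NF-1) + 0) != 0 { print }'
}

# Possible usage on the live system (commands are assumptions):
#   sudo intel-undervolt read | nonzero_offsets
#   sudo systemctl disable --now intel-undervolt.service
```

Offsets that were already applied probably stay in effect until the next reboot, so reboot before retesting the newer kernel.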

Can you log into plasma/X11 w/ kernels > 6.16.0?
Fwiw, I don't think your nvidia GPU is relevant here at all.

Offline
