Samsung Pro Plus SD card "mmc0: running CQE recovery" errors #6561

Open
ongardie opened this issue Dec 21, 2024 · 3 comments

@ongardie

Describe the bug

I have a fairly new SAMSUNG PRO Plus 256 GB microSD card. According to the date the card reports in sysfs, it was manufactured in June 2024. The model code on the package is MB-MD256SA/AM. Its ratings include UHS-I, C10, U3, V30, and A2. Details are below.

This is in a new Raspberry Pi 5 (8 GB).

I noticed that apt was slow to unpack packages, checked dmesg, and found many errors like:

mmc0: running CQE recovery

After some searching, I came across this forum post: "Testing class A2 SD Cards with Command Queueing on Pi 5" https://forums.raspberrypi.com/viewtopic.php?t=367459
Based on that, I set dtparam=sd_cqe=off in config.txt, which appears to work around the problem.

Commit c9b61d1 disables CQE for similar cards that were made in 2023. My card, from June 2024, appears to need the same quirk.

Let me know if you'd like any additional tests done with this card (even potentially destructive ones).

Steps to reproduce the behaviour

Install Raspberry Pi OS (64-bit) and upgrade the kernel. Run dmesg -w.
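
For anyone who wants a number rather than a scrolling `dmesg -w`, here is a minimal Python sketch that summarizes a captured dmesg log. It assumes the message formats seen in the log excerpt in this report ("Command Queue Engine enabled", "running CQE recovery", and the rate-limit notice "callbacks suppressed"); the function name `summarize_cqe` is hypothetical and not part of any tool.

```python
import re

# "[    3.008563] mmc0: running CQE recovery"
RECOVERY_RE = re.compile(r"\b(mmc\d+): running CQE recovery")
# "[    9.038932] mmc_cqe_recovery: 3 callbacks suppressed" (rate-limited messages)
SUPPRESSED_RE = re.compile(r"mmc_cqe_recovery: (\d+) callbacks suppressed")
# "[    2.686421] mmc0: Command Queue Engine enabled, 31 tags"
ENABLED_RE = re.compile(r"\b(mmc\d+): Command Queue Engine enabled")


def summarize_cqe(dmesg_text):
    """Return (hosts with CQE enabled, logged recoveries, suppressed messages)."""
    enabled = set()
    recoveries = 0
    suppressed = 0
    for line in dmesg_text.splitlines():
        match = ENABLED_RE.search(line)
        if match:
            enabled.add(match.group(1))
        if RECOVERY_RE.search(line):
            recoveries += 1
        match = SUPPRESSED_RE.search(line)
        if match:
            suppressed += int(match.group(1))
    return sorted(enabled), recoveries, suppressed


# A few lines in the format shown in this report, used as sample input.
SAMPLE = """\
[    2.686421] mmc0: Command Queue Engine enabled, 31 tags
[    3.008563] mmc0: running CQE recovery
[    9.038932] mmc_cqe_recovery: 3 callbacks suppressed
[    9.038935] mmc0: running CQE recovery
"""
print(summarize_cqe(SAMPLE))
```

On a real system the input would be the saved output of `dmesg`. If the workaround is in effect, I would expect the list of CQE-enabled hosts to come back empty for mmc0, but that is an assumption about what the kernel logs, not something verified here.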

Device(s)

Raspberry Pi 5

System

$ cat /etc/rpi-issue
Raspberry Pi reference 2024-11-19
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 891df1e21ed2b6099a2e6a13e26c91dea44b34d4, stage2
$ vcgencmd version
2024/11/12 16:10:44 
Copyright (c) 2012 Broadcom
version 4b019946 (release) (embedded)
$ uname -a
Linux ddr 6.6.62+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.62-1+rpt1 (2024-11-25) aarch64 GNU/Linux

Logs

$ sudo cat /sys/kernel/debug/mmc0/err_stats
# Command Timeout Occurred:	 4420
# Command CRC Errors Occurred:	 0
# Data Timeout Occurred:	 0
# Data CRC Errors Occurred:	 0
# Auto-Cmd Error Occurred:	 0
# ADMA Error Occurred:	 0
# Tuning Error Occurred:	 0
# CMDQ RED Errors:	 0
# CMDQ GCE Errors:	 0
# CMDQ ICCE Errors:	 0
# Request Timedout:	 0
# CMDQ Request Timedout:	 0
# ICE Config Errors:	 0
# Controller Timedout errors:	 0
# Unexpected IRQ errors:	 0
$ find /sys/bus/mmc/devices/mmc0\:*/ -type f -not -path "*block*" -print -exec cat {} \;
/sys/bus/mmc/devices/mmc0:0001/fwrev
0x0
/sys/bus/mmc/devices/mmc0:0001/uevent
DRIVER=mmcblk
MMC_TYPE=SD
MMC_NAME=FE4S9
MODALIAS=mmc:block
/sys/bus/mmc/devices/mmc0:0001/cid
1b534d464534533930da0c56a9a18600
/sys/bus/mmc/devices/mmc0:0001/rca
0x0001
/sys/bus/mmc/devices/mmc0:0001/ext_power
07
/sys/bus/mmc/devices/mmc0:0001/csd
400e0032db79000775ff7f800a400000
/sys/bus/mmc/devices/mmc0:0001/manfid
0x00001b
/sys/bus/mmc/devices/mmc0:0001/power/runtime_active_time
909457
/sys/bus/mmc/devices/mmc0:0001/power/runtime_status
active
/sys/bus/mmc/devices/mmc0:0001/power/autosuspend_delay_ms
3000
/sys/bus/mmc/devices/mmc0:0001/power/runtime_suspended_time
0
/sys/bus/mmc/devices/mmc0:0001/power/control
auto
/sys/bus/mmc/devices/mmc0:0001/ocr
0x00300000
/sys/bus/mmc/devices/mmc0:0001/preferred_erase_size
4194304
/sys/bus/mmc/devices/mmc0:0001/ext_perf
1f
/sys/bus/mmc/devices/mmc0:0001/type
SD
/sys/bus/mmc/devices/mmc0:0001/hwrev
0x3
/sys/bus/mmc/devices/mmc0:0001/date
06/2024
/sys/bus/mmc/devices/mmc0:0001/dsr
0x404
/sys/bus/mmc/devices/mmc0:0001/erase_size
512
/sys/bus/mmc/devices/mmc0:0001/removable
removable
/sys/bus/mmc/devices/mmc0:0001/oemid
0x534d
/sys/bus/mmc/devices/mmc0:0001/serial
0xda0c56a9
/sys/bus/mmc/devices/mmc0:0001/ssr
0000000008000000040090000811391e000800000002ff0003000000000000000000000000000000000000000000000000000000000000000000000000000000
/sys/bus/mmc/devices/mmc0:0001/scr
0205848700000000
/sys/bus/mmc/devices/mmc0:0001/name
FE4S9
$ dmesg | grep mmc
[    2.455057] sdhci-brcmstb 1000fff000.mmc: Got CD GPIO
[    2.460567] mmc1: CQHCI version 5.10
[    2.460992] mmc0: CQHCI version 5.10
[    2.509689] mmc0: SDHCI controller on 1000fff000.mmc [1000fff000.mmc] using ADMA 64-bit
[    2.656129] mmc1: SDHCI controller on 1001100000.mmc [1001100000.mmc] using ADMA 64-bit
[    2.686421] mmc0: Command Queue Engine enabled, 31 tags
[    2.686954] mmc0: new ultra high speed SDR104 SDXC card at address 0001
[    2.703060] mmcblk0: mmc0:0001 FE4S9 239 GiB
[    2.703972]  mmcblk0: p1 p2
[    2.709150] mmc1: new ultra high speed DDR50 SDIO card at address 0001
[    2.717856] mmcblk0: mmc0:0001 FE4S9 239 GiB
[    3.008563] mmc0: running CQE recovery
[    3.341653] mmc0: running CQE recovery
[    3.638679] mmc0: running CQE recovery
[    3.939400] EXT4-fs (mmcblk0p2): mounted filesystem ce208fd3-38a8-424a-87a2-cd44114eb820 ro with ordered data mode. Quota mode: none.
[    4.333574] mmc0: running CQE recovery
[    5.173493] EXT4-fs (mmcblk0p2): re-mounted ce208fd3-38a8-424a-87a2-cd44114eb820 r/w. Quota mode: none.
[    5.220760] mmc0: running CQE recovery
[    5.547683] mmc0: running CQE recovery
[    5.839327] mmc0: running CQE recovery
[    6.131881] mmc0: running CQE recovery
[    6.425648] mmc0: running CQE recovery
[    6.728023] mmc0: running CQE recovery
[    9.038932] mmc_cqe_recovery: 3 callbacks suppressed
[    9.038935] mmc0: running CQE recovery
[    9.341462] mmc0: running CQE recovery
[   10.191622] mmc0: running CQE recovery
[   10.486555] mmc0: running CQE recovery
[   10.784577] mmc0: running CQE recovery

The CQE recoveries continue.
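
The `cid` value in the sysfs dump above packs the same fields that sysfs also exposes separately (manfid, oemid, name, hwrev, fwrev, serial, date). As a cross-check, here is a minimal Python sketch that unpacks it, assuming the standard SD card CID layout (this does not apply to eMMC, which uses a different layout). The function name `decode_sd_cid` is hypothetical.

```python
def decode_sd_cid(cid_hex):
    """Decode a 128-bit SD card CID given as 32 hex digits (sysfs `cid` file)."""
    raw = bytes.fromhex(cid_hex.strip())
    if len(raw) != 16:
        raise ValueError("expected 32 hex digits")
    # Bytes 13-14 hold 4 reserved bits followed by the 12-bit manufacturing date.
    mdt = int.from_bytes(raw[13:15], "big") & 0x0FFF
    return {
        "manfid": raw[0],                                   # manufacturer ID
        "oemid": int.from_bytes(raw[1:3], "big"),           # OEM/application ID
        "name": raw[3:8].decode("ascii", errors="replace"), # product name
        "hwrev": raw[8] >> 4,                               # product revision, high nibble
        "fwrev": raw[8] & 0x0F,                             # product revision, low nibble
        "serial": int.from_bytes(raw[9:13], "big"),         # product serial number
        "year": 2000 + (mdt >> 4),                          # 8-bit year offset from 2000
        "month": mdt & 0x0F,                                # 4-bit month
        # raw[15] carries the CRC field and is ignored here.
    }


# CID copied from the sysfs dump in this report.
info = decode_sd_cid("1b534d464534533930da0c56a9a18600")
print(info["name"], hex(info["manfid"]), hex(info["oemid"]),
      f'{info["month"]:02d}/{info["year"]}')
```

For this card the decoded fields agree with the individual sysfs files: name FE4S9, manfid 0x1b, oemid 0x534d, hwrev 3, fwrev 0, serial 0xda0c56a9, date 06/2024.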

Additional context

No response

@FibreFoX

I had similar problems; the Command Queueing Engine seems to be broken with these cards. The default configuration was changed some time ago: https://github.com/raspberrypi/linux/blame/811ff707533bcd67cdcd368bbd46223082009b12/arch/arm/boot/dts/overlays/README#L408
That change was made in commit 48a15bc.

Please add this to your /boot/firmware/config.txt:

dtparam=sd_cqe=off
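
To confirm the setting actually landed in the file, something like the following Python sketch could be run against the contents of /boot/firmware/config.txt. It is only a rough check under stated assumptions: it looks for an uncommented `dtparam=` line (the key used in the original report) that includes `sd_cqe=off`, accepts comma-separated parameter lists, and ignores conditional section filters and any later line that might override the setting. The function name `sd_cqe_disabled` is hypothetical.

```python
def sd_cqe_disabled(config_text):
    """Return True if an uncommented dtparam line includes sd_cqe=off."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Only the exact key "dtparam" is matched; a misspelled key such as
        # "dtparams=" is deliberately not accepted.
        if not line.startswith("dtparam="):
            continue
        params = line[len("dtparam="):].split(",")
        if any(param.strip() == "sd_cqe=off" for param in params):
            return True
    return False


# Sample config text for illustration only.
print(sd_cqe_disabled("# storage workaround\ndtparam=sd_cqe=off\n"))
```

In practice the text would come from reading the file, for example `open("/boot/firmware/config.txt").read()`, and a reboot is still needed before the setting takes effect.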

@FibreFoX

Just a small update: with the new 6.13.y kernel this is no longer an issue. I suspect the latest changes to CQE detection solved it. The change in config.txt is no longer needed for me. (I haven't tested 6.12.y yet.)

@popcornmix
Collaborator

See: #6591

6.6, 6.12, 6.13 should now have CQE disabled by default, except for whitelisted cards.
