Test Nvidia GeForce RTX 4090 #504

Open
geerlingguy opened this issue Jan 11, 2023 · 6 comments

Comments

@geerlingguy
Owner

geerlingguy commented Jan 11, 2023

I bought an Nvidia RTX 4090, specifically the Gigabyte GeForce RTX 4090 OC, and I'd like to see what happens with it on a Pi:

[Image: gpu-nvidia-rtx-4090-oc]

@geerlingguy
Owner Author

$ sudo lspci -vvvv
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2684 (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Gigabyte Technology Co., Ltd Device 40bf
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at 618000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Region 1: Memory at 600000000 (64-bit, prefetchable) [disabled] [size=256M]
	Region 3: Memory at 610000000 (64-bit, prefetchable) [disabled] [size=32M]
	Region 5: I/O ports at <unassigned> [disabled]
	Expansion ROM at 619000000 [virtual] [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s (downgraded), Width x1 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [250 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [258 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [128 v1] Power Budgeting <?>
	Capabilities: [420 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [bb0 v1] Physical Resizable BAR
		BAR 0: current size: 16MB, supported: 16MB
		BAR 1: current size: 256MB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB 16GB 32GB
		BAR 3: current size: 32MB, supported: 32MB
	Capabilities: [c1c v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [d00 v1] Lane Margining at the Receiver <?>
	Capabilities: [e00 v1] Data Link Feature <?>
	Kernel modules: nouveau

01:00.1 Audio device: NVIDIA Corporation Device 22ba (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd Device 40bf
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin B routed to IRQ 0
	Region 0: Memory at 619080000 (32-bit, non-prefetchable) [disabled] [size=16K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s (downgraded), Width x1 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [160 v1] Data Link Feature <?>
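
The detail that stands out in the dump above is the gap between what the card advertises (LnkCap: 16GT/s, x16) and what was actually negotiated (LnkSta: 5GT/s, x1). Here is a minimal sketch for pulling just that comparison out of `lspci -vv` text. The function name `link_summary` is hypothetical, and the parsing assumes the field labels and the short `bus:dev.fn` device prefix exactly as they appear in the output above (systems that print a PCI domain prefix would need a different header pattern).

```shell
#!/bin/sh
# link_summary: read `lspci -vv` text on stdin and print, per device,
# the advertised link (LnkCap) next to the negotiated link (LnkSta).
# Hypothetical helper; it relies only on the field labels seen above.
link_summary() {
  awk '
    # Device header lines start in column 0 with bus:dev.fn, e.g. "01:00.0 ".
    /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { dev = $1 }

    # "LnkCap:" and "LnkSta:" do not match the LnkCap2/LnkSta2 lines,
    # because those have a digit before the colon.
    /LnkCap:/ { cap = speed_width($0) }
    /LnkSta:/ { print dev ": capable " cap ", negotiated " speed_width($0) }

    # Extract the tokens following "Speed " and "Width " on a link line.
    function speed_width(line,   s, w) {
      s = line; sub(/.*Speed /, "", s); sub(/[ ,].*/, "", s)
      w = line; sub(/.*Width /, "", w); sub(/[ ,].*/, "", w)
      return s " " w
    }
  '
}
```

Usage would be along the lines of `sudo lspci -vv | link_summary`. The x1 width is expected on a board that only routes a single PCIe lane, so the summary mostly serves as a quick check that the link trained at all.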

@geerlingguy
Owner Author

Ran the following to upgrade the system and prepare for Nvidia's driver install:

sudo apt update
sudo apt -y dist-upgrade
sudo apt install -y raspberrypi-kernel-headers
sudo reboot
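
Nvidia's `.run` installer builds a kernel module, so the headers installed above need to match the kernel that is running after the reboot. A minimal sketch of a pre-flight check follows. It assumes the headers are exposed through the conventional `/lib/modules/<release>/build` path, which is an assumption about the packaging rather than something shown in this thread, and `headers_present` is a hypothetical helper name.

```shell
#!/bin/sh
# headers_present RELEASE [ROOT]
# Succeeds if a kernel build directory exists for RELEASE.
# ROOT defaults to "" (the real filesystem); it exists only so the check
# can be exercised against a scratch directory.
# Assumption: headers live under /lib/modules/<release>/build.
headers_present() {
  release=$1
  root=${2:-}
  [ -d "$root/lib/modules/$release/build" ]
}

# Example: check the running kernel before launching the installer.
if headers_present "$(uname -r)"; then
  echo "headers found for $(uname -r)"
else
  echo "no headers for $(uname -r); the module build will likely fail"
fi
```

If the check fails right after a `dist-upgrade`, the usual suspect is a kernel that was upgraded without the matching headers package, or a reboot that has not happened yet.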

Then I downloaded the latest ARM64 driver, which in my case was 525.105.17, and copied it over to the Downloads folder on the Pi.

Then I installed it:

chmod +x NVIDIA-Linux-aarch64-525.105.17.run
sudo ./NVIDIA-Linux-aarch64-525.105.17.run

After a reboot, I ran startx and got a nice smattering of errors, practically identical to what I saw with the RTX 8000:

pi@pi4gpu:~ $ cat /home/pi/.local/share/xorg/Xorg.0.log
[   130.693] 
X.Org X Server 1.20.11
X Protocol Version 11, Revision 0
[   130.693] Build Operating System: linux Debian
[   130.693] Current Operating System: Linux pi4gpu 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr  3 17:24:16 BST 2023 aarch64
[   130.693] Kernel command line: coherent_pool=1M 8250.nr_uarts=0 snd_bcm2835.enable_headphones=0 snd_bcm2835.enable_hdmi=1 snd_bcm2835.enable_hdmi=0  smsc95xx.macaddr=E4:5F:01:4E:F0:36 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  console=ttyS0,115200 console=tty1 root=PARTUUID=4702f6c0-02 rootfstype=ext4 fsck.repair=yes rootwait quiet splash plymouth.ignore-serial-consoles
[   130.694] Build Date: 30 March 2023  01:46:14PM
[   130.694] xorg-server 2:1.20.11-1+rpt3+deb11u6 (https://www.debian.org/support) 
[   130.694] Current version of pixman: 0.40.0
[   130.694] 	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
[   130.694] Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   130.695] (==) Log file: "/home/pi/.local/share/xorg/Xorg.0.log", Time: Fri Apr  7 09:21:43 2023
[   130.695] (==) Using config file: "/etc/X11/xorg.conf"
[   130.695] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   130.696] (==) ServerLayout "Layout0"
[   130.696] (**) |-->Screen "Screen0" (0)
[   130.696] (**) |   |-->Monitor "Monitor0"
[   130.697] (**) |   |-->Device "Device0"
[   130.697] (**) |-->Input Device "Keyboard0"
[   130.697] (**) |-->Input Device "Mouse0"
[   130.697] (**) Option "Debug" "dmabuf_capable"
[   130.697] (==) Automatically adding devices
[   130.697] (==) Automatically enabling devices
[   130.697] (==) Automatically adding GPU devices
[   130.698] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   130.698] (WW) The directory "/usr/share/fonts/X11/misc" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (WW) The directory "/usr/share/fonts/X11/Type1" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (WW) The directory "/usr/share/fonts/X11/100dpi" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (WW) The directory "/usr/share/fonts/X11/75dpi" does not exist.
[   130.698] 	Entry deleted from font path.
[   130.698] (==) FontPath set to:
	built-ins
[   130.698] (==) ModulePath set to "/usr/lib/xorg/modules"
[   130.698] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[   130.698] (WW) Disabling Keyboard0
[   130.698] (WW) Disabling Mouse0
[   130.698] (II) Loader magic: 0x555bb6de28
[   130.698] (II) Module ABI versions:
[   130.698] 	X.Org ANSI C Emulation: 0.4
[   130.698] 	X.Org Video Driver: 24.1
[   130.698] 	X.Org XInput driver : 24.1
[   130.699] 	X.Org Server Extension : 10.0
[   130.701] (++) using VT number 1

[   130.706] (II) systemd-logind: took control of session /org/freedesktop/login1/session/_31
[   130.710] (II) xfree86: Adding drm device (/dev/dri/card1)
[   130.713] (II) systemd-logind: got fd for /dev/dri/card1 226:1 fd 11 paused 0
[   130.715] (II) xfree86: Adding drm device (/dev/dri/card0)
[   130.718] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 12 paused 0
[   130.721] (--) PCI:*(1@0:0:0) 10de:2684:1458:40bf rev 161, Mem @ 0x618000000/16777216, 0x600000000/268435456, 0x610000000/33554432, BIOS @ 0x????????/524288
[   130.721] (II) LoadModule: "glx"
[   130.722] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[   130.726] (II) Module glx: vendor="X.Org Foundation"
[   130.726] 	compiled for 1.20.11, module version = 1.0.0
[   130.726] 	ABI class: X.Org Server Extension, version 10.0
[   130.726] (II) LoadModule: "nvidia"
[   130.727] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[   130.728] (II) Module nvidia: vendor="NVIDIA Corporation"
[   130.728] 	compiled for 1.6.99.901, module version = 1.0.0
[   130.728] 	Module class: X.Org Video Driver
[   130.728] (II) NVIDIA dlloader X Driver  525.105.17  Tue Mar 28 17:51:36 UTC 2023
[   130.728] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[   130.729] (II) Loading sub module "fb"
[   130.729] (II) LoadModule: "fb"
[   130.729] (II) Loading /usr/lib/xorg/modules/libfb.so
[   130.730] (II) Module fb: vendor="X.Org Foundation"
[   130.730] 	compiled for 1.20.11, module version = 1.0.0
[   130.730] 	ABI class: X.Org ANSI C Emulation, version 0.4
[   130.730] (II) Loading sub module "wfb"
[   130.730] (II) LoadModule: "wfb"
[   130.730] (II) Loading /usr/lib/xorg/modules/libwfb.so
[   130.731] (II) Module wfb: vendor="X.Org Foundation"
[   130.731] 	compiled for 1.20.11, module version = 1.0.0
[   130.731] 	ABI class: X.Org ANSI C Emulation, version 0.4
[   130.731] (II) Loading sub module "ramdac"
[   130.731] (II) LoadModule: "ramdac"
[   130.731] (II) Module "ramdac" already built-in
[   130.732] (II) systemd-logind: releasing fd for 226:1
[   130.734] (II) systemd-logind: releasing fd for 226:0
[   130.737] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[   130.737] (**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
[   130.737] (==) NVIDIA(0): RGB weight 888
[   130.737] (==) NVIDIA(0): Default visual is TrueColor
[   130.737] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[   130.737] (EE) 
[   130.737] (EE) Backtrace:
[   130.739] (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x188) [0x555bacc538]
[   130.740] (EE) unw_get_proc_info failed: no unwind info found [-10]
[   130.740] (EE) 
[   130.740] (EE) Segmentation fault at address 0x124
[   130.740] (EE) 
Fatal server error:
[   130.740] (EE) Caught signal 11 (Segmentation fault). Server aborting
[   130.740] (EE) 
[   130.740] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[   130.741] (EE) Please also check the log file at "/home/pi/.local/share/xorg/Xorg.0.log" for additional information.
[   130.741] (EE) 
[   130.744] (EE) Server terminated with error (1). Closing log file.
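
Most of the log above is routine module loading; per the marker legend near the top of the log, only the `(EE)` and `(WW)` lines indicate problems. A small sketch for filtering a log down to those lines is below. The helper name `xorg_problems` is hypothetical, and it uses only standard `grep` options.

```shell
#!/bin/sh
# xorg_problems LOGFILE [MARKERS]
# Print only the lines carrying the given Xorg markers (default: EE, i.e.
# errors only), each prefixed with its line number in the log.
# MARKERS is a |-separated list such as "EE|WW".
xorg_problems() {
  log=$1
  markers=${2:-EE}
  # Matches the literal "(EE)" / "(WW)" tags; "NVIDIA(0)" does not match.
  grep -nE "\\(($markers)\\)" "$log"
}
```

For the log in this comment, that would be something like `xorg_problems ~/.local/share/xorg/Xorg.0.log 'EE|WW'`. Lines without a marker, such as the bare `Fatal server error:` line, are not matched, so the full log is still worth reading around the first `(EE)` hit.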

@geerlingguy
Owner Author

That's also the same issue I ran into with the M2_VGA: #62 (comment)

@geerlingguy
Owner Author

geerlingguy changed the title from "Test Nvidia RTX 4090 with CM4" to "Test Nvidia RTX 4090" on Nov 16, 2023
geerlingguy changed the title from "Test Nvidia RTX 4090" to "Test Nvidia GeForce RTX 4090" on Nov 16, 2023
geerlingguy reopened this on Nov 16, 2023
@geerlingguy
Owner Author

Well now... I have a Raspberry Pi 5 and a PCIe slot. How does that fare?

@HeyMeco

HeyMeco commented Nov 19, 2023

I think it's interesting that an RK3568 paired with a 3090, by comparison, also gets Region 1 correct, and Region 5 with I/O as well: https://github.com/HeyMeco/Rockchip-pcie-devices/blob/main/RK3568/_cards_gpu/Nvidia_RTX-3090-Founders-Edition.md
The struggle on the Rockchip seems to be with Regions 1 and 3, which work on the RPi. I do feel like we're very close to GPU support on SBCs, though.
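
One way to make that kind of board-to-board comparison easier is to reduce each `lspci -vv` dump to its Region lines and diff the results. The following is a rough sketch under that assumption; `bar_regions` is a hypothetical helper, and it parses only the `Region N:` line format seen in the RTX 4090 dump earlier in this thread.

```shell
#!/bin/sh
# bar_regions: read `lspci -vv` text on stdin and print one line per BAR:
#   <device> region<N> <assigned|unassigned> <size or "-">
# "assigned" only means the line carries an address instead of
# <unassigned>; the separate [disabled] flag is not inspected here.
bar_regions() {
  awk '
    # Device header lines start in column 0 with bus:dev.fn.
    /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { dev = $1 }

    /^[ \t]+Region [0-9]+:/ {
      n = $2; sub(/:/, "", n)
      state = ($0 ~ /<unassigned>/) ? "unassigned" : "assigned"
      size = "-"
      # Pull the value out of a trailing "[size=...]" tag when present.
      if (match($0, /\[size=[0-9A-Za-z]+\]/))
        size = substr($0, RSTART + 6, RLENGTH - 7)
      print dev, "region" n, state, size
    }
  '
}
```

Saving the output from each board, for example `sudo lspci -vv | bar_regions > pi.txt`, and running `diff` on the two files would show at a glance which regions one platform maps and the other does not.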
