Wrong GPU info after upgrading from 0.12.0 to 0.16.0 #411

Open
atosahatos opened this issue Mar 31, 2025 · 2 comments

Comments

@atosahatos

Hello,

I have the following code to get GPU info and print all detected cards:

func detectGPU() error {
	gpu, err := ghw.GPU()
	if err != nil {
		fmt.Printf("Error getting GPU info: %s\n", err)
		return err
	}
	// Print every graphics card that ghw detected.
	for _, card := range gpu.GraphicsCards {
		fmt.Println("Card:", card)
	}
	return nil
}

On the same hardware (containing 2 NVIDIA cards), different versions of GHW produce the following outputs.

0.13.0:
Card: card #0 [affined to NUMA node 0]@0000:01:00.1 -> driver: 'mgag200' class: 'Display controller' vendor: 'Matrox Electronics Systems Ltd.' product: 'MGA G200eH3'
Card: card #1 [affined to NUMA node 0]@0000:26:00.0 -> driver: 'nvidia' class: 'Display controller' vendor: 'NVIDIA Corporation' product: 'GA102GL [RTX A5000]'
Card: card #2 [affined to NUMA node 1]@0000:8a:00.0 -> driver: 'nvidia' class: 'Display controller' vendor: 'NVIDIA Corporation' product: 'GA102GL [RTX A5000]'

0.14.0:
Card: card #0 [affined to NUMA node 0]@0000:01:00.1 -> driver: 'mgag200' class: 'Display controller' vendor: 'Matrox Electronics Systems Ltd.' product: 'MGA G200eH3'
Card: card #1 [affined to NUMA node 0]@0000:25:01.0 -> driver: 'pcieport' class: 'Bridge' vendor: 'Intel Corporation' product: 'unknown'
Card: card #2 [affined to NUMA node 1]@0000:89:01.0 -> driver: 'pcieport' class: 'Bridge' vendor: 'Intel Corporation' product: 'unknown'

0.15.0 and 0.16.0:
Card: card #0 [affined to NUMA node 0]@0000:01:00.1 -> driver: 'mgag200' class: 'Display controller' vendor: 'Matrox Electronics Systems Ltd.' product: 'MGA G200eH3'
Card: card #1 [affined to NUMA node 0]@0000:26:00.0 -> driver: 'nvidia' class: 'Display controller' vendor: 'NVIDIA Corporation' product: 'GA102GL [RTX A5000]'
Card: card #2 [affined to NUMA node 1]@0000:89:01.0 -> driver: 'pcieport' class: 'Bridge' vendor: 'Intel Corporation' product: 'unknown'

It seems 0.13.0 returned the correct info.
In 0.14.0 the returned information is wrong for both cards.
In 0.15.0 and 0.16.0 GHW gives a mixed result for the two cards.

Thank you for any help / fix on this problem.

@jaypipes
Owner

jaypipes commented Apr 2, 2025

Thank you for the bug report @atosahatos! I will look into this as soon as possible (currently out in London for KubeConEU so might be a bit before I can get to this)

Please confirm that you are on a Linux system and, if you wouldn't mind, please paste the output of the following:

lsb_release -a
uname -r
ls -al /sys/class/drm/card[0-2]

@atosahatos
Author

atosahatos commented Apr 3, 2025

Yes, this is SLES 15.6:

lsb_release -a
LSB Version: n/a
Distributor ID: SUSE
Description: SUSE Linux Enterprise Server 15 SP6
Release: 15.6
Codename: n/a

uname -r
6.4.0-150600.23.38-default

ls -al /sys/class/drm/card[0-2]
lrwxrwxrwx 1 root root 0 Mar 26 12:22 /sys/class/drm/card0 -> ../../devices/pci0000:00/0000:00:0d.0/0000:01:00.1/drm/card0
lrwxrwxrwx 1 root root 0 Mar 26 12:22 /sys/class/drm/card1 -> ../../devices/pci0000:25/0000:25:01.0/0000:26:00.0/drm/card1
lrwxrwxrwx 1 root root 0 Mar 26 12:22 /sys/class/drm/card2 -> ../../devices/pci0000:89/0000:89:01.0/0000:8a:00.0/drm/card2
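
One thing visible in the listing above: the addresses reported incorrectly for card1 and card2 (0000:25:01.0 and 0000:89:01.0) are the parent path elements in the symlink targets, i.e. the PCIe bridges above the GPUs, while the expected addresses (0000:26:00.0 and 0000:8a:00.0) are the elements directly before `drm`. The following is only a sketch of how the expected address could be picked out of a symlink target, not ghw's actual implementation; `pciAddressFromDRMLink` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// pciAddressFromDRMLink returns the path element that sits immediately
// before the "drm" element of a /sys/class/drm/cardN symlink target.
// Hypothetical helper for illustration; it is not part of ghw.
func pciAddressFromDRMLink(target string) (string, bool) {
	parts := strings.Split(target, "/")
	for i, p := range parts {
		if p == "drm" && i > 0 {
			return parts[i-1], true
		}
	}
	return "", false
}

func main() {
	// Symlink targets copied from the ls output above.
	targets := []string{
		"../../devices/pci0000:00/0000:00:0d.0/0000:01:00.1/drm/card0",
		"../../devices/pci0000:25/0000:25:01.0/0000:26:00.0/drm/card1",
		"../../devices/pci0000:89/0000:89:01.0/0000:8a:00.0/drm/card2",
	}
	for _, t := range targets {
		addr, ok := pciAddressFromDRMLink(t)
		fmt.Println(addr, ok)
	}
}
```

On a live system the targets would come from something like `os.Readlink("/sys/class/drm/card1")` instead of the hard-coded strings.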
