Question: 2 problems: HP 17-ca1004ng, Linux detecting only one DIMM, boot hangs

May 5, 2021
27
1
35
Hello

I've upgraded my HP 17-ca1004ng from 1x4GB + 1x8GB DIMMs to 1x8GB + 1x16GB DIMMs (so basically the 4GB module was swapped for a 16GB one). The BIOS shows 24GB of RAM, but after Linux boots it shows only 16GB (minus 2GB for the GPU, so only 14GB usable).

I wonder what the problem is. Both DIMMs were detected in the 4+8 configuration (10GB was available then, because 2GB is always taken for the GPU).

Now the e820 map in dmesg shows only around 16G, but decode-dimms shows both DIMMs... weird!

[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000009ecffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000009ed0000-0x0000000009ffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000000a000000-0x000000000a1fffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000a200000-0x000000000a20afff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000000a20b000-0x00000000da3abfff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000da3ac000-0x00000000da4d3fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000da4d4000-0x00000000da514fff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x00000000da515000-0x00000000daaaafff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000daaab000-0x00000000ddb14fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ddb15000-0x00000000deffffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000df000000-0x00000000dfffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fd000000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000039effffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000039f000000-0x000000041effffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000041f000000-0x000000041f33ffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000041f340000-0x000000041fffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] e820: update [mem 0x9d596018-0x9d5a4057] usable ==> usable
[ 0.000000] e820: update [mem 0x9d596018-0x9d5a4057] usable ==> usable
[ 0.000000] e820: update [mem 0x9d588018-0x9d595457] usable ==> usable
[ 0.000000] e820: update [mem 0x9d588018-0x9d595457] usable ==> usable
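To double-check the "around 16G" reading, the ranges can be added up mechanically. Below is a minimal Python sketch that sums the BIOS-e820 entries by type; the function name `sum_e820` and the embedded sample (only the four entries above 4 GiB from the dump above) are my own choices for illustration, and the full dmesg text can be passed in the same way.

```python
import re

# Matches lines such as:
# [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000039effffff] usable
E820_LINE = re.compile(
    r"BIOS-e820: \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\] (.+)$"
)

def sum_e820(dmesg_text):
    """Return a dict mapping each e820 region type to its total size in bytes."""
    totals = {}
    for line in dmesg_text.splitlines():
        match = E820_LINE.search(line.strip())
        if match is None:
            continue  # not a BIOS-e820 line
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        kind = match.group(3).strip()
        # The end address is inclusive, hence the +1.
        totals[kind] = totals.get(kind, 0) + (end - start + 1)
    return totals

# Sample: the entries above 4 GiB, copied from the dump in this post.
SAMPLE = """\
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000039effffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000039f000000-0x000000041effffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000041f000000-0x000000041f33ffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000041f340000-0x000000041fffffff] reserved
"""

for kind, size in sorted(sum_e820(SAMPLE).items()):
    print(f"{kind:10s} {size / 2**30:6.2f} GiB")
```

Two things stand out when doing this by hand as well: the reserved block 0x39f000000-0x41effffff is exactly 2 GiB, which matches the GPU carve-out, and the last entry ends at 0x41fffffff, so nothing in the map goes anywhere near 24G.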

Booting with add_efi_memmap has no effect, and neither does passing e.g. mem=20G or mem=24G. The extra memory is still not detected.

I've finally updated the BIOS to the latest version and the laptop got even worse: the memory is still not detected, and it now boots only ONE time. For example:

It boots, Linux comes up, and the memory is not detected. I reboot, Grub shows up, and on this second boot it hangs after the initrd message with something like "ACPI symbol xxxxx not found (BIOS bug)", and the system hangs completely. It will not boot again, not even after a power off/on cycle.
If I then power cycle it again, press ESC, enter the BIOS, change NOTHING, and just choose SAVE and EXIT, it boots into Grub and then into Linux again without the BIOS bug message. (It is not a boot order problem or similar, because Grub always shows up.)

Any idea how to fix both issues? This is absolute crap!
 
May 5, 2021
The problem apparently shows up here in dmidecode:

For DIMM1 it shows Range Size = 0:

Handle 0x000F, DMI type 18, 23 bytes
32-bit Memory Error Information
Type: OK
Granularity: Unknown
Operation: Unknown
Vendor Syndrome: Unknown
Memory Array Address: Unknown
Device Address: Unknown
Resolution: Unknown

Handle 0x0010, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0009
Error Information Handle: 0x000F
Total Width: 64 bits
Data Width: 64 bits
Size: 8192 MB
Form Factor: SODIMM
Set: None
Locator: Bottom - Slot 1 (left)
Bank Locator: P0 CHANNEL A
Type: DDR4
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 2133 MT/s
Manufacturer: Samsung
Serial Number: 98178874
Asset Tag: Not Specified
Part Number: M471A1G43DB0-CPB
Rank: 2
Configured Memory Speed: 2400 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V

Handle 0x0011, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x0000000000000000k
Ending Address: 0xFFFFFFFFFFFFFFFFk
Range Size: 0 bytes
Physical Device Handle: 0x0010
Memory Array Mapped Address Handle: 0x000A
Partition Row Position: 1


And for DIMM2 the correct size:

Handle 0x0012, DMI type 18, 23 bytes
32-bit Memory Error Information
Type: OK
Granularity: Unknown
Operation: Unknown
Vendor Syndrome: Unknown
Memory Array Address: Unknown
Device Address: Unknown
Resolution: Unknown

Handle 0x0013, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0009
Error Information Handle: 0x0012
Total Width: 64 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: SODIMM
Set: None
Locator: Bottom - Slot 2 (right)
Bank Locator: P0 CHANNEL B
Type: DDR4
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 2400 MT/s
Manufacturer: Unknown
Serial Number: E0D1B7B9
Asset Tag: Not Specified
Part Number: CT16G4SFD824A.M16FE
Rank: 2
Configured Memory Speed: 2400 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V

Handle 0x0014, DMI type 20, 35 bytes
Memory Device Mapped Address
Starting Address: 0x00000000000
Ending Address: 0x003FFFFFFFF
Range Size: 16 GB
Physical Device Handle: 0x0013
Memory Array Mapped Address Handle: 0x000A
Partition Row Position: 1


Why would the range size be 0?
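One possible explanation for the literal "0 bytes", sketched in Python: if the tool computes the size as end - start + 1 in unsigned 64-bit arithmetic (an assumption on my part, I have not checked it against the dmidecode source), then a start of all zeros and an end of all ones wraps around to exactly 0. The helper `range_size` is just a name for this illustration, and the trailing "k" in the printed addresses is ignored here.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound

def range_size(start, end):
    """Size of the inclusive range [start, end], wrapping like a C uint64."""
    return (end - start + 1) & MASK64

# DIMM1, handle 0x0011: start all zeros, end all ones
print(range_size(0x0000000000000000, 0xFFFFFFFFFFFFFFFF))  # 0

# DIMM2, handle 0x0014: a normal-looking range
print(range_size(0x00000000000, 0x003FFFFFFFF) / 2**30)    # 16.0
```

So the 0 may not mean "zero bytes mapped" at all. An all-ones ending address looks more like a field the firmware never filled in for this DIMM than a real mapping.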

The 8GB module was working in combination with the 4GB module and it is still in the same slot; only the 4GB module was swapped for the 16GB one.

Is this a BIOS bug?

The BIOS correctly shows 24GB of RAM.
 
May 5, 2021
The distribution doesn't really matter. The kernel is 5.3.18-lp152.72-preempt, and neither the distro nor the kernel limits the RAM to some particular size. I assume you did not completely understand my follow-up about dmidecode?

htop is not relevant here, it will always only show the amount detected by the kernel.

But I will try with memtest too.

Looks to me like some BIOS bug, the range size is 0
 
May 5, 2021
It must be a BIOS bug. I've now also found that dmidecode does report the full 24GB:

Handle 0x000A, DMI type 19, 31 bytes
Memory Array Mapped Address
Starting Address: 0x00000000000
Ending Address: 0x005FFFFFFFF
Range Size: 24 GB
Physical Array Handle: 0x0009
Partition Width: 2

So there is a total memory handle plus a handle for each of the DIMMs, but the 8GB DIMM's handle has a range size of 0, which doesn't make a lot of sense to me. If one of them weren't detected, I doubt the total memory handle would show 24GB.
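To put the three views side by side, here is a small Python sketch of the arithmetic. All numbers are copied from the dmidecode and dmesg output in this thread; the variable names are only for illustration.

```python
MIB = 2**20
GIB = 2**30

# DMI type 17 "Size" fields of the two memory devices, in MB
dimm_sizes_mib = [8192, 16384]

# DMI type 19 (Memory Array Mapped Address), handle 0x000A
array_start, array_end = 0x00000000000, 0x005FFFFFFFF

# End address of the last BIOS-e820 entry in dmesg
e820_last_end = 0x000000041FFFFFFF

installed = sum(dimm_sizes_mib) * MIB      # what the DIMM records add up to
array_range = array_end - array_start + 1  # what the array record claims
e820_span = e820_last_end + 1              # how far the e820 map reaches

print(installed / GIB)    # 24.0
print(array_range / GIB)  # 24.0
print(e820_span / GIB)    # 16.5
```

The SMBIOS tables agree with each other on 24 GiB, while the e820 map the kernel actually uses stops at 16.5 GiB. That would be consistent with 16 GiB of RAM plus the usual hole below 4 GiB, but that last part is my interpretation, not something the logs state.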