Question Prime95 error: "Segmentation fault (core dumped)" occurs only at FFT 224K and 240K?

testcb00

Honorable
Sep 27, 2016
16
0
10,510
I recently bought an old server and am trying to use Prime95 to run a burn-in test.

However, the server cannot complete the test at FFT 224K and FFT 240K. After a few minutes, "Segmentation fault (core dumped)" appears and Prime95 crashes. The OS is fine, and I can start a new Prime95 instance or other programs without any problem.
View: https://i.imgur.com/TyAWJbP.png

View: https://i.imgur.com/VyX7jCP.png


If I use Smallest FFTs / Small FFTs / Large FFTs, the server can run for 24 hours without any failures.
It seems that those tests do not use FFT 224K and FFT 240K.

Does the server have a problem, or is this a bug?

Server Details:
Supermicro X9SRG-F
Intel Xeon E5-2648Lv2 10C20T
256GB DDR3-1600 LRDIMM (Samsung M386B8G70DE0-CK03Q 64GB x4)
lubuntu 22.04 LTS
 

Colif

Win 11 Master
Moderator
Strange, I don't see any errors in the results.txt file (I looked through it twice)

https://www.baeldung.com/linux/segmentation-fault

"Segmentation" is the concept of each process on your computer having its own distinct virtual address space. Thus, when Process A reads memory location 0x877, it reads information residing at a different physical location in RAM than when Process B reads its own 0x877.
All modern operating systems support and use segmentation, and so all can produce a segmentation fault.
https://stackoverflow.com/questions/3200526/what-is-a-segmentation-fault-on-linux

Is there another version of Prime you could use? It might be the software.

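  • When a piece of code tries to read or write a read-only location in memory, or a freed block of memory, the OS kills it with a segmentation fault; the core dump is the memory image saved when that happens.
  • It is an error that usually indicates memory corruption or an invalid memory access.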
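
To make the two points above concrete, here is a minimal Python sketch, assuming a POSIX system such as Linux. It deliberately performs an invalid read in a child interpreter (using `ctypes.string_at(0)`, which dereferences address 0) and shows that only the child is killed. The names `CHILD_CODE` and `run_invalid_read` are made up for illustration and have nothing to do with Prime95 itself.

```python
import signal
import subprocess
import sys

# Child program: ask ctypes to read a C string at address 0, which is not mapped.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def run_invalid_read() -> int:
    """Run the invalid read in a separate interpreter and return its exit code."""
    result = subprocess.run([sys.executable, "-c", CHILD_CODE])
    return result.returncode


if __name__ == "__main__":
    code = run_invalid_read()
    if code == -signal.SIGSEGV:
        # A negative return code on POSIX means "killed by that signal".
        print("child was killed by SIGSEGV; the parent process is unaffected")
    else:
        print(f"child ended with return code {code}")
```

This matches what was reported in the first post: the crashing program dies, but the OS and other programs keep running.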

I wonder if you would get errors if you run Prime off a USB?
Prime 95 bootable - https://www.infopackets.com/news/10113/how-fix-bootable-prime95-stress-test-hardware
 

testcb00

Actually, I have reviewed results.txt and the error does not appear in that file.
It seems that the program crashed before it could write to the log file.

I am using p95v308b16.linux64.tar.gz.
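
Since a crashed program cannot log its own crash, one option is to record it from outside. The following is a minimal sketch, assuming a POSIX system; `describe_exit`, `run_and_log`, and the `crash.log` file name are hypothetical names for illustration, not part of Prime95. It relies only on the documented behaviour of Python's `subprocess` module, where a negative return code means the child was killed by that signal.

```python
import datetime
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable status string."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_and_log(command: list[str], log_path: str) -> str:
    """Run command, append how it ended to log_path, and return that status."""
    result = subprocess.run(command)
    status = describe_exit(result.returncode)
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp} {' '.join(command)}: {status}\n")
    return status


if __name__ == "__main__":
    # Hypothetical usage: pass the stress-test command line as arguments, e.g.
    #   python3 crashlog.py ./mprime
    # plus whatever options are normally used to start the test.
    print(run_and_log(sys.argv[1:], "crash.log"))
```

With a wrapper like this, a segmentation fault would show up in `crash.log` as "killed by SIGSEGV" together with a timestamp, even though results.txt stays clean.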
 

Karadjgne

Titan
Ambassador
What slots is the RAM in? It should be in the blue slots. That can make a difference to the memory controller, especially since you are maxing out not only the RAM size but also the socket size and speeds on 4x sticks instead of 8x.

I'd give the DRAM a bump up to 1.55 V and also give VCCSA and VCCIO a small voltage bump of 0.05 V to 0.1 V.
 

testcb00

Resulting sum error - is this overclocked? It seems to happen mostly with unstable overclocks.

Rounding errors can be fixed by raising the RAM voltage in the BIOS.

I see rounding errors more often.

No, a server motherboard cannot overclock RAM.
I also changed the CPU to an E5-2650Lv2 and got the same result.
Maybe the controller of the LRDIMM has a problem in this FFT range...
 

testcb00

Yes, the DIMMs are in the blue slots. I have read the manual.

I do not know how to adjust the voltage on a Supermicro server motherboard...
 

testcb00

Well, it's not the CPU, so it could be the RAM stick itself,
the motherboard perhaps,
or the power supply.

Does reducing the amount of RAM change it at all?

A single DIMM has no problem, but when the RAM increases to two or more DIMMs (quad, triple, or dual channel, or single channel with two DIMMs), the error occurs.
 

testcb00

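Update
I borrowed some memory and tested several combinations.

It seems that the error appears only once the total RAM reaches about 80GB. Every configuration above 80GB fails; at exactly 80GB, 5x 16GB passes but 4x 16GB + 2x 8GB fails...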

Prime95 AVX 224K/240K

Pass
LRDIMM
1x 64GB DDR3-1600 LRDIMM (64GB total)
RDIMM
1x 8GB DDR3-1600 RDIMM (8GB total)
4x 16GB DDR3-1600 RDIMM (64GB total)
5x 16GB DDR3-1600 RDIMM (80GB total)
4x 16GB DDR3-1600 RDIMM + 1x 8GB DDR3-1600 RDIMM (72GB total)

Fail
LRDIMM
2x 64GB DDR3-1600 LRDIMM (128GB total)
3x 64GB DDR3-1600 LRDIMM (192GB total)
4x 64GB DDR3-1600 LRDIMM (256GB total)
RDIMM
6x 16GB DDR3-1600 RDIMM (96GB total)
7x 16GB DDR3-1600 RDIMM (112GB total)
8x 16GB DDR3-1600 RDIMM (128GB total)
4x 16GB DDR3-1600 RDIMM + 3x 8GB DDR3-1600 RDIMM (88GB total)
4x 16GB DDR3-1600 RDIMM + 2x 8GB DDR3-1600 RDIMM (80GB total)
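
To check where the boundary sits, the totals for each configuration can be worked out with a short Python sketch. The configurations are copied from the lists above; the names `PASSING`, `FAILING`, and `total_gb` are only for illustration.

```python
from typing import Dict, List, Tuple

# Each configuration is a list of (number of DIMMs, size of each DIMM in GB).
Config = List[Tuple[int, int]]

PASSING: Dict[str, Config] = {
    "1x 64GB LRDIMM": [(1, 64)],
    "1x 8GB RDIMM": [(1, 8)],
    "4x 16GB RDIMM": [(4, 16)],
    "5x 16GB RDIMM": [(5, 16)],
    "4x 16GB RDIMM + 1x 8GB RDIMM": [(4, 16), (1, 8)],
}

FAILING: Dict[str, Config] = {
    "2x 64GB LRDIMM": [(2, 64)],
    "3x 64GB LRDIMM": [(3, 64)],
    "4x 64GB LRDIMM": [(4, 64)],
    "6x 16GB RDIMM": [(6, 16)],
    "7x 16GB RDIMM": [(7, 16)],
    "8x 16GB RDIMM": [(8, 16)],
    "4x 16GB RDIMM + 3x 8GB RDIMM": [(4, 16), (3, 8)],
    "4x 16GB RDIMM + 2x 8GB RDIMM": [(4, 16), (2, 8)],
}


def total_gb(config: Config) -> int:
    """Return the total installed capacity of one configuration in GB."""
    return sum(count * size for count, size in config)


if __name__ == "__main__":
    for verdict, group in (("Pass", PASSING), ("Fail", FAILING)):
        for name, config in group.items():
            print(f"{verdict}  {total_gb(config):>3} GB  {name}")
    print("largest passing total:", max(map(total_gb, PASSING.values())), "GB")
    print("smallest failing total:", min(map(total_gb, FAILING.values())), "GB")
```

The largest passing total and the smallest failing total both come out at 80GB, so the boundary is around 80GB rather than strictly above it, and total capacity alone may not be the whole explanation.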