X11SPA-T Random memory issues on soft reset

Discussion in 'Supermicro' started by Sergey, Jul 7, 2020.

  1. Sergey

    Hello. I have a new X11SPA-T and it works fine, except for one issue: sometimes memory DIMMs fail with this message in IPMI: "Failing DIMM: DIMM location (Uncorrectable memory component found). (DIMMA2) - Assertion"

    When I enter the system BIOS and change the boot configuration (something as simple as switching from booting off a USB stick to booting off the NVMe SSD, or vice versa), then after the reboot (after selecting save and reboot) some of the memory sticks are randomly reported as "Failed, uncorrectable memory component found". After that I press the power button, wait three seconds, and power the system back on, and as a result all memory is OK. The most frequently failing DIMM locations are DIMM C1 and C2.
    BIOS is configured as UEFI.
    Memory runs at 30-45 °C (40 °C for a couple of days under memtest86).
    I ran memtest86 multiple times and no instability or errors were found.
    I ran Prime95 for a few hours and CPU/RAM temps were normal (about 60 °C for the CPU at heavy load, and 40 °C for memory).

    If I don't enter the BIOS but reboot the system from the OS, there are no issues.
    Moving or reseating the memory does not make other DIMM locations report issues (for example, after moving the DIMMs from C1/2 to F1/2, it is still slots C1/2 that randomly fail).
    This behavior was observed with BIOS 3.2 and 3.3 (I updated to 3.3 to see if it would fix the issue).

    I tried to get help from Supermicro, but they asked me to "check CPU cooler" even though the CPU temp is normal. Then they asked me to reseat the CPU, even though it runs Prime95 for a day without errors. Then they hinted that I should have bought memory from their store :(

    Memory is NEMIX model MR23400-324 (the Amazon listing said it was Supermicro certified). All 12 DIMMs are the same. CPU is a Xeon Gold 6248.
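
To keep track of which slots fail across many reboots, the event messages could be tallied per DIMM location. The following is only a minimal sketch: it assumes every event line follows the same wording as the one IPMI message quoted above, the function name `count_failing_dimms` is hypothetical, and every sample line other than the DIMMA2 one is made up for illustration.

```python
import re
from collections import Counter

# Matches the slot name in parentheses right before "- Assertion",
# e.g. "(DIMMA2) - Assertion". The pattern is an assumption based on the
# single message quoted in the post; other firmware may word it differently.
DIMM_RE = re.compile(r"\((DIMM[A-Z]\d+)\)\s*-\s*Assertion")


def count_failing_dimms(lines):
    """Return a Counter mapping DIMM slot name to its number of assertions."""
    counts = Counter()
    for line in lines:
        match = DIMM_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Only the DIMMA2 line comes from the post; the DIMMC1/DIMMC2 lines are
# hypothetical examples in the same format.
MESSAGE = "Failing DIMM: DIMM location (Uncorrectable memory component found). ({slot}) - Assertion"
sample_log = [
    MESSAGE.format(slot="DIMMA2"),
    MESSAGE.format(slot="DIMMC1"),
    MESSAGE.format(slot="DIMMC2"),
    MESSAGE.format(slot="DIMMC1"),
]

if __name__ == "__main__":
    # Print slots from most to least frequently reported.
    for slot, hits in count_failing_dimms(sample_log).most_common():
        print(slot, hits)
```

Pasting the real event log text in place of `sample_log` (one message per list entry) would show whether the failures really concentrate on the C1/C2 slots regardless of which sticks are installed there.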
     
    Last edited by a moderator: Jul 9, 2020
    Sergey, Jul 7, 2020
    #1