Strange stability issue with K8WE and 16GB RAM

Discussion in 'Tyan' started by J&SB, Apr 1, 2007.

  1. J&SB

    J&SB Guest

    My K8WE (2895) with dual Opteron 280's and 8GB RAM (in the form of four 2GB
    DIMMs of Crucial PC-3200, 2 DIMMs per processor) has been running flawlessly
    for nearly 18 months now under Windows XP-64. Just last week I fulfilled a
    need to fill out the 4 vacant RAM slots with identical DIMMs to bring the
    total to 16GB. Upon booting up after the installation, everything was
    recognized and ran just fine.

    Following the installation, I wanted to do a bit more testing to see if my
    performance had dropped any, and so I ran the Sandra 2007 memory bandwidth
    test. The test completed, showing 10300MB/s, but triggered a "soft" Machine
    Check Exception along the way (i.e. indicating that a correctable error
    was encountered.) I then ran 4 simultaneous
    instances of Prime95, and continued to get several more of these Machine
    Check Exceptions throughout 4 hours of running, but no errors indicated by
    Prime95. Then, out of the blue, I encountered a fatal Machine Check
    Exception that triggered an automatic reboot. Following this, I ran
    Memtest-86 for 24 straight hours without any error whatsoever, so it's
    difficult to believe there is a problem with the RAM. Nevertheless, the
    "soft" Machine Check Exceptions continue to occur whenever I run the Sandra
    2007 memory bandwidth benchmark or Prime95.
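
    For context on the 10300MB/s figure, a rough ceiling can be worked out
    from the memory configuration. The sketch below assumes (the post does not
    confirm this) that all eight DIMMs really run at DDR400 and that each
    Opteron drives two 64-bit memory channels; the function and parameter
    names are illustrative only.

```python
def peak_bandwidth_mb_s(transfers_mt_s, bus_bytes=8, channels_per_socket=2, sockets=2):
    """Theoretical peak memory bandwidth in MB/s, summed over all sockets.

    Assumes every channel runs at the given transfer rate (MT/s) and that
    the benchmark result is an aggregate across both processors.
    """
    return transfers_mt_s * bus_bytes * channels_per_socket * sockets

measured = 10300                  # Sandra 2007 result quoted in the post
peak = peak_bandwidth_mb_s(400)   # DDR400 (PC-3200), assumed actual speed
efficiency = measured / peak
print(f"peak {peak} MB/s, measured {measured} MB/s, {efficiency:.1%} of peak")
```

    Under those assumptions the ceiling is 12800MB/s, so the measured result
    is roughly 80% of peak. If the BIOS had lowered the memory speed with all
    slots filled, the ceiling would be lower and this ratio would change.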

    Now the interesting thing is that I subsequently discovered the following
    remedy (I'll skip over a lot of trial and error to get at the repeatable
    consequence of my testing): Using CrystalCPUID, I change the CPU multiplier
    from 12x to 11.5x and the CPU voltage from 1.35v to 1.40v, and then
    immediately change both of these values back to the default 12x and 1.35v
    respectively. After performing this trivial exercise, which in principle
    changes nothing, I observe absolutely rock-solid stability with any Sandra
    2007 benchmark and hours of running of 4 Prime95 instances. What gives?
    Obviously, toggling the multiplier and voltage back and forth has some
    positive effect, but I don't understand it. Has anybody seen behavior like
    this?
    J&SB, Apr 1, 2007
  2. J. Hinkey

    J. Hinkey Guest

    I have not seen this exact behavior (toggling CPU voltages or multipliers
    to fix things), but I have seen/experienced instances where the BIOS says
    a setting is set to a certain state or value when the machine was acting
    like it was set to something else. Toggling it or re-setting it has
    changed things (usually for the better). This is kind of scary since you
    don't know what settings are actually active on your computer! It may
    have been that your BIOS was corrupt and the toggling set things right.
    Re-flash the BIOS? Me, I'd just leave it alone for now.
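
    One way to see what is actually active, rather than what the BIOS or a
    utility says is configured, would be to look at the processor's
    FIDVID_STATUS register (MSR 0xC0010042 on K8). Below is a minimal sketch
    of decoding it, assuming the pre-Rev-F K8 encoding used by the Linux
    powernow-k8 driver (frequency = 800 + 100*FID MHz, voltage = 1.550V -
    0.025V*VID) and a 200MHz reference clock. Reading the MSR itself needs a
    kernel driver or a tool that exposes it, which is not shown; make_status
    is a hypothetical helper and the raw values are made up for illustration.

```python
REF_CLOCK_MHZ = 200  # assumed K8 reference clock

def decode_fidvid_status(raw):
    """Return (multiplier, volts) from a 64-bit FIDVID_STATUS value.

    Field positions and formulas are assumptions based on the pre-Rev-F
    K8 layout; they do not apply to later processor revisions.
    """
    fid = raw & 0x3F            # current FID: low dword, bits 5:0
    vid = (raw >> 32) & 0x3F    # current VID: high dword, bits 5:0
    freq_mhz = 800 + 100 * fid
    millivolts = 1550 - 25 * vid
    return freq_mhz / REF_CLOCK_MHZ, millivolts / 1000

def make_status(fid, vid):
    """Hypothetical helper: raw value with only the current FID/VID set."""
    return (vid << 32) | fid

print(decode_fidvid_status(make_status(16, 8)))  # default setting: 12x, 1.35V
print(decode_fidvid_status(make_status(15, 6)))  # toggled setting: 11.5x, 1.40V
```

    If the decoded values at the "default" setting differ before and after
    the toggle, that would support the idea that the reported and the active
    settings were out of step.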

    - John
    J. Hinkey, Apr 3, 2007
