Bad memory corrupting disk content? was: Considerations in buying used memory or hard disk

Discussion in 'Laptops' started by Dubious Dude, Feb 18, 2006.

  1. Dubious Dude

    Dubious Dude Guest

    By the way, Bob, when your memory started to get errors after a year,
    do you know whether it managed to corrupt a lot of the material you
    were working with and write it back to disk long before you became
    aware of these errors? If bad memory causes an immediate crash, then
    that's an inconvenience. It might be tolerable, in which case one can
    buy cheap memory and simply replace it when it starts to die. If
    bad memory is causing crashes, however, and I find that the disk contents
    were being corrupted well before I became aware of the problem, that
    is a different matter. I haven't heard much about this problem, but
    then, I haven't heard much about bad memory either. Thanks (to you,
    or anyone willing to chime in on this).
     
    Dubious Dude, Feb 18, 2006
    #1

  2. I service laptops, and I really don't agree with some of the ideas
    suggested. I specialize in older Toshiba laptops from the 1998 to 2002
    era, and I see very few bad memory modules, even modules that are 3 to
    10 years old (SDRAM and EDO). You can test laptop memory just like
    desktop memory with a program such as memtest or memtest-86, and while
    I'm certainly not saying that memory is never bad or never fails, if
    it's good, in my experience it will generally be trouble free for years.

    However, anything being written out to the disk drive first exists in
    memory, and probably gets moved around in memory quite a bit. So it is
    definitely possible for bad memory to cause corruption in disk files
    that were written to the disk. And this applies to everything, because
    even if you copy "directly" from, say, a CD to the hard drive (say the
    cabinet files for Windows itself) the data still passes through buffers
    in the computer's memory.

    But if the memory is bad enough that much of it is bad (vs. one single
    location), then the computer WILL crash (lock up, etc.) very quickly.
    Probably before the OS can even boot.
     
    Barry Watzman, Feb 18, 2006
    #2
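
The point in post #2, that anything written to disk first passes through
memory buffers, suggests checking a copy once it has been written. A
minimal sketch in Python follows. The helper names `sha256_of` and
`copy_and_verify` are hypothetical and not from the thread. It copies a
file, hashes source and destination, and compares the digests. The
read-back may be served from the operating system's cache rather than the
disk, and faulty RAM can disturb the hashing too, so a match is evidence
rather than proof.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then report whether both files hash identically."""
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)
```

A False result means the copy should be repeated and the machine's memory
tested before the copy is trusted.
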

  3. Dubious Dude

    Dubious Dude Guest

    Barry, I appreciate your sharing of your experience. It is pretty
    helpful in obtaining a well-rounded view. My situation might be a bit
    different because I don't use my laptop for leisure or recreation.
    Even the odd bit flip can cause an entire compressed archive of
    several hundred megabytes to be unusable. These archives are usually
    the safety net when everything else goes wrong, so there's limited value
    in having them unless there is a high level of confidence in their
    integrity. The same applies to any files that represent a great deal
    of time invested. I understand that for most recreational or leisure
    use, this can be not all that important, since apps can be reinstalled
    and media files can often be obtained again, possibly at moderate
    cost. I'm inquiring because I have experience with situations which
    don't fall into those categories, and the consequences of the
    occasional bit flip can linger for years. I also don't have an
    alternate system at the moment where I can make frequent backups,
    though I'm working on that. However, even that system can be
    corrupted over time if the corruption is at low enough levels
    that it can go undetected for some time. In fact, it is
    far better for failure to happen abruptly and in a big way so as to
    cause a crash on boot-up. This minimizes the likelihood of it
    continuing to corrupt the backup undetected. It is the widely spaced
    corruption in memory (the "one location" that you mentioned) that is
    more dangerous, since it kills your safety net.
     
    Dubious Dude, Feb 18, 2006
    #3
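
The concern in post #3, that a single flipped bit can make a compressed
archive unusable, can be illustrated with a small Python sketch using the
standard `zlib` module. The helper names `flip_bit` and `try_decompress`
and the sample data are hypothetical. To keep the outcome deterministic,
the flip is placed in the trailing checksum. A flip inside the compressed
stream itself will usually fail as well, though the exact error varies.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def try_decompress(blob: bytes):
    """Return the decompressed bytes, or None if zlib rejects the blob."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None


original = b"important project data\n" * 1000
archive = zlib.compress(original)

# Flip the very last bit, which lies in the stream's trailing checksum.
damaged = flip_bit(archive, len(archive) * 8 - 1)

print(try_decompress(archive) == original)  # True
print(try_decompress(damaged))              # None
```
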
  4. Well, we are in total agreement on this one point: Any given memory
    module is either totally, 100.0% reliable, or it's totally, 100.0% useless.

    That said, you can get a very infrequent occasional error from a
    perfectly good module. "Cosmic rays" (normal background nuclear
    radiation that is always present, everywhere) can "flip" a bit in a
    DRAM, and it doesn't mean that there is anything wrong with the DRAM
    (it's statistically uncommon, but not so uncommon that it never happens
    in the real world).

    Excluding this, however, a module is either good or it's bad. Any bad
    is totally bad. My point is, you can test any given memory module, and
    if it is good, then I don't think that where it came from (e.g. E-Bay
    vs. Crucial, Mushkin or anywhere else) matters one iota in terms of what
    that memory module is likely to do in the future as regards remaining
    good or going bad.

    My only other comment is that if you have only one copy of crucial data,
    then you are living dangerously, no matter what type of device or media
    is storing that copy. Even two copies is not safe enough.
     
    Barry Watzman, Feb 18, 2006
    #4
  5. Dubious Dude

    Dubious Dude Guest

    Yes, fortunately, we live well below the atmospheric level where that
    is significant. There are schemes in which everything is triple
    redundant, and best 2 out of 3 determines the right bit value (since
    the chance of 2 out of 3 being corrupted is negligible).
    Actually, I'm not too concerned about where it comes from so much as
    the lifetime before seeing corruption. I understand that in your
    experience, memory doesn't go bad much once tested to be problem free.
    I was also wondering how long a single defect in memory can exist
    before making itself known, e.g. through crashing. This is the window
    of time in which the memory can corrupt disk content. If it is a long
    duration, it is possible that users might not be aware of the problem,
    attributing the occasional crash to something else.
    Yes, I agree 100%. I have been pretty careful about making regular
    backups, but as I mentioned, I am also working towards making it
    easier to do so more frequently. If there is a significant window of
    time for bad memory to corrupt disk content before making itself
    known, however, then even backups are at risk. If I understand you
    right, you don't think this is likely within the useful lifetime of a
    laptop if the memory tests properly upon receipt.
     
    Dubious Dude, Feb 20, 2006
    #5
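
The "best 2 out of 3" scheme mentioned in post #5 can be sketched as a
bitwise majority vote over three stored copies. This is only an
illustration in Python. The function name `majority_vote` and the sample
values are hypothetical, and real triple-redundant systems do the voting
in hardware.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 vote: each result bit is the value that
    at least two of the three copies agree on."""
    return (a & b) | (a & c) | (b & c)


stored = 0b10110010
copy_a = stored
copy_b = stored ^ 0b00001000  # one copy suffers a single flipped bit
copy_c = stored

print(majority_vote(copy_a, copy_b, copy_c) == stored)  # True
```

The vote gives the wrong answer only when the same bit position is
corrupted in two of the three copies.
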
  6. Dubious Dude

    Fman Guest

    Sorry for the repost, but this didn't seem to have made it.
    Here it is again (with edits).

    Yes, fortunately, we live well below the atmospheric level where that
    is significant. There are schemes in which everything is triple
    redundant, and best 2 out of 3 determines the right bit value (since
    the chance of 2 out of 3 being corrupted is negligible).
    Actually, I'm not too concerned about where it comes from so much as
    the lifetime before seeing corruption. I understand that in your
    experience, memory doesn't go bad much once tested to be problem free.

    If an isolated problem did develop, however, I was also wondering how
    long a single defect in memory can typically exist before making
    itself known e.g. through crashing. This is the window of time in
    which the memory can corrupt disk content. If it is a long duration,
    it is possible that users might not be aware of the problem,
    attributing the occasional crash to something else. What is the
    likelihood that memory going bad is simply not being recognized as
    such?
    Yes, I agree 100%. I have been pretty careful about making regular
    backups, but as I mentioned, I am also working towards making it
    easier to do so more frequently. If there is a significant window of
    time for bad memory to corrupt disk content before making itself
    known, however, then even backups are at risk. If I understand you
    right, you don't think this is likely within the useful lifetime of a
    laptop if the memory tests properly upon receipt.
     
    Fman, Feb 20, 2006
    #6
  7. Dubious Dude

    J. Clarke Guest

    You live in the bottom of a coal mine?

    Cosmic rays don't go all the way through the atmosphere, instead what
    happens is worse--a cosmic ray hits an oxygen or nitrogen or some other
    molecule and imparts so much energy to the atom that the atom fissions and
    creates a cascade of secondaries, all of which have a great deal of energy
    although not as much as the initial cosmic ray. Those secondaries
    sometimes have enough juice to create tertiaries and the result is that
    instead of there being one particle there's a whole shower of them.

    It happens rarely, but it does happen.

    It may go for decades if it's in a location that seldom contains program
    code. I've had systems with single-bit errors that were completely stable
    but corrupted data right and left.

    Very, very high. In my experience most of the problems that users attribute
    to the poor quality of Microsoft code turn out to be memory. Most techs
    don't think to swap out the memory on a flaky machine.

    For the use that you are describing you really need to look into two things.

    The first is a machine that can use ECC memory--this won't get all bit
    errors but it will correct all single-bit errors and detect most multibit
    errors. Unfortunately this is difficult to find in notebook computers--you
    may need to look into a "lunchbox" machine.

    The second is taking measures to preserve the integrity of your archives in
    the face of unreliable storage. This doesn't mean more backups, it means
    storing them in such a manner that minor corruption of the files will not
    cause problems. The most accessible way to do this would be to use PAR2
    files. If you know programming you can generate your own ECC codes with
    whatever capabilities you want. Either of these requires additional
    storage though, which may defeat the purpose of having a compressed archive
    depending on what degree of compression you achieve.
     
    J. Clarke, Feb 20, 2006
    #7
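
Post #7 mentions generating your own error-correcting codes. As a rough
illustration of how such a code repairs a single flipped bit, here is a
minimal Hamming(7,4) sketch in Python. The function names are
hypothetical. This is not the scheme PAR2 uses (PAR2 is based on
Reed-Solomon codes), and ECC memory commonly uses a wider code that also
detects double-bit errors. The plain code below mis-corrects a word with
two flipped bits.

```python
def hamming_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.
    Positions 1, 2 and 4 hold parity; positions 3, 5, 6, 7 hold data."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_decode(code):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
sent = hamming_encode(word)
received = list(sent)
received[4] ^= 1  # simulate a single bit flip in storage
print(hamming_decode(received) == word)  # True
```

The cost is the extra storage post #7 warns about: 7 stored bits for
every 4 bits of data in this particular code.
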
  8. Dubious Dude

    Dubious Dude Guest

    Yes, well triple redundancy is used for single-event upsets. If
    you're going to have a large enough cascade of disruptions that are
    large enough to flip 2 or 3 of the triply redundant bits, then
    measures against that for the end user on earth are a bit out of
    reach. I think we both agree that worrying about that is not
    worthwhile.
    Yes, you're right. It's not one of the alternatives I'm choosing.
    The question I was investigating was more the probabilistic
    distribution of lifetimes for laptop memories (of reputable brand
    names), in order to understand whether there was a trade-off in buying
    used memory. I was at first thinking of typical graphs of
    monotonically increasing number of bad bits (in the memory, not
    induced by high energy particles) with years of service. Maybe an
    average graph, or a collection of specific samples. Of course, that
    info is hard to find, so I was just trying to get an idea based on
    people's experiences.

    After hearing the different views on good memory going bad, though,
    and how hard it can be to recognize, I guess it's reasonable to define
    the lifetime as the duration until its first bad bit. If the
    distribution of lifetimes is shaped (simplistically) like a Gaussian hump,
    peaking at 10 years, but with some visible nonzero tail at 5 years,
    then it's probably not a good idea to buy memory used. In the best
    case, feedback on this thread would have indicated that this is never
    an issue, but it really seems to be mixed.

    Given that, I thought it would be realistic to go with used memory,
    which might go bad sooner, as long as it did so catastrophically,
    without hiding in the system and corrupting your backups. At least if
    it went bad and crashed the system right away, you can fall back on
    clean backups. According to you, there are non-negligible numbers of
    incidents where this isn't the case. This implies that buying new
    memory is the best way to avoid the scenario altogether.
    Yes, the issue of backing up is a separate To-Do which I've been
    intending to look into. Again, if corrupt memory can corrupt backups,
    it seems new memory is the way to go, unless one is dealing with
    recreational or leisure material (pictures and music can still retain
    much of their personal value in the presence of a tiny bit of
    corruption), reinstallable applications, or material that can be
    easily re-obtained via the web.
     
    Dubious Dude, Feb 22, 2006
    #8
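
The "Gaussian hump peaking at 10 years" in post #8 can be turned into a
rough number. The thread gives only the peak and a visible tail at 5
years, so the 2-year standard deviation below is purely an assumption
chosen for illustration, as is the function name `fraction_failed_by`. The
sketch uses Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist

MEAN_YEARS = 10.0   # peak of the hump, as described in the post
SIGMA_YEARS = 2.0   # ASSUMPTION: spread is not given anywhere in the thread

lifetime = NormalDist(mu=MEAN_YEARS, sigma=SIGMA_YEARS)


def fraction_failed_by(years: float) -> float:
    """Fraction of modules whose first bad bit appears before `years`."""
    return lifetime.cdf(years)


print(f"{fraction_failed_by(5):.4f}")  # 0.0062
```

Under that assumed spread, about 0.6% of modules would show a first bad
bit within 5 years. A wider spread raises the figure quickly, which is why
the shape of the tail matters more than the location of the peak.
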
  9. Dubious Dude

    J. Clarke Guest

    You're really looking at third or fourth order effects here I think. The
    difference between "new" and "used" memory in terms of error rate is IMO
    going to be less than the individual variation between units. Personally
    other than static damage during handling I've never seen memory "go bad"
    and never heard of it "going bad". It's either bad at the beginning or not
    at all. The trouble is that bad at the beginning doesn't always get
    detected, even by memory test programs. Had one machine way back that one
    could run memory test programs on for a month with no problems detected,
    but run a particular application and it gave a parity error every time. In
    that case the error was dependent on the prior state of the memory and the
    test programs weren't going through the right sequence to hit it.
    Fortunately that was back in the era before parity checking was regarded as
    a luxury.

    My concerns with "used memory" would be more in the line of why it was
    available used (i.e. was it causing problems in the machine from which it
    was removed) and whether it had been handled properly during removal and
    packaging than anything to do with its age.
     
    J. Clarke, Feb 23, 2006
    #9
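
The anecdote in post #9, where a fault depended on the prior state of the
memory and test programs never hit the right sequence, can be mimicked
with a toy model in Python. Everything here is hypothetical: the
`FlakyMemory` class, the fault rule and the `simple_test` routine are
invented for illustration and do not touch or test physical RAM.

```python
class FlakyMemory:
    """Toy memory in which one cell drops its top bit, but only when the
    immediately preceding write stored 0xFF in the cell just before it."""

    def __init__(self, size, bad_cell):
        self.cells = bytearray(size)
        self.bad_cell = bad_cell
        self.last_write = None  # (address, value) of the previous write

    def write(self, addr, value):
        if addr == self.bad_cell and self.last_write == (addr - 1, 0xFF):
            value &= 0x7F  # the sequence-dependent fault
        self.cells[addr] = value
        self.last_write = (addr, value)

    def read(self, addr):
        return self.cells[addr]


def simple_test(mem, size):
    """Fill memory with 0x55, then 0xAA, reading back after each pass."""
    for pattern in (0x55, 0xAA):
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            if mem.read(addr) != pattern:
                return False
    return True


mem = FlakyMemory(size=8, bad_cell=4)
print(simple_test(mem, 8))  # True: the test never writes the trigger sequence

mem.write(3, 0xFF)          # an "application" happens to write this sequence
mem.write(4, 0xFF)
print(hex(mem.read(4)))     # 0x7f: silently corrupted
```
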
