Please Help! - system crashes (only) when booting to updated kernel

Discussion in 'Dell' started by ZZYZX, May 24, 2006.

  1. ZZYZX

    ZZYZX Guest

    Hi,

    I have never had to go to the extreme of posting on Usenet to solve a
    problem, but I have been struggling with this for weeks, and no other
    posts have helped me. In any case, here is the relevant system
    information:

    System: Dell PowerEdge 600SC
    RAID Card: LSI Logic CERC ATA/4ch RAID
    (Series 511 REV C2, BIOS Ver. 1.03)
    RAID Level: 5
    Logical Drives: 1
    Correct Driver: megaraid
    Detected Driver: megaraid_mbox

    Filesystem setup: (IDE RAID card shows logical drives as SCSI)
    /dev/sda1: /boot
    /dev/sda2: /data
    /dev/sda3: /
    /dev/sda4: SWAP

    I am trying to get any Fedora above FC2 working on it, and have tried
    FC3, FC4, and FC5. The problem is that both FC3 and FC4 detect the
    wrong card during installation, and that driver doesn't allow the setup
    program to access the logical drive. FC5 doesn't even offer the driver
    when the "noprobe" option is used, so I haven't made that work at all.
    I can get FC3/FC4 installed as long as I use the "noprobe" option and
    manually enter the correct driver. For some reason, both FC3 and FC4
    identify the RAID card as (megaraid_mbox), which causes the
    installation to fail when it comes time to set up partitions. As long
    as I use "linux noprobe" and specify the (megaraid) driver instead of
    (megaraid_mbox) at the beginning of the installation, everything goes
    like clockwork. That is, until I try to update the thing.

    I have been using YUM to update linux lately, but I have also tried
    up2date on this system, with the same results. I suspect RPM would
    have the same problem. As soon as any new kernel package is installed,
    I cannot boot the system to the new kernel. If I select the old kernel
    from the grub menu, I can boot the system just fine as before, but of
    course, that means still using the old kernel.

    My guess about what is happening is that something in the installation
    program for the new kernel is causing the wrong driver, probably
    (megaraid_mbox) instead of (megaraid) to be set up as the hard disk
    driver for boot. If I set the system up on a normal IDE drive,
    connected to the system board instead of the RAID card, the updates run
    flawlessly, so I know it's a problem between this RAID card and Fedora.
    Here is the error message I get after installing a new kernel and
    trying to boot to that kernel:

    Uncompressing Linux... OK, booting the kernel.
    Red Hat nash version 4.2.15 starting
    mkrootdev: label / not found
    mount: error 2 mounting ext3
    ERROR opening /dev/console!!!!:2
    error dup2'ing fd of 0 to 0
    error dup2'ing fd of 0 to 1
    error dup2'ing fd of 0 to 2
    switchroot: mount failed: 22
    Kernel panic - not syncing: Attempted to kill init!

    Some of the things I have already tried before installing new kernel
    are:

    Resetting each partition name using mklabel
    Reformatting SWAP filesystem with version 2 type swap
    Remaking initrd images (though not positive I did it right)
    Changing reference to "megaraid_mbox" to "megaraid" in
    modprobe.conf & hwconf
    Renaming megaraid_mbox.ko and copying megaraid.ko to that filename
    Installing everything (except SWAP) on one partition
    Copying old kernel config and initrd files to new name of new
    kernel files

    None of those worked. So, needless to say, I need a lot of help. I'm
    all out of ideas, so I'm asking the gurus out there. Hopefully someone
    can help me. I think the most important point is that the new kernel
    installation is what is bad, somehow not incorporating the correct
    driver, since the original installation remains good, even after the
    new kernel is installed. I'm hoping it's something simple, but *any*
    help would be greatly appreciated. I will answer any further questions
    about my configuration if asked. This thing is driving me crazy!

    Thanks!!!
     
    ZZYZX, May 24, 2006
    #1
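
The `mkrootdev: label / not found` error quoted above is consistent with the new kernel's initrd loading no usable driver for the RAID card, so the root partition never becomes visible. What follows is a minimal sketch for checking that theory, not a confirmed fix. It assumes the usual Fedora layout (`/etc/modprobe.conf`, an initrd that is a gzip-compressed cpio archive, as on FC4) and Red Hat's `mkinitrd`; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; names and default paths are assumptions, not from the thread.

# Print the driver(s) that modprobe.conf assigns to the SCSI host adapter.
# Red Hat's mkinitrd reads these aliases to pick the disk drivers for the initrd.
hostadapters() {
    awk '$1 == "alias" && $2 ~ /^scsi_hostadapter/ { print $3 }' "${1:-/etc/modprobe.conf}"
}

# List the kernel modules packed into an initrd image
# (assumed to be a gzip-compressed cpio archive).
initrd_modules() {
    zcat "$1" | cpio -it --quiet | grep '\.ko$'
}

# Rebuild the initrd for one kernel version, forcing the megaraid module in.
# Needs root; -f overwrites an existing image. $1 is the image path,
# $2 the kernel version (for example 2.6.16-1.2096_FC4).
rebuild_initrd() {
    mkinitrd -f --with=megaraid "$1" "$2"
}
```

Running `initrd_modules` on the image of the working kernel and on the image of the failing one would show whether the new image carries `megaraid_mbox.ko` where the old one has `megaraid.ko`. If `hostadapters` still prints `megaraid_mbox`, the next kernel update will most likely rebuild the initrd with that driver again.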

  2. Have you considered just recompiling the kernel yourself? You can use
    the kernel source patched by Red Hat (see the Fedora Release Notes on
    how to extract the kernel source from the SRPM) or you can use a vanilla
    one from kernel.org. If you know the hardware well, this should be the
    best option. If you must use the precompiled kernels in the yum
    repository, then could you post the contents of your /boot/grub/grub.conf.
     
    Nicholas Andrade, May 24, 2006
    #2

  3. ZZYZX

    ZZYZX Guest

    Recompiling the linux kernel is something I need to learn how to do at
    some point, since I've never done it and have been playing with the OS
    for years now. I'd rather do it under better circumstances, though.
    Here is the info you asked for (the grub.conf file). I included even
    the standard commented-out stuff at the beginning of the file, and I've
    listed it for two configurations. The first one is the grub.conf that
    I had originally, with the 4-partition setup I talked about in my
    original post. The second one is the one I currently have, since the
    last installation was the one where I put everything except SWAP on one
    partition (no /boot). They're both the same in that the second option
    (the original kernel) works if selected, but the new kernel (selected
    by default) does not.

    ------
    First grub.conf - from this partition setup:
    sda1 /boot
    sda2 /data
    sda3 /
    sda4 (swap)
    ------
    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE You have a /boot partition. This means that
    # all kernel and initrd paths are relative to /boot/, e.g.
    # root (hd0,0)
    # kernel /vmlinux-version ro root=/dev/sda3
    # initrd /initrd-version.img
    # boot=/dev/sda
    default=0
    timeout=5
    splashimage=(hd0,0)/grub/splash.xpm.gz
    hiddenmenu
    title Fedora Core (2.6.16-1.2096_FC4)
    root (hd0,0)
    kernel /vmlinuz-2.6.16-1.2096_FC4 ro root=LABEL=/ rhgb quiet
    initrd /vmlinuz-2.6.16-1.2096_FC4.img
    title Fedora Core (2.6.11-1.1369_FC4)
    root (hd0,0)
    kernel /vmlinuz-2.6.11-1.1369_FC4 ro root=LABEL=/ rhgb quiet
    initrd /vmlinuz-2.6.11-1.1369_FC4.img

    (NOTE: Choosing option #2 boots, choosing option #1 gives the error
    from the initial post)
    ---------------------------------------
    This config is the one where I put most of the system on one partition.
    The last install I did had a /boot partition, so the paths were
    different. Also, that version of the new kernel was 2.6.16-1.2096_FC4
    instead of the newer 2.6.16-1.2108 from this config.
    ------
    Second grub.conf - from this partition setup:
    sda1 (blank)
    sda2 /mnt/data
    sda3 /
    sda4 (swap)
    ------
    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE You do not have a /boot partition. This means that
    # all kernel and initrd paths are relative to /, e.g.
    # root (hd0,2)
    # kernel /boot/vmlinux-version ro root=/dev/sda3
    # initrd /boot/initrd-version.img
    # boot=/dev/sda
    default=0
    timeout=5
    splashimage=(hd0,2)/boot/grub/splash.xpm.gz
    hiddenmenu
    title Fedora Core (2.6.16-1.2108_FC4)
    root (hd0,2)
    kernel /boot/vmlinuz-2.6.16-1.2108_FC4 ro root=LABEL=/ rhgb quiet
    initrd /boot/vmlinuz-2.6.16-1.2108_FC4.img
    title Fedora Core (2.6.11-1.1369_FC4)
    root (hd0,2)
    kernel /boot/vmlinuz-2.6.11-1.1369_FC4 ro root=LABEL=/ rhgb quiet
    initrd /boot/vmlinuz-2.6.11-1.1369_FC4.img

    (NOTE: The exact same problem as with the other configuration -
    choosing option #2 boots, choosing option #1 gives the error from the
    initial post)
    ------
    It doesn't seem to matter whether the /boot partition exists nor what
    kernel version is installed. The initial installation program must
    have done something to the system configuration that none of the update
    programs do. With this in mind, I don't even know if a compiled kernel
    would do anything. Does the installation CD compile the kernel or does it
    stick a standard kernel in there? I'm sure the problem can be found by
    looking at the differences between a CD install (using "linux noprobe")
    and a yum/up2date/rpm kernel update.

    Thanks again!!!
     
    ZZYZX, May 25, 2006
    #3
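
Both listings above name the initrd as `vmlinuz-<version>.img`, whereas the commented header in the same files shows `initrd-version.img`. That may only be a slip made while retyping the file, since the old kernel's entry has the same form and still boots, but it is cheap to rule out. A small sketch that prints the kernel and initrd named by each stanza and flags an initrd that does not follow the `initrd-<version>.img` pattern; the function name is hypothetical and the default path is the one mentioned earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper; the name check_grub_conf is made up for illustration.
# For every "title" stanza, report the kernel and initrd files it names and
# flag an initrd whose file name is not of the form initrd-<version>.img.
check_grub_conf() {
    awk '
        $1 == "title"  { title = $0; sub(/^[ \t]*title[ \t]+/, "", title) }
        $1 == "kernel" { kernel = $2 }
        $1 == "initrd" {
            n = split($2, parts, "/")      # keep only the file name
            file = parts[n]
            note = (file ~ /^initrd-.*\.img$/) ? "ok" : "unexpected name"
            printf "%s: kernel=%s initrd=%s (%s)\n", title, kernel, $2, note
        }
    ' "${1:-/boot/grub/grub.conf}"
}
```

Called without arguments it reads `/boot/grub/grub.conf`. Comparing the reported initrd names with the files that actually exist under `/boot` shows whether each entry points at a real image.
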
  4. In comp.os.linux.misc ZZYZX:

    [ problems with Fedora Core on Dell PowerEdge ]
    This looks like a server. I'd suggest trying out RHEL 4 (update 3
    is current (iirc)) or one of the clones like CentOS if you don't
    need/want the support from RH and see if this works better. Be
    sure to download the latest update CD/DVD and run 'yum update'
    after installing or 'up2date -u' if you go for RHEL.

    [stuff]
     
    Michael Heiming, May 25, 2006
    #4
  5. ZZYZX

    ZZYZX Guest

    Thanks. Luckily, I do have a copy of RHEL V4. It's not licensed to me
    so I can't use it forever, but it will be at least a month before my
    friend (the license owner) will install it on his server. I'll get his
    permission and attempt to install it and let you all know. I'll
    install it using the 4-partition setup that I originally had on my
    Fedora system, that is to say:

    sda1 = /boot
    sda2 = /data
    sda3 = /
    sda4 = (swap)

    I'll get that going over the next few days, provided someone doesn't
    suggest a better/easier solution by then. Even if I can get RHEL to
    work, I'll still have to figure out how it did it (and how to make
    Fedora do it), as I will only be borrowing the copy of RHEL. I don't
    quite have the cash to justify spending $700 (for my own copy of RHEL)
    even if it does work. I have a copy of Win2000 Server that belongs to
    me, but I'd much rather do it with Linux!

    Thanks again!
    ZZ
     
    ZZYZX, May 25, 2006
    #5
  6. Honestly, recompiling the kernel is trivial, and I still believe it's
    the best alternative.

    To obtain the source:
    http://www.mjmwired.net/resources/mjm-fedora-fc5.html#kernelsrc

    Now move it from /usr/src/redhat/BUILD/kernel-2.6.16/linux-2.6.16
    to /usr/src/linux-2.6.16 (I recommend creating a symbolic link
    /usr/src/linux which points to /usr/src/linux-2.6.16). Copy the old,
    working /boot/config-2.6.11-1.1369_FC4 to /usr/src/linux/.config (note
    the dot before config). Now follow these instructions to
    recompile the kernel:
    http://www.digitalhermit.com/linux/Kernel-Build-HOWTO.html

    Basically all you do is run: make oldconfig (just take the defaults);
    make xconfig (remove stuff you don't need, esp. that troubling RAID
    module that Fedora confuses for yours); make bzImage; make modules; make
    modules_install

    The reason I asked for your grub.conf was because I was curious if there
    was some odd kernel parameter that was being passed with the older
    kernel but omitted in the newer; obviously that's not the case. If you
    still don't want to recompile the kernel, then could you repost your
    original message (for some odd reason my news server has already expired
    it) or at least a link to it in google groups.
     
    Nicholas Andrade, May 25, 2006
    #6
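
The build sequence described in the post can be sketched as one shell function. The paths and `make` targets are the ones given above; the `run` wrapper, the `DRY_RUN` variable and the name `build_kernel` are illustrative additions so the steps can be previewed before anything is executed.

```shell
#!/bin/sh
# Sketch of the build sequence from the post. The run wrapper, DRY_RUN
# variable and build_kernel name are illustrative, not from the thread.

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: unpacked kernel source tree, $2: config file of the known-good kernel.
build_kernel() {
    src=$1
    old_config=$2
    run ln -sfn "$src" /usr/src/linux || return 1
    run cp "$old_config" /usr/src/linux/.config || return 1
    run cd /usr/src/linux || return 1
    run make oldconfig || return 1        # carry the old answers forward
    run make xconfig || return 1          # interactive: drop unneeded drivers
    run make bzImage || return 1
    run make modules || return 1
    run make modules_install || return 1  # needs root
}
```

For a preview, `DRY_RUN=1 build_kernel /usr/src/linux-2.6.16 /boot/config-2.6.11-1.1369_FC4` prints the eight commands without running them. The list in the post stops at `make modules_install`; the new image still has to be placed under /boot and given a grub.conf entry, which `make install` usually handles on Fedora.
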
  7. ZZYZX

    ZZYZX Guest

    OK, I'll get going with that plan over the next few days. If I have
    any problems I suppose I can ask you folks:)

    I did try installing RH Enterprise Server, and - get this - it detected
    my RAID card without my having to use the "linux noprobe" option. That
    is to say, it detected my card *correctly* with no more problems. I'm
    fairly confident that it would have applied updates correctly as well.
    I guess that's the difference $700 makes. Since I don't have that, and
    I refuse to use Windows when I have the option, I'll start compiling.

    Thanks for all the help. I'll post again when I get it working.

    ZZYZX
     
    ZZYZX, May 26, 2006
    #7
  8. You just need to use CentOS as already pointed out to you, it is
    a *free* clone of RHEL. Sorry but I sometimes have the feeling
    people don't really read what others write? Or is it too difficult
    to type "CentOS" into google, it's really great if used for what
    it was made for.

    It is in addition beyond me why you use groups.google which is
    the worst interface to write on usenet, when your ISP seems to
    have his own nntp server. At least news.charter.com resolves
    quite fine.

    Anyway, perhaps I'm just getting old? ;-)

    Good luck

    BTW
    Please remember to quote context, this is usenet *not* some
    groups.google forum.
     
    Michael Heiming, May 26, 2006
    #8
  9. ZZYZX

    ZZYZX Guest


    Sorry if it offended you, but I used Google instead of Charter's nntp
    server for practical reasons. This is the first Usenet post I've made in
    years, and possibly my last for years. Since I already had a Gmail
    account, it was quicker than calling Charter to change my ISP password,
    which someone else at my location set up before leaving the company.
    For my purposes, a shitty interface is preferable to waiting 45 minutes
    for a tech. Maybe you would wait for the tech, I dunno.

    If I decide to post here a lot, I'll go back to NNTP.

    ZZ
     
    ZZYZX, May 27, 2006
    #9
  10. Strong point!

    Not a real issue, to get back to your problem, you can turn RHEL
    into CentOS without reboot, there should be instructions on
    how to do this task on centos.org. Just be sure to run 'yum
    update' after doing so.

    Good luck
     
    Michael Heiming, May 27, 2006
    #10
