The other way

My PC has lots of problems, but I’m the kind of person that can’t be bothered to find out what or where the problem lies, simply because hardware is too difficult to dig through.

I’m quite convinced that the problem is the memory, because whenever I do big compile jobs I end up with these crazy stories:

internal compiler error: Segmentation fault

Don’t be fooled: it happens regularly, so I’ve come up with a fail-proof method of compiling a big application:

make; make; make; make; make; make; make;

I’m quite proud of it.
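
A slightly less blunt version of the same idea, as a minimal sketch: keep rerunning the build until it succeeds or a retry limit is reached. The function name `retry` and the limit of 7 are invented for illustration, and this only papers over flaky hardware rather than fixing it.

```shell
#!/bin/sh
# retry MAX CMD...: run CMD up to MAX times, stopping at the first success.
# The name "retry" and its interface are made up for this example.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            echo "succeeded on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt failed, retrying" >&2
        attempt=$((attempt + 1))
    done
    echo "giving up after $max attempts" >&2
    return 1
}

# Usage: retry 7 make
```

Unlike a fixed chain of `make; make; make`, this stops as soon as one run succeeds and returns a non-zero status if every attempt fails.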

Yeah, I also can’t be bothered to run memtest, because it infuriates me that I have no idea how much of the test is left.

16 Comments

  1. Anders
    Posted June 21, 2006 at 4:37 pm | Permalink

    Had the exact same problem. It was a heat problem. gcc/g++ really pushes the cpu. I put some cooling paste between my cpu and the heatsink and the problem went away. Maybe a larger heatsink or better fan will do for you.

  2. chani
    Posted June 21, 2006 at 4:45 pm | Permalink

    can’t be bothered to return the ram and get your money back, either? :)
    every time I buy a bunch of hardware, one thing is always defective… :( it drives me insane.

  3. Posted June 21, 2006 at 4:58 pm | Permalink

    My warranty expired, it’s not the newest machine!

  4. Jos
    Posted June 21, 2006 at 5:41 pm | Permalink

    You can improve on that:
    make||make||make||make||make||make

  5. Posted June 21, 2006 at 5:42 pm | Permalink

    try “nice -n 19 make” instead?

    also, I had to troubleshoot something similar with my own computer, which was regularly overheating.

    it turns out that whenever you change your PCI cards (or whatever they’re called these days), you should make sure to fill in any of the back-plates that are left open. the fan on the CPU depends on fresh air being circulated, which will not happen if there is more than one way for the air to get out (on my machine, the air gets in at the bottom of the front of the case, and gets out at the top of the back, just under the PSU). once I did that, most problems vanished immediately.

  6. Posted June 21, 2006 at 5:43 pm | Permalink

    Unfortunately, you can’t run this script when you work with unsermake. It needs to be rewritten entirely then.

  7. Posted June 21, 2006 at 5:45 pm | Permalink

    haha, I suppose I should put the side cover of my box back on then? Maybe after I clear the dust off it!

  8. Lee
    Posted June 21, 2006 at 6:07 pm | Permalink

    Try resetting your BIOS to its default settings. If that fails, look for the memory timings and set them to the highest values available. That should push the memory less and therefore make it more reliable, if the problem IS memory.

    You might want to read the SIG-11 FAQ.

  9. Brad Hards
    Posted June 21, 2006 at 7:00 pm | Permalink

    Do you need some new hardware?

  10. Thomas vK
    Posted June 21, 2006 at 8:46 pm | Permalink

    Perhaps a bit useless to say this also but a year ago I had my first shot at Gentoo so I had a lot to compile. I constantly got Segmentation Faults and blamed it on Gentoo (was the easiest way to go, hehe). But a friend of mine noted that my conclusion was wrong and I should look at my hardware. When my PC also regularly crashed while playing computer games and stuff like that I went inspecting more. I installed and configured lm_sensors and then saw that my CPU constantly was getting very hot (as much as 80°C while with my CPU it should never be more than 72!). I bought some cooling hardware (in my case a new CPU fan, a new case) and had some techie install it for me in the optimal way. After that it was still pretty hot but not so much to have it crash.

    Point is… you could look at your temperatures. :-) Either use software like lm_sensors, or reboot and go into the BIOS right after a Segmentation Fault (or after manually loading the CPU with software like burnCPU) and look at the Health Status page. Normally you can see temperatures there. If they seem very high you need to fix that. :-)

    In any case: good luck and I hope you can beat this one!

  11. Aron Stansvik
    Posted June 21, 2006 at 9:16 pm | Permalink

    I’ve had this problem and it was bad RAM. memtest continues indefinitely, so if you’re waiting for it to finish you will get gray hairs ;) It tests the memory in several passes; wait a couple of passes, and if it isn’t finding any errors you can be pretty sure it’s not the RAM.

  12. superstoned
    Posted June 21, 2006 at 9:24 pm | Permalink

    Cooling might help very well, nice -n 19 won’t… After all, nice just changes the priority, so other tasks might go first, and your system becomes more responsive. But on an idle system, it makes no difference whatsoever in the cpu time the niced process gets…

    In most BIOSes you can see the CPU temp. Run a compile, and after a segfault, reboot and check the temps… if they run high, that’s your problem. You might be able to make the computer beep if the CPU temp gets above a certain threshold, btw.

  13. Posted June 21, 2006 at 10:57 pm | Permalink

    But we have a CPU Info applet as well, somewhere on kde-apps.org. It will show you the temperature (if you get it compiled, that is).

  14. James
    Posted June 22, 2006 at 3:44 am | Permalink

    If make||make||make (etc) completes successfully, are you sure it’s created valid objects? If the computer has a hardware problem it could be doing more than just segfaulting – you may have bad data in memory. In fact, you probably do.

  15. Posted June 22, 2006 at 8:04 am | Permalink

    The times I’ve seen this it’s been heat. When I had bad memory it caused random segfaults in everything, not just GCC.

    You mentioned dust, and mind you, that can significantly increase the heat on the CPU. Specifically heat-sink fins get clogged up with the stuff rendering them pretty ineffective. You can just pop off the heat-sink, the fan and all that jazz and put it under the sink and then let it dry.

    Another thing you can give a whirl is underclocking your system. If that solves the problem then you can be fairly certain that it was a heat issue…

  16. Peter
    Posted June 23, 2006 at 8:20 pm | Permalink

    Re: memtest as someone else mentions: memtest doesn’t end after a while but when you choose to reboot.

    I had faulty RAM that would work OK after a cold boot but fail when I rebooted my Linux server and preselected the memtest image in the boot loader. If I ran “halt -p” and powered the machine back up right after the lights went off, the fault would not be detected. Symptoms were weird oopses during operation; sometimes they halted the machine, sometimes it continued to run.
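
Several of the comments above suggest watching the CPU temperature while a compile runs. Here is a minimal sketch of that, assuming a Linux kernel that exposes thermal zones under /sys/class/thermal (not every machine does, and lm_sensors is the tool the commenters actually name). The function name `print_temps` and its optional directory argument are invented for this example.

```shell
#!/bin/sh
# print_temps [DIR]: print each thermal zone's temperature in degrees Celsius.
# DIR defaults to /sys/class/thermal; the kernel reports millidegrees there.
# The function name and the DIR argument are made up for this example.
print_temps() {
    dir=${1:-/sys/class/thermal}
    for zone in "$dir"/thermal_zone*; do
        # Skip if the glob matched nothing or the zone has no readable temp file
        [ -r "$zone/temp" ] || continue
        milli=$(cat "$zone/temp" 2>/dev/null) || continue
        echo "$(basename "$zone"): $((milli / 1000)) C"
    done
}

# Usage: run the build in the background and sample every 5 seconds.
#   make & while kill -0 $! 2>/dev/null; do print_temps; sleep 5; done
```

If the readings climb toward the limit for your CPU just before a segfault, that points at cooling rather than RAM.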