[ltp] Linux kernel instability? (Rant/Panic/Cry-for-help!)

Thomas Kahle linux-thinkpad@linux-thinkpad.org
Tue, 30 Sep 2008 03:46:49 +0200


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sorry for another pointless "me too!" mail, but I have some problems
with the iwl4965 driver. Sometimes it just dies while associating with
a network, and then messages like

Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.

keep appearing in the kernel log until the module is unloaded.
More annoying, this can also happen when going into suspend-to-disk,
which then results in a kernel panic and a crash.
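
To get a feel for how often the driver restarts, the repeated message
can be counted per timestamp. The following is only a minimal sketch:
the function name count_microcode_errors is hypothetical, and the syslog
prefix layout ("Mon DD HH:MM:SS host") is assumed from the sample line
above.

```python
import re
from collections import Counter

# Matches the driver message quoted above. The prefix layout
# ("Mon DD HH:MM:SS host") is an assumption based on that sample line.
ERROR_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+iwl4965: "
    r"Microcode HW error detected\."
)


def count_microcode_errors(log_text):
    """Return a Counter mapping timestamp -> number of error messages."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.match(line)
        if match:
            counts[match.group("stamp")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_errors.py < /path/to/kernel.log
    for stamp, n in sorted(count_microcode_errors(sys.stdin.read()).items()):
        print(stamp, n)
```

Many hits within the same second, as in the log further down, would
suggest the driver is stuck in a restart loop rather than recovering.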

I tried to build compat-wireless, but I just can't figure out how to get
it to compile. I tried snapshots from several different days, but gave up.
Any suggestions are appreciated.

The log of the crash looks like this:

Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.
Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.
Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.
Sep 30 02:11:35 denkmatte dhclient: can't create
/var/lib/dhclient/dhclient-wlan0.leases: No such file or directory
Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.
Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.
Sep 30 02:11:35 denkmatte iwl4965: Microcode HW error detected.
Restarting.
Sep 30 02:11:36 denkmatte ------------[ cut here ]------------
Sep 30 02:11:36 denkmatte Kernel BUG at c017964d [verbose debug info
unavailable]
Sep 30 02:11:36 denkmatte invalid opcode: 0000 [#1] SMP

Sep 30 02:11:36 denkmatte Modules linked in: uhci_hcd iwl4965(-)
firmware_class mac80211 cfg80211 mmc_block i915 drm ipv6
cpufreq_ondemand snd_pcm_oss snd_mixer_oss snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device tp_smapi thinkpad_ec uinput
acpi_cpufreq freq_table arc4 ecb crypto_blkcipher snd_hda_intel
yenta_socket snd_pcm sdhci rsrc_nonstatic snd_timer snd ohci1394
ehci_hcd rtc mmc_core pcmcia_core ieee1394 snd_page_alloc sg usbcore
e1000e battery ac video output bay thinkpad_acpi thermal processor
backlight button nvram [last unloaded: uhci_hcd]

Sep 30 02:11:36 denkmatte

Sep 30 02:11:36 denkmatte Pid: 8526, comm: iwl4965/0 Not tainted
(2.6.25.14 #1)
Sep 30 02:11:36 denkmatte EIP: 0060:[<c017964d>] EFLAGS: 00010046 CPU: 0

Sep 30 02:11:36 denkmatte EAX: 00000000 EBX: ead4ab40 ECX: 00000286 EDX:
c10a61c0
Sep 30 02:11:36 denkmatte ESI: c530e000 EDI: 00000086 EBP: 00000000 ESP:
da924eb4
Sep 30 02:11:36 denkmatte DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Sep 30 02:11:36 denkmatte Process iwl4965/0 (pid: 8526, ti=da924000
task=ecb74880 task.ti=da924000)
Sep 30 02:11:36 denkmatte Stack: ead4ab40 da81ae80 da819660 c038fba8
da819660 f9027677 da818dc0 da81ae8c
Sep 30 02:11:36 denkmatte 00000286 00000000 0300300c 00000000 da818dc0
0300300c f9030aae 00000041
Sep 30 02:11:36 denkmatte 00000000 da819334 da819660 00000246 0000eded
00000000 da81cccc c04b93b8
Sep 30 02:11:36 denkmatte Call Trace:
Sep 30 02:11:36 denkmatte [<c038fba8>] <0> [<f9027677>] <0> [<f9030aae>]
<0> [<f9023923>] <0> [<c011ad6e>] <0> [<c0409568>] <0> [<f9027310>] <0>
[<f9027348>] <0> [<c0133682>] <0> [<c0136c20>] <0> [<c0133f88>] <0>
[<c0136a70>] <0> [<c0133ef0>] <0> [<c0136772>] <0> [<c0136730>] <0>
[<c0104bcf>] <0> =======================
Sep 30 02:11:36 denkmatte Code: c3 8b 52 0c 8b 02 25 00 40 02 00 3d 00
40 02 00 75 ca 8b 52 0c eb c5 8d b4 26 00 00 00 00 89 c8 89 da e8 b7 fe
ff ff 8b 03 eb c9 <0f> 0b eb fe eb 0d 90 90 90 90 90 90 90 90 90 90 90
90 90 83 ec
Sep 30 02:11:36 denkmatte EIP: [<c017964d>]  SS:ESP 0068:da924eb4
Sep 30 02:11:36 denkmatte ---[ end trace 93e7d1751b78e2d4 ]---
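
The "Code:" line marks the faulting instruction byte with angle
brackets. On x86 the byte pair 0f 0b is the ud2 instruction, which
would be consistent with the "Kernel BUG" and "invalid opcode" lines
above, i.e. a deliberate BUG() check rather than a random wild jump.
A minimal sketch for pulling those bytes out of an oops; the helper
names faulting_bytes and is_ud2 are hypothetical:

```python
import re


def faulting_bytes(code_dump, length=2):
    """Return `length` bytes starting at the <xx>-marked byte of a Code: dump.

    Returns None if the dump contains no <xx> marker.
    """
    match = re.search(r"<([0-9a-fA-F]{2})>((?:\s+[0-9a-fA-F]{2})*)", code_dump)
    if match is None:
        return None
    following = match.group(2).split()
    hex_bytes = [match.group(1)] + following[: length - 1]
    return bytes(int(b, 16) for b in hex_bytes)


def is_ud2(code_dump):
    """True if the marked instruction starts with 0f 0b (x86 ud2)."""
    return faulting_bytes(code_dump) == b"\x0f\x0b"


if __name__ == "__main__":
    # Tail of the Code: line from the log above.
    dump = "ff ff 8b 03 eb c9 <0f> 0b eb fe eb 0d 90 90"
    print(is_ud2(dump))
```

This only says that the kernel hit one of its own sanity checks at
c017964d; which check it was still needs the symbol names, which this
log lacks ("verbose debug info unavailable").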


Henrique de Moraes Holschuh wrote:
> If you guys want any of us people closer to the kernel to even participate
> in threads like this, there is one EXTREMELY important piece of information
> you must give:  whether you are using any binary blobs.  Also which kernel,
> but most of you provide that information already.
> 
> Really, because if you use nVidia drivers (hello T61 users), fglrx (hello
> most 3d-using ThinkPad users), madwifi (hello IBM wireless adapter users)...
> your experiences will only be valid for people using the very *same* version
> of the binary blob and mainline kernel: these drivers often do NOT play nice
> with mainline, and they DO regress when you upgrade mainline (or just the
> driver).
> 
> My T43 has been extremely stable, but I play it safe:
> 
> I don't use Ubuntu's patched kernel (I'd use either Fedora's or SuSE's, they
> DO have a team large enough to take care of it.  Ubuntu doesn't).  I sort of
> trust Debian's kernel because it is *really* close to mainline, often it has
> fixes for weird arches (doesn't matter for an x86 build) and minor backports
> of stuff that has been accepted upstream but is not in stable mainline yet.
> The only heavyweight patching I do is tp_smapi, plus whatever crap I am
> working on (currently rfkill and thinkpad-acpi).
> 
> I don't use bleeding edge kernels except for fast testing (laptop production
> is currently at latest 2.6.25 + a few selected patches.  Server production
> is at 2.6.16.y), and take note that I *am* a kernel driver developer.  I
> don't use ANY binary blobs at all, EVER.  I don't use the latest
> fuck-the-box-to-get-some-useless-eyecandy crap from KDE or Gnome, that is
> likely to touch something fragile in X.org.  I don't enable known-unstable
> or still immature stuff in the kernel (PAT, group scheduling,
> virtualization, containers, etc).
> 
> And I never do hibernation, just suspend-to-RAM (VERY often, so I am 100%
> sure it is working perfectly: even if it crashed only once in 1000 suspends,
> I'd hit it within a week).  No undervolting or screwing with the clock, either.
> 
> This doesn't mean the quality of mainline's core is not changing for the
> worse, it really might be.  But it DOES mean that there is such a thing as
> "playing it safer"...
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkjhhQgACgkQrpEWPKIUt7N4hgCfea3MdEqQXSCTfZXSDjLL+Pgh
Fx0An2dlbz9Ih59bPwKbfj3JsvpVLWy3
=Cx5c
-----END PGP SIGNATURE-----