[ltp] T40: airo_mpi death under high load? (fwd)
Fabrice Bellet
linux-thinkpad@linux-thinkpad.org
Thu, 14 Aug 2003 17:57:08 +0200
On Wed, Aug 13, 2003 at 08:07:00PM +0100, honey@gneek.com wrote:
> I really didn't expect this, and was about to report it as a potential
> hardware problem to IBM after struggling with it for days: also
> because the card now sometimes unexpectedly drops out in Windows XP
> too, the little I've used it (Cisco util shows the MAC address
> suddenly as 00:00:00:00:00:00). I guess I'll put that down to XP,
> not hardware, and await your verdict.
>
> Unless of course we both have the same hardware problem under high
> load...
I see the same drop-outs under high load too, both with the airo_mpi
driver and with the Cisco driver. That is somewhat logical, as both share
very similar code for the tx/rx/interrupt handlers. I can only test
under Linux.
Did you try the Cisco driver for Linux under the same load conditions? Is
it more stable?
Mine quietly stops working, leaving the message "NETDEV WATCHDOG:
eth1: transmit timed out" in my logs. Trying to remove the driver in this
state usually freezes the machine. I ran a lot of tests over several
hours, and another typical error output from the Cisco driver is:
Aug 14 14:28:52 localhost kernel: Command int!
Aug 14 14:28:52 localhost kernel: Link stat int ls=8001
Aug 14 14:28:52 localhost kernel: No carrier
Aug 14 14:30:56 localhost kernel: venuscommand cmd = 21
Aug 14 14:30:56 localhost kernel: venuscommand status = 8914
Aug 14 14:30:56 localhost kernel: venuscommand Rsp0 = ec45
Aug 14 14:30:56 localhost kernel: venuscommand Rsp1 = 428b
Aug 14 14:30:56 localhost kernel: venuscommand Rsp2 = 8918
Aug 14 14:30:56 localhost kernel: venuscommand cmd = 21
Aug 14 14:30:56 localhost kernel: venuscommand status = 7f21
Aug 14 14:30:56 localhost kernel: venuscommand Rsp0 = 6
Aug 14 14:30:56 localhost kernel: venuscommand Rsp1 = 5489
Aug 14 14:30:56 localhost kernel: venuscommand Rsp2 = 424
This problem is quite hard to diagnose, because several difficulties add up :-)
1. No documentation is available, so the meaning of these status registers
is unknown.
2. The crash profile is not always the same.
3. It is not easily reproducible.
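Since the crash profile varies and the registers are undocumented, one
way to compare incidents is to pull the register dumps out of the logs
and line them up side by side. Below is a minimal sketch of that idea,
not anything shipped with either driver. It assumes the syslog line
layout shown in the excerpt above, assumes the printed values are
hexadecimal (which seems likely given values such as ec45, but is not
documented), and the names parse_dumps and count_watchdog are made up
for illustration.

```python
import re

# Assumed line shape, copied from the syslog excerpt in this message:
#   "Aug 14 14:30:56 localhost kernel: venuscommand status = 8914"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(?P<msg>.*)$"
)
FIELD_RE = re.compile(
    r"^venuscommand\s+(?P<key>\w+)\s*=\s*(?P<val>[0-9a-fA-F]+)$"
)


def parse_dumps(lines):
    """Group venuscommand lines into one dict per register dump.

    A new dump is assumed to start at every 'cmd' line; the fields that
    follow (status, Rsp0, Rsp1, Rsp2) are attached to that dump.
    Values are parsed as hexadecimal, which is an assumption.
    """
    dumps = []
    current = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        f = FIELD_RE.match(m.group("msg"))
        if not f:
            continue
        key = f.group("key")
        val = int(f.group("val"), 16)
        if key == "cmd":
            current = {"time": m.group("stamp"), "cmd": val}
            dumps.append(current)
        elif current is not None:
            current[key] = val
    return dumps


def count_watchdog(lines):
    """Count transmit-timeout messages reported by the network watchdog."""
    return sum(
        1
        for line in lines
        if "NETDEV WATCHDOG" in line and "transmit timed out" in line
    )


if __name__ == "__main__":
    import sys

    # Usage: python venus_dumps.py /var/log/messages   (path is an example)
    with open(sys.argv[1], errors="replace") as fh:
        log_lines = fh.readlines()
    print("watchdog timeouts:", count_watchdog(log_lines))
    for d in parse_dumps(log_lines):
        regs = " ".join(
            f"{k}={v:04x}" for k, v in d.items() if k != "time"
        )
        print(d["time"], regs)
```

Running this over logs from several crashes would at least show whether
the same status/Rsp values recur, even if their meaning stays unknown.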
Best wishes,
--
fabrice