[ltp] Random rfkill toggling on an X201s

Henrique de Moraes Holschuh linux-thinkpad@linux-thinkpad.org
Sat, 25 Jun 2011 21:41:50 -0300


On Sat, 25 Jun 2011, Nathaniel Smith wrote:
> On Sat, Jun 25, 2011 at 1:52 PM, Henrique de Moraes Holschuh
> <hmh@hmh.eng.br> wrote:
> > On Sat, 25 Jun 2011, Nathaniel Smith wrote:
> >> I figure this is probably either some flaky hardware, or some kind of
> >> kernel bug. Does anyone have any idea how to debug further?
> >
> > It is flaky hardware.
> >
> >> 22785.447378: bluetooth transmitter disappears from the USB bus
> >
> > This can only happen when the hardware kill switch is triggered (or if you
> > tell the EC to do it, perhaps.  Still, the kernel does _not_ know how to do
> > that, so...)  I'd check that switch and its wiring for damage.
> 
> Hate to contradict you, since I know you're way more of an expert here

Feel free to, especially when I am wrong about something :-)

> Jun 25 14:02:00 ged kernel: [36500.181697] thinkpad_acpi:
> tpacpi_rfk_hook_set_block: request to change radio state to blocked
> Jun 25 14:02:00 ged kernel: [36500.235129] usb 1-1.4: USB disconnect, address 33
> 
> So it looks like the kernel's rfkill support can and does knock the
> bluetooth transmitter off the bus without any hardware switches being
> triggered?

You're correct.  I had forgotten about it, damn.

Compile thinkpad-acpi in verbose debug mode, and load it with the module
parameter debug=0xffff.  It will then log everything it can about what is
happening.

Unfortunately, the rfkill core glue in thinkpad-acpi does not printk (log)
much.

tpacpi_rfk_hook_set_block() is your hint that thinkpad-acpi is getting
COMMANDS to change radio state from the rfkill core.  This can happen normally
when you toggle the radio-kill switch, due to feedback (but the driver is
prepared to deal with that: it syncs itself first, THEN updates the core, so
that it will just ignore duplicated/feedback block requests).
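
To make the "sync first, THEN update the core" ordering concrete, here is a
toy model in Python.  It is only a sketch of the idea, not the actual
thinkpad-acpi code: RadioModel, CoreModel, on_hw_switch and hw_writes are
made-up names for illustration.

```python
class RadioModel:
    """Toy stand-in for a driver-side radio (hypothetical, not kernel code)."""

    def __init__(self):
        self.blocked = False   # the driver's own view of the radio state
        self.hw_writes = 0     # how often we actually had to touch "hardware"

    def on_hw_switch(self, blocked, core):
        # Step 1: sync the driver's own state from the hardware event.
        self.blocked = blocked
        # Step 2: only THEN tell the core, which may echo a request back.
        core.update(self, blocked)

    def set_block(self, blocked):
        # Request coming from the core.  If we are already in that state
        # (a duplicate or feedback request), do nothing.
        if blocked == self.blocked:
            return False
        self.blocked = blocked
        self.hw_writes += 1
        return True


class CoreModel:
    """Toy stand-in for the rfkill core: it echoes state changes back."""

    def update(self, radio, blocked):
        radio.set_block(blocked)


radio = RadioModel()
core = CoreModel()

# Hardware switch flipped: the echoed set_block() is ignored.
radio.on_hw_switch(True, core)

# A genuine request from the core (state really differs) is acted upon.
acted = radio.set_block(False)
```

In this model the echoed request after a switch toggle never reaches the
"hardware", while a set_block() that really changes the state does.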

> OTOH, in this case we see the message from thinkpad_acpi coming
> *before* the bluetooth device goes away, whereas when it was acting
> up, that message came *after* the wifi and usb subsystems had started
> reacting. I assume that the kernel is poking the EC or something to
> disable bluetooth -- do we know that this poking can only happen after
> the tpacpi_rfk_hook_set_block message is logged?

On most thinkpads with the physical radio-kill switch, when you toggle that,
the EC kills all radios by itself THEN generates an event, which
thinkpad-acpi will notice.

thinkpad-acpi will then update the rfkill core, which can (and is supposed
to) further propagate the radio-kill command to anything that is not already
blocked by the EC (e.g. your WiFi card).

The function that does this is tpacpi_send_radiosw_update() in the
thinkpad_acpi.c file, if you want to take a look at it.
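
Since the ordering of the log messages is what separates the two cases (a
request from the rfkill core first, versus the EC killing the radios first),
a small script can check it over a long log.  The following is only a sketch:
the function names and the 1-second window are arbitrary choices, and it
assumes each kernel message sits on a single line with the usual
[seconds.microseconds] timestamp.

```python
import re

# Matches the kernel timestamp and captures the rest of the message.
TS = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")


def parse(lines):
    """Return (timestamp, kind) tuples for the two message types we care about."""
    events = []
    for line in lines:
        m = TS.search(line)
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if "tpacpi_rfk_hook_set_block" in msg:
            events.append((ts, "set_block"))
        elif "USB disconnect" in msg:
            events.append((ts, "usb_disconnect"))
    return events


def classify_disconnects(events, window=1.0):
    """Label each USB disconnect by whether a set_block request came shortly before."""
    results = []
    last_block = None
    for ts, kind in sorted(events):
        if kind == "set_block":
            last_block = ts
        else:
            preceded = last_block is not None and ts - last_block <= window
            label = "after set_block" if preceded else "no prior set_block"
            results.append((ts, label))
    return results


# The two messages quoted earlier in this thread, unwrapped to one line each.
sample = [
    "Jun 25 14:02:00 ged kernel: [36500.181697] thinkpad_acpi: "
    "tpacpi_rfk_hook_set_block: request to change radio state to blocked",
    "Jun 25 14:02:00 ged kernel: [36500.235129] usb 1-1.4: "
    "USB disconnect, address 33",
]

for ts, label in classify_disconnects(parse(sample)):
    print(ts, label)
```

A disconnect labelled "no prior set_block" would be one where the radio went
away before any request from the rfkill core was logged.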

> Okay, new question: assuming that this is the issue, does anyone have
> any suggestions on convincing Lenovo to fix this for me? I can check
> the wiring, but I'm not going to screw around with reflowing the

Now that you have reminded me of how my own driver works :)  It is possible
to further debug the driver to make sure it is not yet another braindamaged
abuse of the rfkill core by something.

> solder. The machine *is* under warranty, but I haven't dealt with
> Lenovo's warranty system before, and IME most places don't make it as
> simple as calling up the relevant department and saying "hey, here's a
> laptop with no hard-drive in it, you need to replace the following
> part numbers for me"...?

Do not send it in for repair until you're SURE it is broken.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh