[ltp] 770X Error codes?

Fabrice Bellet linux-thinkpad@www.bm-soft.com
Sun, 16 Jul 2000 00:13:10 +0200


On Sat, Jul 15, 2000 at 09:08:43PM +0200, Bill Mair wrote:
> Fabrice Bellet wrote:
> 
> When did you do this ?
> And when were you aware of this problem ?
> Why didn't you warn us here on the list that playing around with the I2C
> lm_sensors stuff breaks ThinkPads ?
> 
> I'm now really worried about the 2.4 kernel, if any distribution
> enables this stuff by default and probes the I2C bus looking for
> lm_sensors support will this then break my ThinkPad ???
> 
> Oh god, life is about to get very complicated installing V2.4.x kernel
> based distributions...

The problems occurred between November and January. I say "the problems"
because it took me several tries to identify what IMHO was the
probable cause of these errors.

I already asked about it on Jan 10; you can check the ML archive.
I suggested that error codes 188 and 189 could be linked to lm_sensors:

######
# What's the current status of the lm_sensors stuff on thinkpads ?
# Did you succeed in getting it working on your laptop ? If yes,
# what chipset has been detected (lm75, lm78 ..), on which
# model ? The spec sheets of the 770 models talks about a lm75
# chipset. So it should be possible to access sensor information,
# like processor temps and fan rotation speed.
# 
# On my 770Z, It only detects the eeprom module, which is supposed
# to provide an interface to access the eeprom on SDRAM DIMMS, via 
# the SMBus. You can then read misc info about your memory
# chips : latency, delays, capacity, and so on.
# 
# Concurrently, although this may not be definitely linked :-)
# I discovered that the 188 and 189 error codes at boot time --that
# I already experienced several times-- mean : Bad EEPROM CRC :-(
# (The hardware maintenance manual is available in PDF format
# online on the IBM web site, and is a great source of information
# on how to diagnose a hardware failure)
# The technician explained to me that this eeprom contains the motherboard
# serial number, and is checked at boot time. Might it be possible
# that this eeprom is accessed in a weird way by the lm_sensors stuff ?
# Any similar experiences ?
#######

I suggested to the lm_sensors maintainer that there could be
a problem with the eeprom module. He told me that two things could
have happened:
    1 : the SMBus got locked in a state that makes it impossible
        to read from it, so the motherboard serial number is
        inaccessible.
        A cold reboot could help: removing all power sources, waiting
        a few minutes and reconnecting them.
    2 : EEPROM data got mangled. The maintainer thinks this is very
        unlikely, because the eeprom module in lm_sensors
        has no write capability (at the time of writing, i.e. January).

The first case didn't really fit with what I experienced, because
removing the power supply did not help me. But there may be some
sort of internal battery that the PIIX4 bus is connected to...

My last experience with the IBM technical services was in January.
I went to the IBM customer service to get my laptop back.
This was the second or third time it was down due to these 188 and 189
errors.  I had the intuition it could be caused by lm_sensors,
but I had no proof.  So in front of the technician, I booted the laptop.
It started. I ran sensors_detect; it detected eeproms only. I
insmod'ed the eeprom module, I did a cat on
/proc/sys/dev/sensors/eeprom-i2c-*/*, I rebooted and voila:
errors 188 and 189.
I made the mistake of not writing down what sensors_detect printed, but
I'm almost sure that there were 4 or 5 eeprom entries in the proc tree,
although I only have two SDRAM DIMMs in my laptop.
Okay, the link between the failure and lm_sensors
is not proven, but for me at least there is a strong suspicion.
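
The mismatch between eeprom entries and installed DIMMs can be checked
without reading any eeprom contents. Below is a minimal Python sketch,
assuming the /proc/sys/dev/sensors layout quoted above. The function
names are hypothetical and not part of lm_sensors. It only lists
directory names and never opens the files under them; whether even
loading the module is safe on an affected machine is not established.

```python
import glob
import os


def list_eeprom_entries(sensors_root="/proc/sys/dev/sensors"):
    """Return the names of eeprom-i2c-* entries under sensors_root, sorted.

    Only names are collected; no file below these entries is opened.
    """
    pattern = os.path.join(sensors_root, "eeprom-i2c-*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


def unexpected_entry_count(entries, installed_dimms):
    """Number of eeprom entries beyond the number of installed DIMMs."""
    return max(0, len(entries) - installed_dimms)


if __name__ == "__main__":
    entries = list_eeprom_entries()
    for name in entries:
        print(name)
    # Two DIMMs, as on the 770Z described in this message.
    print("entries beyond 2 DIMMs:", unexpected_entry_count(entries, 2))
```

Any entry beyond the DIMM count would be a candidate for the eeprom
holding the motherboard serial number, though that is a guess and
nothing here confirms it.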

The lm_sensors maintainer suggested that I do further experiments,
skipping some addresses during the sensors detection,
but I must admit that I was no longer motivated to continue the
experiment, essentially because the IBM customer service
warned me that the next time I bring in my laptop with this
same failure, the repair won't be covered by the warranty :-(
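
The address skipping suggested by the maintainer amounts to a simple
filter over the candidate I2C addresses. This is an illustration only:
addresses_to_probe is a hypothetical helper, not an lm_sensors
function, and the addresses at which the DIMMs and the motherboard
eeprom answer on a 770Z are not known. The choice of 0x50 and 0x51 for
the two DIMMs is an assumption.

```python
def addresses_to_probe(candidates, skip):
    """Return the candidate 7-bit I2C addresses minus those in skip, in order."""
    skipped = set(skip)
    return [addr for addr in candidates if addr not in skipped]


# 0x50-0x57 is the range conventionally used by SPD eeproms on DIMMs.
SPD_RANGE = range(0x50, 0x58)

if __name__ == "__main__":
    # Assumption: the two DIMMs answer at 0x50 and 0x51; skip everything else.
    keep = {0x50, 0x51}
    skip = [addr for addr in SPD_RANGE if addr not in keep]
    print([hex(addr) for addr in addresses_to_probe(SPD_RANGE, skip)])
```

Narrowing the skip list one address at a time would show which eeprom
access, if any, triggers the errors, at the cost of one possible
failure per attempt.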

Finally, I reported what I thought was the probable cause of
the failure to the IBM customer service, here in France,
so I assume the information will have propagated, although I have
not had any feedback since January... neither from IBM nor
from the lm_sensors team.

Fabrice

(lm_sensors homepage : http://www.netroedge.com/~lm78/)