[ltp] Re: Another one bites the dust

Tue, 20 Mar 2001 14:41:19 -0800

(See below)

On Tue, Mar 20, 2001 at 11:18:07PM +0100, Friedemann Baitinger wrote:
> 
> I agree with you with respect to your statement that disabling I2C in
> the kernel config is just a quick-fix but not a solution. I am currently
> running a T21 with all the I2C options _enabled_ and there is no problem
> with the system. However, I have carefully avoided messing with the
> lm_sensors package on this machine. I could dig out my old writeups but
> as far as I remember, running the 'sensors-detect' is the last thing I
> did.
> 
> What makes it really dangerous is that you don't know that the machine
> is broken until you reboot (actually power off then power on) the
> machine. It could be days after using lm_sensors till you notice that
> the machine is dead. Probably hard to always correlate the problem with
> 'lm_sensors'.
> 
> I'd like to suggest that those people who experienced this problem try
> to do a detailed writeup of what happened and post the result here such
> that IBM can reproduce the problem in their labs.

Yes, that would be helpful. :')

> I can give you a few details I remember: What really happens is that
> some data in an I2C attached SEEPROM is wiped out or made inconsistent
> when lm_sensors and/or sensors-detect is run. This data is called VPD
> (Vital Product Data) and it consists of the serial numbers of all
> components of the machine. If this data is corrupt the machine purposely
> refuses to boot because somebody might have stolen the ThinkPad and is
> now tampering with the hardware in order to access the data on it.
> 
> In essence, the machine is not physically broken, however, it needs to
> be sent in to IBM for repair. If IBM had all the relevant VPD data on
> file then the SEEPROM could be reinitialized quickly. Should it turn out
> that this data is not available for whatever reason (like for my
> machine) then essentially all 'active' hardware like the printed circuit
> boards need to be replaced.

The SEEPROMs I've dealt with require a write transaction followed by a
read transaction to perform a read, i.e.:

I2C-command write + addr
I2C read + (data from seeprom)

This makes read-only I2C transactions (e.g. for probing) very safe.
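
As a rough sketch of why that holds, here is a toy Python model of such
a part. It assumes a 24Cxx-style device with a one-byte address pointer;
the class and function names are made up for illustration and are not
any real driver API. A write that carries only the address byte moves
the pointer without touching stored data, and the read that follows
returns bytes from that position.

```python
class ToySeeprom:
    """Toy model of a 256-byte serial EEPROM with a one-byte address pointer."""

    def __init__(self, contents):
        # Unprogrammed cells are modelled as 0xFF.
        self.mem = bytearray(contents.ljust(256, b"\xff"))
        self.pointer = 0

    def i2c_write(self, payload):
        # The first byte of a write sets the address pointer.
        # Any further bytes are stored starting at that address.
        if not payload:
            return
        self.pointer = payload[0]
        for value in payload[1:]:
            self.mem[self.pointer] = value
            self.pointer = (self.pointer + 1) % 256

    def i2c_read(self, count):
        # Reads return data from the current pointer and advance it.
        out = bytearray()
        for _ in range(count):
            out.append(self.mem[self.pointer])
            self.pointer = (self.pointer + 1) % 256
        return bytes(out)


def random_read(dev, addr, count):
    # "I2C-command write + addr", then "I2C read + (data from seeprom)".
    dev.i2c_write(bytes([addr]))
    return dev.i2c_read(count)


dev = ToySeeprom(b"SERIAL-0001")   # stand-in contents, not real VPD
before = bytes(dev.mem)
data = random_read(dev, 7, 4)
unchanged = bytes(dev.mem) == before
```

In this model the only state a read sequence changes is the address
pointer; the stored bytes stay as they were.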

If their seeproms work the same way, then it would indicate that a
chip driver is accidentally communicating with the seeprom as if it
were the device it was expecting to drive.  Knowing the address of
this seeprom would be very handy if this is the case.
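
To illustrate the suspected failure mode, here is a small Python sketch
under the same assumption (a 24Cxx-style part where the first byte of a
write is the address and any following bytes are data). The register
number and value are hypothetical, not taken from any real sensor chip.
A typical "set register to value" write aimed at the wrong device would
be interpreted by the seeprom as "store this byte at that address".

```python
def seeprom_handle_write(mem, payload):
    """Toy 24Cxx-style behaviour: first byte is the address, the rest is data."""
    addr = payload[0]
    for offset, value in enumerate(payload[1:]):
        mem[(addr + offset) % len(mem)] = value


# Hypothetical sensor-chip register write: "set config register 0x40 to 0x01".
SENSOR_CONFIG_REG = 0x40   # made-up register number
SENSOR_START = 0x01        # made-up value

pristine = bytearray(b"\xaa" * 256)   # stand-in for the stored product data
vpd = bytearray(pristine)

# Probe-style access: address byte only, nothing is stored.
seeprom_handle_write(vpd, bytes([SENSOR_CONFIG_REG]))
intact_after_probe = vpd == pristine

# Driver-style access: register + value, which the seeprom stores as data.
seeprom_handle_write(vpd, bytes([SENSOR_CONFIG_REG, SENSOR_START]))
corrupted_offsets = [i for i, b in enumerate(vpd) if b != 0xAA]
```

Here a single two-byte write changes exactly one stored byte, which
would be enough to make a checksummed or serial-number block
inconsistent.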

This sort of 'don't find out until reboot' issue first occurred with
the seeprom corruption on SDRAM DIMMs (which hold the SPD spec of
various timing and memory configuration details of the DIMM).  That's
why our eeprom module is read-only. :')  (Side-note: the specification
is that eeproms on DIMMs be physically configured to be read-only [by
wiring a read-only indication pin a particular way]. I am not aware of
any memory manufacturer which does this.  Probably because it adds an
extra step in manufacturing.) 

Phil

-- 
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
   phil@netroedge.com -- http://www.netroedge.com/~phil
 PGP F16: 01 D2 FD 01 B5 46 F4 F0  3A 8B 9D 7E 14 7F FB 7A

----- The Linux ThinkPad mailing list -----
The linux-thinkpad mailing list home page is at:
http://www.bm-soft.com/~bm/tp_mailing.html