[ltp] Re: [PATCH] In-kernel automatic fan control

Yury Polyanskiy linux-thinkpad@linux-thinkpad.org
Thu, 01 Dec 2005 00:26:51 -0500


Ok, I agree with most of the points you made. It should indeed be
more complicated than what was in the original patch.

Currently my proposal is:
 1. Use separate thresholds for the battery (hardcoded at 40, 45).
 2. Print a prominent warning on T43s about the poorly understood
sensors.
 3. Still include the sensors at 0xC0-0xC2 in the maximum temperature
computation.
 4. Use some algorithm for two-step fan speed control (I can barely
hear the noise difference between level 3 and level 6, so I don't see
the point in using level 3).
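
A rough shell sketch of points 1, 3 and 4, just to make the idea
concrete. It is not the code from the patch. It assumes that "40, 45"
means "step down below 40, step up at 45" for the battery; the 50/60
pair for the other sensors and the names max_of and next_step are
hypothetical. It only computes the desired step and touches no hardware.

```shell
#!/bin/sh
# Two-step fan decision with hysteresis (sketch, not the patch's code).
# Assumption: "40, 45" = battery step-down / step-up thresholds.
# Hypothetical: 50 / 60 for the maximum over all other sensors.
BAT_LOW=40;   BAT_HIGH=45
OTHER_LOW=50; OTHER_HIGH=60

# usage: max_of T1 [T2 ...]
# Prints the largest of the given integer temperatures.
max_of() {
    m=$1; shift
    for t in "$@"; do
        if [ "$t" -gt "$m" ]; then m=$t; fi
    done
    echo "$m"
}

# usage: next_step CURRENT_STEP MAX_OTHER_TEMP BATTERY_TEMP
# CURRENT_STEP is "low" or "high"; prints the step to use next.
next_step() {
    cur=$1; other=$2; bat=$3
    if [ "$other" -ge "$OTHER_HIGH" ] || [ "$bat" -ge "$BAT_HIGH" ]; then
        echo high            # any upper threshold reached: step up
    elif [ "$other" -lt "$OTHER_LOW" ] && [ "$bat" -lt "$BAT_LOW" ]; then
        echo low             # everything below the lower thresholds
    else
        echo "$cur"          # in between: keep the current step
    fi
}
```

The readings from 0x78-0x7F and 0xC0-0xC2 (minus the battery) would go
through max_of, and the result is fed to next_step together with the
battery temperature. The gap between the two thresholds keeps the fan
from flapping between steps.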

Why do I insist on not using separate thresholds for each sensor? 

First, because it would require too much work to maintain a list of
models, sensor maps, reasonable thresholds, etc.

Second, and more important, I don't believe any chip will really be
seriously damaged by temperatures around 60 C. But then why did all
those super-smart IBM/Lenovo engineers set the thresholds so much lower?

Here is what I think the explanation is.

In this thread, people discussed the fan noise issue with Andy Fisher
(an HP engineer):
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=853249

One important response from Andy to users concerned about overly
conservative thresholds: "The real concern is skin temperature. We
don't want the keyboard feeling hot."

I think the same is true for IBM. Some suggest that the 0xC1 sensor is
located near the touchpad, and the EC is quite concerned about keeping
this value low. That correlates pretty well with Andy's comment,
doesn't it?


Yury.


On Wed, 2005-11-30 at 21:25 +0200, Shem Multinymous wrote:
> Hi,
> 
> A few comments about Yury's patch and the ensuing discussion:
> 
> Looking only at the maximum temperature is dangerous. Each sensor
> monitors a different component, and each component has a different
> spec and different cooling system. For example, the CPU is rated to 97
> degrees Celsius, the 2200BG mini-PCI card is rated at 80 degrees and
> the battery will die out quite quickly at 50 degrees. The embedded
> controller uses different thresholds for each sensor, for a good
> reason.

> Yury's patch only monitors the 8 temperature sensors recognized by
> ibm-acpi (embedded controller offsets 0x78-0x7F). There are at least 3
> more sensors, at EC offsets 0xC0 to 0xC3 (and 0xC4-0xC7 look like
> they're reserved for more sensors). It has been observed by several
> people that the BIOS often steps up the fan speed because of these
> extra sensors. You mustn't ignore them, and you mustn't apply the
> too-generous CPU thresholds to them either. See this discussion:
> 
> http://www.thinkwiki.org/wiki/Talk:Problem_with_fan_noise#Secret_sensor_and_the_cause_of_fan_always_on 
> 
> As Paul said, it's much better to keep a stable low fan speed than to
> have bursts of high speed - due to fan wear, thermal expansion and
> (subjectively) annoyance effect.
> 
> Several people here referred to the 2nd and 3rd sensors (0x79 and
> 0x7A) as "MiniPCI" and "HDD", respectively. This is NOT generally true.
> On the T43, sensor 0x79 seems to be in the HDAPS accelerometer chip
> (which sits between the Mini-PCI and the well-ventilated PCMCIA slots,
> so it doesn't reflect Mini-PCI temperature well), and sensor 0x7A
> gives values that are completely uncorrelated with the HDD temperature
> reported by the HDD's SMART interface. For the best available guess
> about the location of the sensors and the appropriate temperatures,
> see the top of the script mentioned earlier:
> 
> http://www.thinkwiki.org/wiki/ACPI_fan_control_script#Variable_speed_control_scripts 
> 
> Anyway, fan control can be done in kernel despite this complication -
> the core algorithm used by the above script, stripped of comments and
> support code, is just a dozen lines of code. Performance-wise, a
> kernelspace governor is unnecessary (it wakes up only every few
> seconds anyway, just like the embedded controller). Yes, it could
> prevent things like ungraceful death by OOM (graceful death is fine -
> the above script restores the default fan behavior when it's killed).
> But if you care only about ungraceful death then it suffices to have a
> tiny kernel patch that kicks the fan to default speed if it didn't get
> a keep-alive signal (say, a write to /etc/acpi/ibm/fan) for N seconds.
> 
> For userspace control of fan speed, there's this patch:
>   http://www.thinkwiki.org/wiki/Patch_for_controlling_fan_speed
> At the bottom of that page you can find the relevant hardware specs
> (including a surprise). But if you don't care about a bit of ugliness,
> you don't even need this patch - just force a certain speed this way:
> 
> echo 0x2F 0x00 > /proc/acpi/ibm/ecdump   # fan off
> echo 0x2F 0x01 > /proc/acpi/ibm/ecdump   # fan at low speed
> echo 0x2F 0x04 > /proc/acpi/ibm/ecdump   # fan at medium speed
> echo 0x2F 0x07 > /proc/acpi/ibm/ecdump   # fan at high speed
> echo 0x2F 0x80 > /proc/acpi/ibm/ecdump   # back to automatic speed
> 
> Have fun and don't fry your laptop! If it works for you, please report
> your ThinkPad model on the Wiki.
> 
>   Shem
> 
> P.S. there's also a ThinkPad fan control tool for Windows based on the
> above specs:
>   http://forum.thinkpads.com/viewtopic.php?t=17715
>   http://forum.thinkpads.com/viewtopic.php?t=17733
>
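
For reference, the ecdump writes quoted above can be wrapped in a few
shell functions, together with the "graceful death" behavior Shem
describes (restore automatic control when the script is killed). This
is only a sketch: offset 0x2F and the five values are taken from the
quoted commands, while the overridable ECDUMP variable and the names
fan_value, set_fan and install_restore_trap are hypothetical.

```shell
#!/bin/sh
# Sketch only. EC offset 0x2F and the values come from the quoted
# commands; ECDUMP as an overridable variable and all function names
# are hypothetical.
ECDUMP=${ECDUMP:-/proc/acpi/ibm/ecdump}

# Map a symbolic speed to the value written to EC offset 0x2F.
fan_value() {
    case $1 in
        off)    echo 0x00 ;;
        low)    echo 0x01 ;;
        medium) echo 0x04 ;;
        high)   echo 0x07 ;;
        auto)   echo 0x80 ;;
        *)      return 1 ;;
    esac
}

# Write the requested speed; refuse unknown names without writing.
set_fan() {
    value=$(fan_value "$1") || return 1
    echo "0x2F $value" > "$ECDUMP"
}

# Hand control back to the embedded controller when the script exits,
# including on Ctrl-C or a plain kill (but not kill -9).
install_restore_trap() {
    trap 'set_fan auto' EXIT
    trap 'exit 1' INT TERM
}
```

A control loop would call install_restore_trap once and then set_fan
with whatever speed it decides on. This covers graceful death only; an
ungraceful one (OOM, kill -9) still needs something like the kernel-side
keep-alive timeout mentioned in the quoted message.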