[ltp] T60p idle GPU temperature?

Alex Deucher linux-thinkpad@linux-thinkpad.org
Thu, 12 Jul 2012 11:14:37 -0400


On Thu, Jul 12, 2012 at 11:11 AM, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Thu, 12 Jul 2012, Alex Deucher wrote:
>> On Thu, Jul 12, 2012 at 12:16 AM, Henrique de Moraes Holschuh
>> <hmh@hmh.eng.br> wrote:
>> > On Mon, 09 Jul 2012, Alex Deucher wrote:
>> >> On Sat, Jul 7, 2012 at 11:38 AM, Henrique de Moraes Holschuh
>> >> <hmh@hmh.eng.br> wrote:
>> >> > On Sat, 07 Jul 2012, Alex Deucher wrote:
>> >> >> > The Linux ATI framebuffer/DRM power management is crap, at least for the
>> >> >> > X300/X600/X1500/X1600 ATIs, so it really runs the GPU a lot hotter (and
>> >> >> > wastes a lot more power) than what non-KMS X.org used to do, and let's not
>> >> >> > even compare it to what fglrx could do...
>> >> >>
>> >> >> The radeon KMS and non-KMS pm support is mostly identical.  If you are
>> >> >> experiencing differences, it's probably due to the fact that KMS
>> >> >> utilizes the GPU more readily than UMS did.
>> >> >
>> >> > Dynamic clocks worked somewhat well with non-KMS.  This is not true for KMS
>> >> > IME, at least not for the R300 family.
>> >>
>> >> The UMS dynamic clocks option just used a lower clock speed when the
>> >> displays were off.  The KMS code does the same thing with both dynpm
>> >> and profiles.  KMS uses the GPU a lot more than UMS did (uses the GPU
>> >> to accelerate GPU buffer moves, dynamically uses GART, etc.).
>> >
>> > Hmm, that means something in X.org is keeping KMS busier than it should.
>>
>> Actually you have to switch to a profile other than default in order
>> to clock down when the displays are off.  Try auto.
>
> I've been using the "low" profile... most of the time I need only very basic
> performance, just for the composition manager.
>
>> >> > That said, you're probably aware that switching power profiles in the
>> >> > current KMS code (well, up to kernel 3.0 KMS code to be exact) causes some
>> >> > sort of PCIe transient error in several ThinkPads, that results in unhandled
>> >> > NMIs (reasons a0 and b0 on a T43 2687).  This has been reported, but no
>> >> > solution has been found so far.  I'm living with it, since it doesn't seem
>> >> > to cause any instability to the box, but it is annoying :-(
>> >>
>> >> It's changing the number of PCIe lanes that causes this.  You can
>> >> disable that code (rv370_set_pcie_lanes()).  The host bridge doesn't
>> >> seem to like having the lanes switched on it when it's running and
>> >> generates the NMI.
>> >
>> > Nah, it already draws too much power as it is, can't have extra PCIe lanes
>> > enabled.  Did you ask any of the Intel regulars for some help with the
>> > 915PM?  Or does it happen only on ThinkPads, and therefore the ThinkPad
>> > firmware is to blame?
>>
>> I don't know if it's specific to ThinkPads or not (or even certain
>> chipsets).  I've never had the time to debug it in detail.  IIRC, the
>> UMS radeon driver has identical PCIe lane code.  Did you have similar
>> issues with UMS and PCIe lane switches?
>
> No.  Either it never switched the number of lanes, or it somehow avoided
> whatever causes the Intel 915PM in my T43 to SERR.  If the code is exactly
> equal, and it DID change the PCIe link width just like modern KMS does, then
> the usermode->kernel mode transition was somehow helpful.

Might be the size of the step.  IIRC, the UMS code switched between 16
and 4 lanes, while the KMS code switches between 16 and however many
lanes are specified in the low power mode (usually 1).
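
For anyone who wants to check the step size on their own machine, here
is a rough Python sketch, not taken from the driver.  It assumes the
radeon KMS sysfs files power_method and power_profile live under
/sys/class/drm/card0/device (the card index may differ), and that the
negotiated link width is read from the "LnkSta:" line that `lspci -vv`
prints for the GPU.  The function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed sysfs location of the first radeon card; adjust for your system.
DEFAULT_CARD_DIR = Path("/sys/class/drm/card0/device")

# Profile names accepted by the radeon KMS "profile" power method.
VALID_PROFILES = {"default", "auto", "low", "mid", "high"}


def set_radeon_profile(profile, card_dir=DEFAULT_CARD_DIR):
    """Select the 'profile' power method, then the given profile (needs root)."""
    if profile not in VALID_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    card_dir = Path(card_dir)
    (card_dir / "power_method").write_text("profile\n")
    (card_dir / "power_profile").write_text(profile + "\n")


def link_width_from_lspci(text):
    """Return the lane count from an `lspci -vv` LnkSta line, or None if absent."""
    match = re.search(r"LnkSta:.*?Width x(\d+)", text)
    return int(match.group(1)) if match else None
```

Capture `lspci -vv` output for the GPU as root, call
set_radeon_profile("low"), capture it again, and compare the two
link_width_from_lspci() results.  Keep in mind that on the affected
machines the switch itself is what triggers the NMI.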

>
> The 915PM/GM errata don't list anything directly connected to this problem
> at first glance, but I understand too little of PCIe to really be sure about it.

I'm not really an expert myself.

Alex