[ltp] T60p idle GPU temperature?

Henrique de Moraes Holschuh linux-thinkpad@linux-thinkpad.org
Thu, 12 Jul 2012 12:11:11 -0300

On Thu, 12 Jul 2012, Alex Deucher wrote:
> On Thu, Jul 12, 2012 at 12:16 AM, Henrique de Moraes Holschuh
> <hmh@hmh.eng.br> wrote:
> > On Mon, 09 Jul 2012, Alex Deucher wrote:
> >> On Sat, Jul 7, 2012 at 11:38 AM, Henrique de Moraes Holschuh
> >> <hmh@hmh.eng.br> wrote:
> >> > On Sat, 07 Jul 2012, Alex Deucher wrote:
> >> >> > The Linux ATI framebuffer/DRM power management is crap, at least for the
> >> >> > X300/X600/X1500/X1600 ATIs, so it really runs the GPU a lot hotter (and
> >> >> > wastes a lot more power) than what non-KMS X.org used to do, and let's not
> >> >> > even compare it to what fglrx could do...
> >> >>
> >> >> The radeon KMS and non-KMS pm support is mostly identical.  If you are
> >> >> experiencing differences, it's probably due to the fact that KMS
> >> >> utilizes the GPU more readily than UMS did.
> >> >
> >> > Dynamic clocks worked somewhat well with non-KMS.  This is not true for KMS
> >> > IME, at least not for the R300 family.
> >>
> >> The UMS dynamic clocks option just used a lower clock speed when the
> >> displays were off.  The KMS code does the same thing with both dynpm
> >> and profiles.  KMS uses the GPU a lot more than UMS did (uses the GPU
> >> to accelerate GPU buffer moves, dynamically uses GART, etc.).
> >
> > Hmm, that means something in X.org is keeping KMS busier than it should.
>
> Actually you have to switch to a profile other than default in order
> to clock down when the displays are off.  Try auto.

I've been using the "low" profile... most of the time I need only very basic
performance, just for the composition manager.
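
For anyone following along: the profile switch discussed here goes through the
radeon KMS sysfs files power_method and power_profile.  A minimal Python sketch,
assuming the card shows up as /sys/class/drm/card0/device (the card index is an
assumption and may differ) and that the script runs as root.  The helper names
are made up for illustration.

```python
import os

# Profile names accepted by the radeon KMS power_profile file.
VALID_PROFILES = ("default", "auto", "low", "mid", "high")


def read_profile(device_dir):
    """Return the profile currently reported in <device_dir>/power_profile."""
    with open(os.path.join(device_dir, "power_profile")) as f:
        return f.read().strip()


def set_profile(device_dir, profile):
    """Select the profile-based pm method, then write the requested profile."""
    if profile not in VALID_PROFILES:
        raise ValueError("unknown radeon power profile: %r" % (profile,))
    # power_method is either "dynpm" or "profile"; power_profile is only
    # honoured when the method is "profile".
    with open(os.path.join(device_dir, "power_method"), "w") as f:
        f.write("profile\n")
    with open(os.path.join(device_dir, "power_profile"), "w") as f:
        f.write(profile + "\n")


# Hypothetical usage (needs root; card0 is an assumption):
#   set_profile("/sys/class/drm/card0/device", "low")
#   print(read_profile("/sys/class/drm/card0/device"))
```

Note that on the affected ThinkPads the write to power_profile is exactly the
step that can trigger the NMIs discussed further down.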

> >> > That said, you're probably aware that switching power profiles in the
> >> > current KMS code (well, up to kernel 3.0 KMS code to be exact) causes some
> >> > sort of PCIe transient error in several ThinkPads, that results in unhandled
> >> > NMIs (reasons a0 and b0 on a T43 2687).  This has been reported, but no
> >> > solution has been found so far.  I'm living with it, since it doesn't seem
> >> > to cause any instability to the box, but it is annoying :-(
> >>
> >> It's changing the number of PCIE lanes that causes this.  You can
> >> disable that code (rv370_set_pcie_lanes()).  The host bridge doesn't
> >> seem to like having the lanes switched on it when it's running and
> >> generates the NMI.
> >
> > Nah, it already draws too much power as it is, can't have extra PCIe lanes
> > enabled.  Did you ask any of the Intel regulars for some help with the
> > 915PM?  Or does it happen only on thinkpads, and therefore the thinkpad
> > firmware is to blame?
>
> I don't know if it's specific to thinkpads or not (or even certain
> chipsets).  I've never had the time to debug it in detail.  IIRC, the
> UMS radeon driver has identical pcie lane code.  Did you have similar
> issues with UMS and pcie lane switches?

No.  Either UMS never switched the number of lanes, or it somehow avoided
whatever causes the Intel 915PM in my T43 to SERR.  If the code is identical,
and UMS DID change the PCIe link width just like modern KMS does, then
doing it from user mode instead of kernel mode somehow helped.
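
One way to check whether the link width really changes would be to compare the
negotiated width before and after a profile switch.  A rough Python sketch that
pulls speed and width out of the LnkSta line of "lspci -vv" output; it assumes
the usual "LnkSta: Speed ...GT/s, Width xN" layout printed by pciutils, and the
function name is made up for illustration.

```python
import re

# Matches e.g. "LnkSta: Speed 2.5GT/s, Width x16" and the newer
# "LnkSta: Speed 2.5GT/s (ok), Width x16 (ok)" variant.
LNKSTA_RE = re.compile(
    r"LnkSta:\s*Speed\s+([\d.]+)GT/s(?:\s*\([^)]*\))?,\s*Width\s+x(\d+)"
)


def parse_link_status(lspci_text):
    """Return (speed in GT/s, lane count) from the first LnkSta line, or None."""
    m = LNKSTA_RE.search(lspci_text)
    if m is None:
        return None
    return float(m.group(1)), int(m.group(2))


# Hypothetical usage (needs root to see the PCIe capability; the bus
# address of the GPU is an assumption):
#   import subprocess
#   text = subprocess.check_output(["lspci", "-vv", "-s", "01:00.0"]).decode()
#   print(parse_link_status(text))
```

If the width reported before and after the switch is the same, the lanes were
never changed and the NMI would have to come from something else.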

The 915PM/GM errata don't list anything directly connected to this problem
at first glance, but I understand too little of PCIe to really be sure about it.

  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh