We had an ongoing issue with a customer’s HP server whose internal fans continually ran at full RPM. We had to move the server to a new location because the noise was too much for the employees. The HP monitoring software would occasionally shut down the server because it sensed overheating, but there was never any real indication of an actual overheating problem. The problem typically occurred while backups were running, so we suspected the tape drive was causing a faulty temperature reading.
We went so far as to purchase a USB temperature logger, which I placed on the server to monitor the environment for a week. All readings came back normal. I opened a case with HP Support, and their recommendation was to update the firmware, the drivers, and everything else they could think of. Nothing they suggested made a difference.
I decided to take the server down and inspect the internal parts for airflow obstructions that might cause it to think it was overheating. While checking the second processor’s heat sink, I noticed it was clamped down but not seated exactly right. I removed the heat sink and found dust under it. That’s right... dust between the CPU and the silver paste. As you can tell from the picture below, the silver paste had never contacted the CPU, except on one corner. I grabbed some canned air, blew the dust off, and reseated the heat sink. Then I closed up the server and started it up. Since then the server has run super quiet, with no thermal issues to this day. However, HP did have to replace an internal fan that failed from running so long at high RPM.