Troubleshooting System Cooling Issues
Maintaining the proper internal operating temperature is crucial to the health of the server. To prevent server shutdown and damage to components, address overtemperature and hardware-related issues as soon as they occur. If your server has a temperature-related fault, use the information in the following table to troubleshoot the issue.
Cooling Issue | Description | Action | Prevention |
---|---|---|---|
External Ambient Temperature Too High | The server fans pull cool air into the server from its external environment. If the ambient temperature is too high, the internal temperature of the server and its components increases. This can cause poor performance and component failure. | Verify the ambient temperature of the server space against the environmental specifications for the server. If the temperature is not within the required operating range, remedy the situation immediately. | Periodically verify that the ambient temperature of the server space is within the required range, especially after any change to the server space (for example, adding servers). The temperature must be consistent and stable. |
Airflow Blockage | The server cooling system uses fans to pull cool air in through the front intake vents and exhaust warm air through the back panel vents. If the front or back vents are blocked, airflow through the server is disrupted and the cooling system cannot function properly, causing the server internal temperature to rise. | Inspect the server front and back panel vents for blockage from dust or debris. Inspect the server interior for improperly installed components or cables that can block the flow of air through the server. | Periodically inspect and clean the server vents using an ESD-certified vacuum cleaner. Ensure that all components, such as cards, cables, fans, air baffles, and dividers, are properly installed. Never operate the server without the top cover installed. |
Cooling Areas Compromised | The air baffle, component filler panels, and server top cover maintain and direct the flow of cool air through the server. These server components must be in place for the server to function as a sealed system. If these components are not installed correctly, the airflow inside the server can become chaotic and non-directional, which can cause server components to overheat and fail. | Inspect the server interior to ensure that the air baffle is properly installed. Ensure that all external-facing slots (storage drive, PCIe) are occupied with either a component or a component filler panel. Ensure that the server top cover is in place and sits flat and snug on top of the server. | When servicing the server, ensure that the air baffle is installed correctly and that the server has no unoccupied external-facing slots. Never operate the server without the top cover installed. |
Hardware Component Failure | | Investigate the cause of the overtemperature event, and replace failed components immediately. See Troubleshooting Server Hardware Faults. | Component redundancy is provided to allow for component failure in critical subsystems, such as the cooling subsystem. However, once a component in a redundant system fails, the redundancy no longer exists, and the risk of server shutdown and component failure increases. Therefore, it is important to maintain redundant systems and replace failed components immediately. |
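
The periodic ambient temperature check described in the table can be partly automated. The following is a minimal sketch, not a vendor tool: the function name `check_ambient` and the limits `ASSUMED_MIN_C`, `ASSUMED_MAX_C`, and `ASSUMED_MAX_SWING_C` are hypothetical placeholders chosen for illustration. Replace them with the values from the environmental specifications for your server. The sketch assumes you already collect ambient readings in degrees Celsius from a sensor or monitoring system.

```python
# Hypothetical limits for illustration only. Take the real operating range
# from the environmental specifications for your server.
ASSUMED_MIN_C = 5.0
ASSUMED_MAX_C = 35.0
# Assumed tolerance for "consistent and stable"; not a published value.
ASSUMED_MAX_SWING_C = 5.0


def check_ambient(readings_c, min_c=ASSUMED_MIN_C, max_c=ASSUMED_MAX_C,
                  max_swing_c=ASSUMED_MAX_SWING_C):
    """Return a list of problems found in a series of ambient readings (deg C)."""
    if not readings_c:
        raise ValueError("at least one reading is required")

    problems = []

    # Range check: every reading must fall inside the operating range.
    out_of_range = [t for t in readings_c if t < min_c or t > max_c]
    if out_of_range:
        problems.append(
            f"{len(out_of_range)} reading(s) outside {min_c}-{max_c} C"
        )

    # Stability check: the spread between the coldest and warmest reading
    # must not exceed the allowed swing.
    swing = max(readings_c) - min(readings_c)
    if swing > max_swing_c:
        problems.append(
            f"temperature swing of {swing:.1f} C exceeds {max_swing_c} C"
        )

    return problems
```

An empty list means the readings passed both checks. For example, `check_ambient([22.0, 23.5, 22.8])` returns an empty list under the assumed limits, while a series that includes a reading above `ASSUMED_MAX_C` returns at least one problem string to act on.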