Maintaining the proper internal operating temperature is crucial to the health of the storage server. To prevent storage server shutdown and damage to components, address over-temperature and hardware-related issues as soon as they occur. If your storage server has a temperature-related fault, the cause of the problem might be:
Storage server component cooling relies on the movement of cool air through the storage server. The cool air is pulled into the storage server from its external environment. If the ambient temperature of the storage server's external environment is too high, cooling does not occur, and the internal temperature of the storage server and its components increases. This can cause poor storage server performance or a failure of one or more components.
Action: Check the ambient temperature of the storage server space against the environmental specifications for the storage server. If the temperature is not within the required operating range, remedy the situation immediately.
Prevention: Periodically check the ambient temperature of the storage server space to ensure that it is within the required range, especially if you have made any changes to the storage server space (for example, added storage servers). The temperature must be consistent and stable.
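The periodic check described above can be sketched as a small script. This is only an illustration under stated assumptions: the `AmbientSpec` type, the `check_ambient` function, and the `max_swing_c` stability threshold are hypothetical names introduced here, and the range limits must be taken from the environmental specifications for your storage server rather than from this example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AmbientSpec:
    """Limits supplied by the caller from the environmental specifications."""
    min_c: float        # lower bound of the required operating range
    max_c: float        # upper bound of the required operating range
    max_swing_c: float  # hypothetical stability threshold (assumption)


def check_ambient(readings_c: Sequence[float], spec: AmbientSpec) -> List[str]:
    """Return a list of problems found in a series of ambient readings."""
    if not readings_c:
        return ["no readings supplied"]

    problems: List[str] = []

    # Every reading must fall within the required operating range.
    out_of_range = [t for t in readings_c if not (spec.min_c <= t <= spec.max_c)]
    if out_of_range:
        problems.append(
            f"{len(out_of_range)} reading(s) outside {spec.min_c}-{spec.max_c} C"
        )

    # The temperature must also be consistent and stable over time.
    swing = max(readings_c) - min(readings_c)
    if swing > spec.max_swing_c:
        problems.append(
            f"temperature swing of {swing:.1f} C exceeds {spec.max_swing_c} C"
        )

    return problems
```

An empty result means that all readings were within the supplied range and stable. Any other result calls for the immediate remedy described in the Action step.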
The storage server cooling system uses fans to pull cool air in through the storage server front intake vents and exhaust warm air out through the storage server back panel vents. If the front or back vents are blocked, the airflow through the storage server is disrupted and the cooling system fails to function properly, causing the storage server internal temperature to rise.
Action: Inspect the storage server front and back panel vents for blockage from dust or debris. Additionally, inspect the storage server interior for improperly installed components or cables that can block the flow of air through the storage server.
Prevention: Periodically inspect and clean the storage server vents using an ESD-certified vacuum cleaner. Ensure that all components, such as cards, cables, fans, air baffles, and dividers, are properly installed. Never operate the storage server without the top cover installed.
To function properly, the storage server has cooling areas that are maintained by an air baffle, component filler panels, and the storage server top cover. These storage server components need to be in place for the storage server to function as a sealed system. If internal cooling areas are compromised, the storage server cooling system, which relies on the movement of cool air through the storage server, cannot function properly, and the airflow inside the storage server becomes chaotic and non-directional.
Action: Inspect the storage server interior to ensure that the air baffle is properly installed. Ensure that all external-facing slots (storage drive, DVD, PCIe) are occupied with either a component or a component filler panel. Ensure that the storage server top cover is in place and sits flat and snug on top of the storage server.
Prevention: When servicing the storage server, ensure that the air baffle is installed correctly and that the storage server has no unoccupied external-facing slots. Never operate the storage server without the top cover installed.
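The inspection above amounts to a short checklist that can be recorded after each service action. The following is a minimal sketch, assuming the results of a physical inspection are entered by hand. The `ChassisState` type, the `sealing_problems` function, and the slot names are hypothetical and do not correspond to any storage server tool or interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChassisState:
    """Hand-entered results of a physical inspection (hypothetical model)."""
    air_baffle_installed: bool
    top_cover_installed: bool
    # Maps each external-facing slot to "component", "filler", or "empty".
    slots: Dict[str, str] = field(default_factory=dict)


def sealing_problems(state: ChassisState) -> List[str]:
    """List every condition that compromises the internal cooling areas."""
    problems: List[str] = []

    if not state.air_baffle_installed:
        problems.append("air baffle is not properly installed")
    if not state.top_cover_installed:
        problems.append("top cover is not in place")

    # A slot counts as sealed only if it holds a component or a filler panel.
    for name, contents in sorted(state.slots.items()):
        if contents not in ("component", "filler"):
            problems.append(f"slot {name} is unoccupied")

    return problems
```

If the function returns an empty list, the recorded inspection found the storage server sealed. Otherwise, correct each listed item before operating the storage server.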
Components, such as power supplies and fan modules, are an integral part of the storage server cooling system. When one of these components fails, the storage server internal temperature can rise. This rise in temperature can cause other components to enter an over-temperature state. Additionally, some components, such as processors, might overheat when they are failing, which can also generate an over-temperature event.
To reduce the risk related to component failure, power supplies and fan modules are installed in pairs to provide redundancy. Redundancy ensures that if one component in the pair fails, the other functioning component can continue to maintain the subsystem. For example, power supplies serve a dual function; they provide both power and airflow. If one power supply fails, the other functioning power supply can maintain both the power and the cooling subsystems.
Action: Investigate the cause of the over-temperature event, and replace failed components immediately. For hardware troubleshooting information, see Troubleshooting Storage Server Hardware Faults.
Prevention: Component redundancy is provided to allow for component failure in critical subsystems, such as the cooling subsystem. However, once a component in a redundant system fails, the redundancy no longer exists, and the risk for storage server shutdown and component failures increases. Therefore, it is important to maintain redundant systems and replace failed components immediately.
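The loss of redundancy described above can be illustrated with a short sketch. It assumes that the health of each component in a pair is already known, for example from a visual check of the fault indicators. The function names and the labels "redundant", "degraded", and "failed" are hypothetical and are not terms reported by the storage server.

```python
from typing import Dict, List, Tuple


def redundancy_status(first_ok: bool, second_ok: bool) -> str:
    """Classify a redundant pair from the health of its two components."""
    healthy = int(first_ok) + int(second_ok)
    if healthy == 2:
        return "redundant"  # either component can fail without loss of the subsystem
    if healthy == 1:
        return "degraded"   # subsystem still runs, but redundancy no longer exists
    return "failed"         # no functioning component remains in the pair


def pairs_needing_service(pairs: Dict[str, Tuple[bool, bool]]) -> List[str]:
    """Return the names of pairs that have at least one failed component."""
    return [
        name
        for name, (first_ok, second_ok) in sorted(pairs.items())
        if redundancy_status(first_ok, second_ok) != "redundant"
    ]
```

A pair reported as degraded is still maintaining its subsystem, which is why the failure can go unnoticed. Treat it with the same urgency as a failed pair and replace the failed component immediately.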