CHAPTER 2
Server reliability is dependent upon a stable environment. The design of the environmental control system for your data center must ensure that each server can operate reliably while remaining within the range of its operating specifications.
Accurate and comprehensive monitoring of environmental support equipment and in-room conditions is extremely important in a sensitive data center environment. The monitoring system should have historical trend capabilities. Analyzing historical trend information is instrumental when determining seasonal changes or other contributing influences. Also, the environmental control system should have critical alarm capabilities. The system must be able to notify the appropriate personnel when conditions move outside of the systems' established operating specifications.
Topics in this chapter include:
The site planning product specifications lists the environmental specifications for your server. These specifications might seem broad for data center equipment. However, the operating ranges apply to the absolute hardware limits and the extreme ranges should not be considered guidelines for normal, continuous operation. While the servers can operate in diverse locations and within a wide range of environmental conditions, stringent control over temperature, humidity, and airflow is necessary for optimal server performance and reliability.
An ambient temperature range of 21 to 23 °C (70 to 74 °F) is optimal for server reliability and operator comfort. While most computer equipment can operate within a rather broad range, a temperature level near 22 °C (72 °F) is desirable because it is easier to maintain a safe associated relative humidity level at this temperature. Further, this recommended temperature provides an operational buffer in case the environmental support systems are down.
Note that the operating temperature range for the servers is either 5 to 40 °C (41 to 104 °F) or 5 to 35 °C (41 to 95 °F). These temperatures apply to the air taken in by each server at the point where the air enters the server, and not necessarily the temperature of the air in the aisles. Ensure that the air intake temperature is within the operating range of the server. See Equipment Installation Environmental Tests.
Aisle temperatures can give you a first-level alert to conditions in the data center. In a hot-aisle/cold-aisle cabinet layout, verify that the temperatures within the cold aisles are also within the servers' operating temperature ranges. These measurements are necessary because temperatures in the data center are different depending on where in the room the measurements are taken. The heat load in the data center can vary as a result of the density of heat-producing equipment located within the room. Avoid placing temperature sensors in areas that are exposed to drafts or other uncontrolled airflow. See Creating a Hot-Aisle/Cold-Aisle Layout and Facility Environmental Tests.
Also measure the rate of temperature changes within a 60-minute period. Conditions should not be allowed to change by more than 5.5 °C (10 °F) or 10% relative humidity during a 60-minute period. If you detect fluctuations, measure conditions over a 24-hour period and compare results against historical data to analyze trends.
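The 60-minute limits above can be checked automatically against logged sensor readings. The following is a minimal sketch, assuming readings are stored as (timestamp, temperature in °C, relative humidity in %) tuples sorted by time. The function name and data layout are illustrative assumptions, not part of any monitoring product.

```python
from datetime import datetime, timedelta

# Limits taken from the text: 5.5 °C (10 °F) or 10% RH per 60 minutes.
MAX_TEMP_DELTA_C = 5.5
MAX_RH_DELTA_PCT = 10.0
WINDOW = timedelta(minutes=60)


def exceeds_hourly_change(readings):
    """Return True if temperature or RH varies too much in any 60-minute window.

    readings: list of (timestamp, temp_c, rh_pct) tuples sorted by timestamp
    (hypothetical layout chosen for this example).
    """
    start = 0
    for end in range(len(readings)):
        # Advance the window start until it is within 60 minutes of the end.
        while readings[end][0] - readings[start][0] > WINDOW:
            start += 1
        window = readings[start:end + 1]
        temps = [r[1] for r in window]
        rhs = [r[2] for r in window]
        if max(temps) - min(temps) > MAX_TEMP_DELTA_C:
            return True
        if max(rhs) - min(rhs) > MAX_RH_DELTA_PCT:
            return True
    return False
```

A result of True would be the cue to begin the 24-hour measurement and trend comparison described above.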
Also avoid cooling short cycles, which can occur if perforated tiles or grilled tiles are placed between the air conditioners and the nearest heat-producing equipment. If tiles are laid out in that way, cold air returns to the air conditioner without circulating through the equipment. The air conditioner might register that temperatures in the room are cooler than is actually the case. The air conditioner might cycle out of its cooling mode while temperatures in the room still call for cooler air.
Relative humidity (RH) is the amount of moisture in a given sample of air at a given temperature in relation to the maximum amount of moisture that a sample could contain at the same temperature. A volume of air at a given temperature can hold a certain amount of moisture. Because air is a gas, it expands as it is heated. As air gets warmer, its volume increases and the amount of moisture it can hold increases, thus causing its relative humidity to decrease.
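The inverse relationship between temperature and relative humidity can be illustrated numerically. The sketch below uses the Magnus approximation for saturation vapor pressure, which is a common engineering formula and not something this guide specifies. It also assumes that the moisture content of the air and the air pressure stay constant while the temperature changes.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Approximate saturation vapor pressure over water, in hPa.

    Uses the Magnus approximation with commonly cited coefficients
    (an assumption for this example, not taken from the guide).
    """
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def rh_after_temperature_change(rh_pct, temp_from_c, temp_to_c):
    """Estimate RH after air is warmed or cooled with no moisture added or removed."""
    # Actual vapor pressure is assumed unchanged by the temperature change.
    vapor_pressure = rh_pct / 100.0 * saturation_vapor_pressure_hpa(temp_from_c)
    return 100.0 * vapor_pressure / saturation_vapor_pressure_hpa(temp_to_c)
```

For example, `rh_after_temperature_change(50, 22, 27)` returns a value well below 50, which shows how a modest temperature rise lowers relative humidity.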
Ambient relative humidity levels between 45% and 50% are most suitable for safe server operations. This optimal range also provides the greatest operating time buffer in the event of an environmental control system failure.
Data center equipment is particularly sensitive to high humidity levels. When relative humidity levels are too high, water condensation can occur, which can lead to hardware corrosion problems.
Further, maintaining a relative humidity level between 45% and 50% helps avoid server damage or temporary malfunctions caused by intermittent interference from electrostatic discharge (ESD), which occurs when relative humidity is too low. Electrostatic discharge is easily generated and less easily dissipated in areas where the relative humidity is below 35%, and becomes critical when relative humidity drops below 30%.
Though the 20% to 80% RH operating specifications for the servers are wide, conditions should be maintained near the optimal relative humidity levels. Extremes within the 20% to 80% RH range can lead to unacceptable conditions. For instance, if very high temperatures are maintained with very high humidity levels, condensation can occur, which can cause corrosive equipment damage. If very low temperatures are maintained with very low humidity levels, even a slight rise in temperature can lead to unacceptably low relative humidity levels.
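The humidity thresholds described in this section can be collected into a simple classification for use in monitoring scripts. This is a minimal sketch: the function name and the category labels are illustrative assumptions, while the numeric thresholds come from the text above.

```python
def classify_relative_humidity(rh_pct):
    """Map a relative humidity reading (%) to a category based on the text.

    Category labels are hypothetical; thresholds are from this section:
    20-80% operating range, 45-50% optimal, below 35% ESD is harder to
    dissipate, below 30% ESD becomes critical.
    """
    if rh_pct < 20 or rh_pct > 80:
        return "outside operating range"
    if rh_pct < 30:
        return "critical ESD risk"
    if rh_pct < 35:
        return "elevated ESD risk"
    if 45 <= rh_pct <= 50:
        return "optimal"
    return "within operating range, not optimal"
```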
It is also imperative that sensors on humidifiers are calibrated correctly. If one unit is calibrated to add humidity, and an adjacent unit is calibrated to remove humidity, energy is wasted and an unacceptable environment can occur.
The temperature and humidity in the data center have a direct relationship to the proper functioning of the installed servers. Data center managers need to be proactive by continually monitoring data center conditions. Regularly scheduled temperature and humidity measurements are one way that data center managers can troubleshoot environmental conditions.
In the ASHRAE report, "Thermal Guidelines for Data Processing Environments" (you can find this report at http://www.ashrae.com), three types of data center temperature and humidity tests are suggested:
These tests are described in the following sections.
Facility environmental tests are designed to measure ambient temperature and humidity throughout the data center in order to avoid environmental-related equipment problems. These measurements provide an overall assessment of the facility and ensure that the temperature and humidity of air in the cold aisles are within the servers' recommended operating ranges.
Knowing the temperature and humidity of the facility also gives you a general assessment of how the HVAC systems are functioning and how much cooling capacity is available to expand the facility.
To measure the ambient temperature and humidity of the data center, follow these guidelines:
These measurements will provide you with a detailed and representative profile of the temperature and humidity of air in the cold aisles. By continually monitoring temperature and humidity within the aisles, you can guard against changes that could affect the servers' optimal environmental ranges. If any measurements are outside of the servers' optimal operating ranges, data center managers must identify the source of, and correct, the problem.
It is also important to measure the temperature and humidity of the return air in front of the HVAC systems. If the return air is below the ambient temperature of the cold aisles, it might mean that cold-aisle air is short cycling, that is, returning to the HVAC units before passing through the servers.
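The return-air comparison above can be expressed as a simple check. The sketch below assumes that the cold-aisle ambient temperature is represented by the average of several cold-aisle readings. The `margin_c` parameter is a hypothetical tolerance added for this example and is not a value from this guide.

```python
def possible_short_cycling(return_air_temp_c, cold_aisle_temps_c, margin_c=0.0):
    """Return True if return air is cooler than the cold-aisle average.

    return_air_temp_c: temperature measured in front of the HVAC unit.
    cold_aisle_temps_c: non-empty list of cold-aisle readings.
    margin_c: hypothetical tolerance to ignore small sensor differences.
    """
    cold_aisle_avg = sum(cold_aisle_temps_c) / len(cold_aisle_temps_c)
    return return_air_temp_c < cold_aisle_avg - margin_c
```

A True result only suggests that short cycling might be occurring. Tile placement and airflow should still be inspected to confirm it.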
Equipment installation environmental tests are used to ensure that servers are properly installed and laid out within the facility. These tests measure the temperature and humidity of the air immediately in front of the servers or cabinets. Unacceptable environmental conditions can occur if racks of servers have mixed airflow patterns, if cabinets are not properly vented, or if high-density servers are laid out too closely, causing hot spots.
To measure the temperature and humidity in front of the installed servers, follow these guidelines:
For example, if there are ten servers in the rack, measure the temperature and humidity at the mid-point of the servers at 2 inches (5 cm) from the front of the first, fifth, and tenth server, bottom to top, in the rack.
All temperature and humidity measurements should be within the servers' recommended operating ranges. If the environmental levels are outside of these ranges, data center managers should reevaluate airflow patterns and equipment layout, and determine whether the required cold air is available to the servers.
Equipment failure environmental tests can help you determine whether a server failure was due to environmental conditions. These tests are similar to the equipment installation environmental tests, except that temperature and humidity measurements are isolated to the failed server. These tests can help you determine whether the air intake to the server is within the server's recommended temperature and humidity ranges.
To measure the temperature and humidity of air in front of a failed server, follow these guidelines:
All temperature and humidity measurements should be within the recommended operating range of the server. If all measurements are within this range, environmental conditions are probably not the cause of the server failure.
Data centers have different cooling and airflow capacities, often depending on when the data center was built and the requirements it was designed to meet. When designing a data center, you should consider the facility's heating, ventilation, and air conditioning (HVAC) capacity so that equipment in fully populated cabinets and racks can be adequately cooled. The air conditioners need to be set accurately with a sensitivity of +/- 1 °C (+/- 2 °F).
Typically, a cabinet footprint requires 12 square feet (1.115 sq. m). However, cooling measurements are calculated using the gross square footage required by the cabinets or racks, which is not just the area where cabinets or racks are located. The measurement includes aisles and areas where power distribution, ventilation, and other facility equipment are located. Gross square footage is estimated to be 20 square feet (1.858 sq. m) per cabinet or rack.
For example, a data center might provide 100 watts per square foot of cooling capacity using air conditioners. Based on 100 watts per square foot and 20 square feet (1.858 sq. m) per cabinet, each cabinet is allowed a cooling capacity of 2000 watts (100 watts x 20 sq. ft.) or 2 kW. Remember, 2 kW per cabinet gross square footage in a data center is only an example. Some cabinets might require 3 kW or more of cooling capacity. Some dense computing equipment, such as blade servers in a rack, can require 10 kW or more of cooling per rack. See Heat Output and Cooling for more information about cooling requirements.
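The per-cabinet cooling arithmetic above can be written as a small calculation. This sketch uses the 20 square feet gross area per cabinet from the text as a default. The function names are illustrative, and the inverse calculation is an added convenience that simply rearranges the same formula.

```python
# Gross area per cabinet or rack estimated in the text (square feet).
GROSS_SQ_FT_PER_CABINET = 20


def cooling_per_cabinet_watts(watts_per_sq_ft, gross_sq_ft=GROSS_SQ_FT_PER_CABINET):
    """Cooling capacity available to one cabinet, in watts."""
    return watts_per_sq_ft * gross_sq_ft


def required_watts_per_sq_ft(cabinet_load_watts, gross_sq_ft=GROSS_SQ_FT_PER_CABINET):
    """Cooling density the room would need to support a given cabinet load."""
    return cabinet_load_watts / gross_sq_ft


# The example from the text: 100 W per square foot over 20 square feet.
print(cooling_per_cabinet_watts(100))  # 2000 watts, or 2 kW
```

Under the same assumptions, a 10 kW blade rack would call for `required_watts_per_sq_ft(10000)`, or 500 watts per square foot of gross area, which shows why dense equipment often exceeds a room's design capacity.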
It is also important to consider the intake and discharge airflow required to cool the servers. Almost all Sun servers draw in ambient air for cooling from the front and discharge heated exhaust air to the rear. Ensure that the air conditioning equipment can adequately move air down the aisles so that heated air does not flow over the cabinets and racks to the front of the servers.
Measure airflow speed in different zones of the floor to determine whether the existing airflow pressure is sufficient to provide the necessary conditioned air to the servers. Take measurements every 13 to 16 feet (4 to 5 m). Measurements taken at lesser distances might not detect a significant pressure difference. A typical airflow speed ranges from 10 to 13 feet (3 to 4 m) per second.
Adequate airflow speed will facilitate the required delivery of conditioned air down the cold aisles and to the servers. If airflow pressure is inadequate, the conditioned air will heat up before it reaches the areas in need of cooling. While an office environment might require only two air changes per hour, the high-density heat load in a data center can require as many as 30 air changes per hour.
See Cabinet Location for information about how to locate cabinets and racks in the data center to ensure proper aisle airflow.
When determining how long you must allow a server to acclimatize after delivery to the data center, and before power can be applied to the server without causing damage, you should compare the temperature and humidity of the environment in which the server had been stored to the conditions in the data center. Equipment damage can occur if the rate of temperature or humidity change is too great.
The maximum positive or negative temperature gradient that is recommended for multilayered boards is approximately 2 °C (4 °F) per hour. The same consideration applies to humidity; it is best to have a slow rate of change.
If it is necessary to compensate for significant temperature or humidity differences between the servers and the data center, place the servers, in their shipping containers, in a location where the temperature and humidity are similar to those of the data center. Wait at least 24 hours before removing the servers from their shipping containers to prevent thermal shock and condensation.
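The acclimatization guidance can be turned into a rough planning estimate. The sketch below combines the two figures given above, the gradient of approximately 2 °C per hour and the 24-hour minimum wait, by taking whichever yields the longer time. Combining them this way is an assumption made for illustration and is not a rule stated in this guide. Humidity is not modeled.

```python
# Figures from the text.
MAX_GRADIENT_C_PER_HOUR = 2.0   # recommended limit for multilayered boards
MIN_WAIT_HOURS = 24.0           # minimum wait before unpacking


def minimum_acclimatization_hours(storage_temp_c, data_center_temp_c):
    """Estimate a wait time in hours before unpacking a server.

    Takes the longer of the 24-hour minimum and the time needed to cover
    the temperature difference at 2 °C per hour (an illustrative
    combination, not a vendor rule).
    """
    temp_difference = abs(data_center_temp_c - storage_temp_c)
    gradient_hours = temp_difference / MAX_GRADIENT_C_PER_HOUR
    return max(MIN_WAIT_HOURS, gradient_hours)
```

For a server stored at 0 °C and a 22 °C data center, the gradient alone would suggest 11 hours, so the 24-hour minimum governs.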
Make sure that your installation adequately protects equipment from excessive vibration and shock. When installing servers of different types in the same cabinet or rack, be sure that the overall vibration and shock characteristics do not exceed those of the server with the lowest vibration and shock specifications.
For example, if you are installing two different types of servers in the same cabinet, and one server can tolerate 4 g peak shock, and the other server can tolerate 10 g peak shock, make sure that shock to your cabinet does not exceed 4 g peak. The site planning product specifications provides vibration and shock specifications for your server.
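The rule for mixed cabinets reduces to taking the lowest tolerance among the installed servers. The sketch below assumes the peak shock specifications are held in a dictionary keyed by server name. The names and layout are hypothetical.

```python
def cabinet_shock_limit_g(server_shock_specs_g):
    """Return the peak shock limit (in g) for a cabinet of mixed servers.

    server_shock_specs_g: dict mapping a server name to its peak shock
    tolerance in g (hypothetical layout for this example).
    """
    # The cabinet is limited by the least tolerant server.
    return min(server_shock_specs_g.values())


# The example from the text: one server rated 4 g, another rated 10 g.
print(cabinet_shock_limit_g({"server_a": 4.0, "server_b": 10.0}))  # 4.0
```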
The impact of contaminants on sensitive electronic equipment is well known, but the most harmful contaminants are often overlooked because they are so small. Most particles smaller than 10 microns are not visible to the naked eye. Yet it is these particles that are most likely to migrate to areas where they can do damage.
Some sources of contaminants include the following:
A fire in the data center can cause catastrophic damage to the equipment and the building structure. Take the following precautions to minimize the risk of a fire:
The cabinet or rack must meet Underwriters Laboratories, Inc. and TUV Rheinland of N.A. requirements for fire containment. See the server documentation for specific requirements.