On the Operation and Maintenance of Data Center Power Supply Systems

Publisher: 甜美瞬间 Latest update time: 2011-03-25

Some users assume that all UPSs are the same and simply chase the lowest price, and failures follow. For example, a highway command center bought on price to save money: the machine was installed on the first day and caught fire on the second.

1. Purpose of Operation and Maintenance

The reliability of the power supply system in a data center is crucial. No matter how sophisticated the IT equipment, how capable the system, or how reliable it is, none of it works once the power goes out. Keeping the power equipment running therefore cannot be neglected, and maintenance personnel carry a heavy responsibility.

2. Operation and maintenance tasks and unsolvable problems

To ensure reliable operation of the power supply system, many sites have put good measures in place. Even so, many loopholes remain. The inherent reliability of the equipment is fixed once it leaves the factory. Some units are deficient from the start: for example, some output isolation transformers are wound with aluminum enameled wire instead of copper enameled wire, and they are very likely to fail when run at full load... However, statistics show that less than 30% of failures are caused by quality problems in the equipment itself, while 70% are acquired, that is, man-made. They show up in the following ways:

1. Failures caused by improper selection

There are many reasons for improper selection, mainly manifested in:

(1) Unclear basic concepts, which make users easy for manufacturers to mislead. For example, in a UPS tender for a highway, the bid document required that the UPS keep supplying power, without discharging the battery, after one or two input phases are lost. The requirement came from manufacturers' advertising: some claim that their UPS can still deliver 50% of its rated capacity without discharging the battery after one input phase is lost, and 25% after two phases are lost, which extends battery life. Users find this attractive, but a little thought exposes the drawback: to benefit from it, the UPS must be rated at 4 times the load (1/0.25) to ride through the loss of two phases, or 2 times (1/0.5) for one phase; otherwise it cannot carry the existing load once a phase is lost. Moreover, what if the two broken lines are downstream of the input switch? Should the fault be repaired? When? Only after a complete power outage? How are questions like these to be answered? If users really buy such a UPS sized to the load capacity alone, it becomes a major hidden danger that operation and maintenance cannot resolve.
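
The sizing argument above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the derating fractions (50% with one phase lost, 25% with two) are the advertised figures quoted in the text, while the function names and the 100 kVA load are assumptions made up for the example.

```python
# Usable fraction of rated capacity without battery discharge, keyed by the
# number of input phases lost. The 0.50 and 0.25 values are the advertised
# figures quoted in the article; everything else here is hypothetical.
DERATING = {0: 1.00, 1: 0.50, 2: 0.25}


def required_rating_kva(load_kva: float, phases_lost: int) -> float:
    """Rated capacity needed to keep carrying load_kva with phases_lost phases down."""
    return load_kva / DERATING[phases_lost]


def can_carry(rating_kva: float, load_kva: float, phases_lost: int) -> bool:
    """True if a UPS rated at rating_kva still carries load_kva after the phase loss."""
    return rating_kva * DERATING[phases_lost] >= load_kva


if __name__ == "__main__":
    load = 100.0  # kVA, an illustrative load only
    for lost in (0, 1, 2):
        print(f"{lost} phase(s) lost: need {required_rating_kva(load, lost):.0f} kVA rated")
    # A UPS sized exactly to the load cannot ride through even one lost phase.
    print("100 kVA unit carries 100 kVA with one phase lost:", can_carry(100.0, load, 1))
```

Under these assumed figures, a unit sized to the load alone fails the check as soon as one phase is lost, which is the hidden danger the paragraph describes.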

(2) Reasons that are hard to explain. For example, some users have run a certain brand of machine since the last century. Back then, for objective reasons, they had no practical alternative and had to accept its low input power factor, low efficiency, large size, high power consumption and high price. Much better models have long since come to market. For example, a new high-frequency UPS saves 50,000 kWh of electricity per year for every 100 kW compared with the older line-frequency (industrial-frequency) UPS, so a machine room of several megawatts could save millions of kWh every year. Yet for some reason the energy-saving equipment was not chosen and the energy-hungry machine was specified in the tender again. To be on the safe side, the structural characteristics of that machine were even written into the bid document. This not only increased the investment in, and floor space for, air-conditioning equipment, but also buried hidden dangers for future operation. This is another problem that operation and maintenance cannot solve.
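
As a rough check on the savings figure, the calculation can be written out as below. The 50,000 kWh per 100 kW per year comes from the text; the 3 MW room size and the function name are assumptions chosen only to make "several megawatts" concrete.

```python
# Saving quoted in the article: 50,000 kWh per year for every 100 kW of
# UPS capacity when a high-frequency UPS replaces a line-frequency one.
SAVING_KWH_PER_100KW_YEAR = 50_000


def annual_saving_kwh(capacity_kw: float) -> float:
    """Estimated yearly saving for a machine room with the given UPS capacity."""
    return capacity_kw / 100 * SAVING_KWH_PER_100KW_YEAR


if __name__ == "__main__":
    room_kw = 3000  # hypothetical 3 MW machine room
    print(f"{room_kw} kW room: about {annual_saving_kwh(room_kw):,.0f} kWh saved per year")
```

This is a linear scaling of the quoted figure and ignores load factor and the knock-on saving in air conditioning, so it is best read as an order-of-magnitude estimate.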

(3) Chasing low prices. Some users assume that all UPSs are the same and simply chase the lowest price, and failures follow. For example, a highway command center bought the cheapest machine, installed it on the first day, and it caught fire on the second. A life insurance company bought a low-priced machine, and within half a year a UPS failure burned out the input circuits of almost all its IT equipment, paralyzing the system. In another case, a megawatt-class data center ran several UPSs in parallel; only a few months after installation, a power transistor in the inverter of one UPS broke down, and all the UPSs tripped...

2. Failures caused by an improper operating environment

The machines are not installed in the environment the manual requires. Some users even put the UPS in a corridor where people walk back and forth, or in a basement with dripping water. For example, several 200 kVA UPSs were placed in a single-story building whose roof was a single layer of prefabricated slabs, cooled only by two 5P (5-horsepower class) comfort air conditioners. In another case, a glass factory put its UPS in a workshop full of airborne powder. The result was frequent failures.

3. Failures caused by inadequate management rules

For example, some duty staff casually plugged electric stoves, rice cookers and vacuum cleaners into the UPS, overloading and tripping it; in other cases, food left by duty staff attracted rats, which got into the machine and caused a fire...

4. Failures caused by poor handover

This type of failure arises mainly when the people managing the equipment are not on the same shift or coordinate poorly. For example, in a railway station ticketing system, the staff on duty disconnected the UPS's external battery pack while moving the machine and did not tell the staff who came after them. As a result, when the mains failed, the UPS output was lost at the same moment...

5. Failures caused by relying on experience

Experience is indispensable and a rare asset. But experience is relative: what was learned on one UPS may not fully apply to another, and applying it blindly causes failures. At one telecommunications bureau, staff started up a machine of a different brand using the method they already knew, without reading the manual, and burned out the inverter.

6. Failures caused by oversight

Some components age or fail prematurely in service and will cause a failure if they are not inspected and caught in time. Automatic monitoring cannot detect these conditions. For example, fuses that have begun to bend with age, loose screws in battery assemblies, and hairline cracks in battery cases after long storage can all cause failures if they go unnoticed or are not dealt with promptly.

7. Failures caused by hasty implementation

Maintenance must not be rushed; it should be carried out only after careful thought. An engineer at one company set out to repair a UPS that was in operation. The rules require the UPS to be taken out of service through the maintenance bypass switch before repair, and the procedure is to transfer to the automatic bypass first and only then close the maintenance bypass switch. Perhaps the engineer had other urgent matters on his mind: on entering the machine room he closed the maintenance bypass switch without thinking, and the inverter power transistors exploded.
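
The ordering rule in this procedure (automatic bypass first, maintenance bypass second) is the kind of step that a checklist or interlock can enforce. The following is a toy model only, written to show the sequencing check; the class and method names are invented for illustration and do not describe any vendor's actual control logic.

```python
from enum import Enum


class Source(Enum):
    """Where the load is currently fed from in this simplified model."""
    INVERTER = "inverter"
    STATIC_BYPASS = "automatic bypass"


class UnsafeOperation(Exception):
    """Raised when a switching step is attempted out of order."""


class UpsModel:
    """Hypothetical UPS state used only to illustrate the bypass sequence."""

    def __init__(self) -> None:
        self.source = Source.INVERTER
        self.maintenance_bypass_closed = False

    def transfer_to_static_bypass(self) -> None:
        # Step 1: move the load from the inverter to the automatic bypass.
        self.source = Source.STATIC_BYPASS

    def close_maintenance_bypass(self) -> None:
        # Step 2 is refused unless step 1 has been done, so the inverter is
        # never tied directly to the mains through the maintenance switch.
        if self.source is not Source.STATIC_BYPASS:
            raise UnsafeOperation("transfer the load to the automatic bypass first")
        self.maintenance_bypass_closed = True


if __name__ == "__main__":
    ups = UpsModel()
    try:
        ups.close_maintenance_bypass()  # the hasty action described above
    except UnsafeOperation as err:
        print("blocked:", err)
    ups.transfer_to_static_bypass()
    ups.close_maintenance_bypass()
    print("maintenance bypass closed:", ups.maintenance_bypass_closed)
```

A written checklist that must be signed off step by step serves the same purpose where no interlock exists.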

8. Secondary failure caused by improper maintenance

Regular maintenance of a UPS is necessary, but it must follow a strict management procedure. Irresponsibility and failure to carry out scheduled or unscheduled maintenance as required are the main causes of machine failure. Failures can also be introduced during maintenance itself. For example, a multimeter probe used to measure a potential on a circuit board can short two points together and cause a failure. In another case, a user removed the battery from the UPS to discharge it and, when connecting it back afterwards, made an error that caused an explosive current surge.

In another case, an engineer replacing a centrifugal fan slipped and struck the control board with a spanner. He thought nothing of it at the time, but after the fan was replaced the machine would not start. Inspection showed that a component on the board had been broken...

9. Failures caused by static electricity

A machine room was shut down for routine maintenance, but the machine would not start afterwards. Inspection found that a component had suffered voltage breakdown. Going back over the maintenance, it turned out that dust had been brushed off the control board with a plastic toothbrush. Plastic rubbing against the surface of a dry device can generate a static voltage of several thousand volts. The machine's small-signal circuits use MOS devices, which have a low withstand voltage and are especially vulnerable to static electricity. Measurements showed that an ordinary plastic bag rubbed against a circuit board can generate 3000 V of static voltage. It is therefore best to wear a grounded wrist strap when working on these circuit boards.

10. Failures caused by overconfidence

Confidence is the foundation of success, but overconfidence can lead to mistakes. For example, an international bank's UPS was due for replacement after 8 years in service, and the manufacturer reminded the bank many times. Because the UPS had rarely given trouble in those 8 years, the person in charge repeatedly replied "no need to replace it". A few months later the UPS failed from aging and stopped supplying power for two hours, interrupting the bank's global business for two hours and causing heavy losses.

According to international statistics, a battery with a nominal service life of 5 years actually lasts no more than 3 years, and if it is not maintained regularly it should be replaced after 2 years. At one airport terminal the original battery backup time was 4 hours. After 3 years the batteries had still not been replaced. When the external power grid went down, the UPS backup time fell well short of the original 4 hours, and the resulting power outage caused losses...
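
The replacement rule of thumb above can be turned into a simple due-date check. This sketch assumes a battery with a nominal 5-year life and uses the 3-year and 2-year figures from the text; the function name and the example dates are hypothetical.

```python
from datetime import date

# Rule of thumb from the article for a battery with a nominal 5-year life:
# replace after at most 3 years, or after 2 years without regular maintenance.
YEARS_IF_MAINTAINED = 3
YEARS_IF_NOT_MAINTAINED = 2


def replacement_due(installed: date, regularly_maintained: bool) -> date:
    """Latest replacement date for a nominal 5-year battery under this rule."""
    years = YEARS_IF_MAINTAINED if regularly_maintained else YEARS_IF_NOT_MAINTAINED
    try:
        return installed.replace(year=installed.year + years)
    except ValueError:
        # Installed on 29 February and the target year is not a leap year.
        return installed.replace(year=installed.year + years, day=28)


if __name__ == "__main__":
    installed = date(2008, 3, 25)  # illustrative installation date
    print("maintained:    replace by", replacement_due(installed, True))
    print("not maintained: replace by", replacement_due(installed, False))
```

In practice the due date would be combined with periodic capacity tests, since the text notes that backup time can fall far below the nominal value before the date arrives.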

There are many similar man-made failure phenomena, so I will not list them all here.

In the final analysis, selecting the power supply system is the first hurdle. If it is not cleared properly, the seeds of hidden danger are sown at the outset. The connection scheme of the power supply system is the second hurdle: good equipment without a good connection scheme also buries hidden dangers. One TV station was misled by the manufacturer over its connection scheme. UPS failures in the power supply for more than a dozen programs kept recurring, although most passed without harm. This went on for several years, keeping the maintenance staff on edge and racking their brains. The connection scheme is an engineering matter outside the maintenance staff's control. With no other option, the station had the manufacturer's engineers stand duty during major events and holidays. But what use is that? At that point the manufacturer's engineers can offer users only psychological comfort. The alarms will still go off when they are going to; one can only pray that heaven spares us a power outage!
