
Disaster Recovery, Part 3 - If You Rebuild It, They Will Come

Novell Cool Solutions: Feature
By Timothy Leerhoff


Posted: 21 Oct 2004

The Disaster (part 3 of 4)


(Note: Parts 1 and 2 of this series covered the actual disaster and the hard and soft data recovery.)

Taking Care of Business

We had a lower level that was thigh deep in mucky water - what to do?

The first order of business was to cut the power to the lower level. Even with the power cut, I told the office manager to keep people out of the water: the battery-backed UPS was under water, and the ions in the water could complete a circuit from the UPS output through anyone wading through the rooms. That was an unacceptable possibility.

We were lucky that one of our tenants was a builder and had experience in this type of recovery.

After we had a professional shut down the power, the builder pulled in two fire hose pumps. We had a 4" and a 3" pump running from 9:00 a.m. to 9:00 p.m. to get the water down to a squish level. That's the point where the carpet makes a squish sound as you walk, and you don't need to wear boots to keep the muck off your shoes.

I was finally able to make an initial damage assessment. Things were not good at all. Besides the equipment and file damage, the walls were soaked with sewage-tainted water over the two-foot mark. All of the power outlets were soaked, the network and phone drops were in the same condition, and all of the furniture was dumpster fodder.

The crew from our environmental remediation company looked almost like an emergency response team in their moon suits. They shoveled up the hills of mud that surrounded each of the drains and carried them out by hand in 5-gallon buckets.

They peeled up the carpeting and stripped the lower sections of the wallboard (also known as sheetrock). We discovered the walls contained wood studs, not metal. While this might not sound like a big deal, just consider how deep the sewage seeped into the wood. Furthermore, wet wood requires far longer to dry than steel.

We wanted to err on the safe side with the wiring in the wall. That was the order of the day. All conduit sections were cut well above the water level and replaced. All wires that were even skimmed by the mucky flood waters were pulled out and discarded. The same fate awaited all submerged connectors, including all power outlets, network or phone connectors, etc.

Power cables, network cables, phone cables, and basically anything else touched by the flood were tossed in the trash. We ended up overflowing several 30-yard dumpsters, and because the contents had to be specially handled, the price of disposal went up. This was yet another gash to the company's wallet.

After the soaked furniture was removed, everything was taken, or should I say scraped, off of the floor. The lower half of each wall was stripped, and the sterilization process was started.

Next, we ran multiple dehumidifiers for over a week. We were finally starting to get dry. The cleaning crew started applying mold/germ killer and a sealer to the lower sections of the wall studs. All metal studs that were retained, as well as the building's furnace, were thoroughly steam-cleaned. Between the day of the flood and the day the furnace was fully cleaned, the staff made do with electric space heaters. I had to pull out the plugs on heaters plugged into the same surge suppressor as the PCs.

During the whole process the ventilation fans were running 24/7 to keep the air flowing. This helped dry out the lower level and also cleared the fumes from the water and from the chemicals being applied, so the people working on the upper levels were safe.

Rebuilding Process

The next step was to rebuild the damaged level. One of the biggest concerns I had was the dust from the finishing of the walls. We had to keep the dust out of the upper level. That dust could foul up PC fans and settle on various parts inside computers, dramatically lowering their cooling capability and reducing their life expectancy.

While the walls were sanded, we had plastic sealed to the walls in both stairwells. We also had to seal the furnace room without starving the furnace of air for the burners or ventilation to the upper level.

An entire wall and the various data line drops were moved to create a more functional design for the office space on the lower level. We ran the carpeting up one wall in the server room to reduce the noise from the plethora of fans in each of the file servers.

A Few Friendly Suggestions

Once we entered "rebuild" mode there was no lack of suggestions from people in the office and from the typical "friend of a friend." Of course I listened to all of them, since you never know when a diamond in the rough may be tossed your way.

The suggestions included:

  • Move the server room upstairs, away from flooding. I rejected this - our Minnesota location is not in a flood plain, and fires or tornadoes are more of a concern. These disasters would tend to strike an upper level more severely.
  • Hang the servers from the ceilings -- not a bad thing for an earthquake zone. But the last earthquake in this area was about a 2.1 on the Richter scale, 20+ years ago, and hundreds of miles away. No, this is not California. There is a better chance of a fire than a flood, and guess which direction smoke goes ... (up!)
  • Move the servers offsite. While this would remove the servers from a disaster at the main office location, it puts them at a site that could also have its own disaster.
  • Have an on-line data duplication site. Good idea, but cost was the obstacle there. You need another server at the remote site, a WAN link to it with enough bandwidth, and software to do the live duplication.
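
To get a rough feel for the bandwidth side of that last suggestion, here is a back-of-the-envelope sketch in Python. It assumes the WAN link is a T1 at its nominal 1.544 Mbit/s line rate; the 80% efficiency figure and the data sizes are illustrative guesses, not measurements from our site, and the function name is made up for this example.

```python
# Rough estimate of how long it takes to push data over a WAN link.
# Assumptions (not from the article): decimal gigabytes, a constant
# usable fraction of the line rate, and no compression.

T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def transfer_hours(data_gigabytes: float,
                   link_bps: float = T1_BITS_PER_SECOND,
                   efficiency: float = 0.8) -> float:
    """Return the hours needed to send data_gigabytes over the link."""
    if data_gigabytes < 0:
        raise ValueError("data_gigabytes must not be negative")
    if link_bps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("link_bps must be positive and efficiency in (0, 1]")
    total_bits = data_gigabytes * 1e9 * 8
    usable_bps = link_bps * efficiency
    return total_bits / usable_bps / 3600


if __name__ == "__main__":
    # Hypothetical sizes: a nightly change set and a full initial copy.
    for label, size_gb in (("nightly changes", 2), ("initial full copy", 50)):
        print(f"{label}: {size_gb} GB takes about "
              f"{transfer_hours(size_gb):.1f} hours")
```

Even at full line rate a T1 needs roughly 1.4 hours per gigabyte, so seeding 50 GB would tie up the link for about three days. That is why the link, and not just the second server, drives the cost of this option.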

What I Did ...

In the rebuild I moved some hardware around to be located more logically. The router and firewall were moved into the phone room where the T1 line comes in from the ISP and all of the data drops terminate. I ran 4 lines to the server room so we had some redundancy and availability for future growth.

The rack now had wheels for easy movement. It helps to be able to get behind the servers for cabling purposes rather than having at most 2 feet of room. This helped even more since the server room was now reduced in size. The smaller size was okay with me, as it would no longer be used as a storage room for all sorts of "stuff."

I set the rack up with the top shelf holding the server room switch and KVM. The first shelf down held a monitor, keyboard, phone, and notes for the office manager. I moved the servers to the second shelf down, a few inches above the high water mark. To keep the rack from becoming top-heavy, I mounted the UPSes on the lower shelf. These are fairly easy to replace at any local computer store. I also placed the network laser printer on the lower shelf. It's the high-volume printer used by the office manager and her staff. This works out well, as the doors into the server room are shared by the office manager and the filing room where her one staff lady sits. She is our "gal Friday" - she does just about every job you can think of in the office except sales.

Lessons Learned

Before the disaster, we had backup tapes and even took some of the Friday backups off-site. What more did we need? The biggest thing I can say is that we thought we were ready for a disaster. Boy, were we wrong!

There was a plan, but nothing was in writing. Nothing had been tested, as we did not have excess hardware available for this purpose. We didn't have emergency vendors lined up to help with hardware replacements (rental or purchase), etc. Hardware is more than just PCs - it includes any and all items needed to get the infrastructure to work. This includes servers, switches, routers, the firewall, desktop PCs, telephone systems, etc.

Just one word could sum up our original plan: MISTAKE!

Disaster Recovery Options

There are two factors to consider when deciding on what types of disaster recovery options you want:

  1. How much data can you stand to lose up until the disaster occurs - a week, a day, an hour, a minute, only one second, or even less?
  2. How long after a disaster do you have to be up and running, or how long after the disaster can your company remain down without major financial loss?

These two questions can and should be asked about each area of the business operations. Decisions need to be made about file servers, network infrastructure, workstations, paper files, staff space, telephone/faxes, etc.
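
One way to keep the answers organized is to record both tolerances for each area next to what the current setup actually delivers, and flag the gaps. The following is only a minimal sketch: the class name AreaPlan, its fields, and all of the sample numbers are hypothetical and are not taken from our actual plan.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class AreaPlan:
    """Answers to the two questions for one area, plus current capability."""
    area: str
    max_data_loss: timedelta      # question 1: how much data can we lose?
    max_downtime: timedelta       # question 2: how long can we stay down?
    backup_interval: timedelta    # how often this area is backed up today
    estimated_restore: timedelta  # how long a restore is expected to take

    def gaps(self) -> List[str]:
        """List the ways the current setup misses the stated tolerances."""
        problems = []
        if self.backup_interval > self.max_data_loss:
            problems.append(
                f"{self.area}: data loss up to {self.backup_interval} "
                f"exceeds the tolerated {self.max_data_loss}")
        if self.estimated_restore > self.max_downtime:
            problems.append(
                f"{self.area}: restore of {self.estimated_restore} "
                f"exceeds the tolerated downtime of {self.max_downtime}")
        return problems


if __name__ == "__main__":
    # Illustrative values only.
    plans = [
        AreaPlan("file servers", timedelta(hours=1), timedelta(hours=8),
                 timedelta(days=1), timedelta(hours=4)),
        AreaPlan("telephone/faxes", timedelta(days=1), timedelta(hours=4),
                 timedelta(days=1), timedelta(days=2)),
    ]
    for plan in plans:
        for problem in plan.gaps():
            print(problem)
```

Any area that shows up in the output needs either more frequent backups, a faster way to restore, or an honest decision that the larger loss is acceptable.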

