DR Exercise: Rebuilding The Enterprise - Days 1 & 2
The lack of posts this week was the result of being in the woods for a week, isolated from all forms of media and cell phone communication and basically cut off from the outside world, but I’m back and have a lot to share. The disaster scenario we were instructed to deal with was pretty creative. Basically, the military missed the satellite they tried to shoot down a couple of weeks ago, and we had to deal with radiation poisoning, loss of basic infrastructure, toxic fumes, electromagnetic pulse, you name it. People died as a result of the satellite’s impact, running water and electricity ceased to exist, and our primary objective was to rebuild the organization’s IT infrastructure. This involved generating our own power, establishing data connectivity for Internet access and phone communication, providing air conditioning, the whole nine yards.
Sunday, 14:36: Arrived at Recovery Site
The first step was to establish power. To accomplish this, two massive diesel-powered generators were used. Bertha, as everyone referred to the main generator, was responsible for providing power to the server room, customer service center, network operations center, Internet cafe, application development area, and sleeping quarters. Bertha was online and running strong within hours of our arrival, so power wasn’t really an issue. Cables were run where power was needed, and the air conditioning unit for the server room was up and running in no time.
With power and ice-cold air conditioning in the server room, my team’s next objective was to start setting up the customer service center. We arranged two tables and ran a source of power to begin work on our first service request: scan all workstations for viruses prior to connecting them to the network. To accomplish this, we used a variety of virus definitions slipstreamed into a live CD. The longest scan took over two hours, and we worked through dinner to ensure all workstations were clean. Thankfully, no threats were detected and we finished shortly after 20:00. After building a primitive task list in an effort to maintain some sort of service request organization, I took a shower and was able to lie down for some well-deserved sleep around 23:00.
Monday, 05:30: Begin Day 2
After eating a much-needed hot breakfast, day two was quickly underway at 06:30. The network infrastructure, security, and disaster preparedness teams worked hard the previous day to get the satellites up and running. With a total of six satellites, each independently capable of 2 megabits down and 1 megabit up, they were able to aggregate the bandwidth to one central gateway for optimum efficiency. This provided speeds of up to 12 megabits down and 6 megabits up, which isn’t bad considering we were stuck somewhere in the middle of nowhere.
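For anyone who wants to check the math, here is a minimal sketch of that aggregation in Python. The `Link` class and `aggregate` function are names I made up for illustration, and the sum is a best-case ceiling that assumes the gateway can spread traffic evenly across all six links; it says nothing about how the network team actually configured the gateway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One satellite link's rated bandwidth in megabits per second (hypothetical type)."""
    down_mbps: float
    up_mbps: float


def aggregate(links):
    """Return the theoretical combined bandwidth of all links.

    This is a simple sum, i.e. an upper bound that assumes traffic is
    balanced perfectly across every link.
    """
    return Link(
        down_mbps=sum(link.down_mbps for link in links),
        up_mbps=sum(link.up_mbps for link in links),
    )


# Six links at 2 Mbps down and 1 Mbps up each, as described above.
links = [Link(down_mbps=2.0, up_mbps=1.0) for _ in range(6)]
total = aggregate(links)
print(f"{total.down_mbps:g} Mbps down, {total.up_mbps:g} Mbps up")
```

A single download over one connection may still be limited to one link’s 2 megabits, depending on how the gateway balances traffic, so the 12 and 6 figures are best read as the combined ceiling for the whole camp.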
Today’s primary objective was to establish internal phone communications and begin server restoration from backup tapes. The telephone operations portion of the network team had phones up and running to all major areas of camp by 10:00. Around 14:00, the server team arrived and began working the remainder of the afternoon and long into the night recovering mission-critical services. Based on my observations, restoring the parent and child domains for Active Directory, and getting the two to communicate properly, gave them the most trouble. We were without Active Directory for over 36 hours, so authenticating to damn near everything was impossible during that time. Meanwhile, email, web, and database recovery were all taking place slowly but surely.
While all that chaos was taking place, the customer service center was continuously bombarded with an assortment of miscellaneous requests: laptops that hadn’t been scanned the day before, random tests from the telco team (can you hear me now?), troubleshooting wireless connectivity issues, operating system restorations, software installations, printer installations, security watch, and dozens of other requests. Since the service center consisted of only two people for this exercise, the workload was pretty intense, especially since a lot of the work required one (sometimes both) of us to leave our area.
Day 2 Finally Comes to an End
There’s no way to effectively convey in writing the hell endured during the first two days of the exercise. Traditional methods of operation went out the door as we came up with innovative ways to deal with issues as they arose. Standard operating procedures were replaced with creative problem solving and thinking outside of the box. I found myself still working with the server team after midnight, trying to learn and provide aid in any way possible. After a short-lived hot shower, 05:30 was only a couple of hours away, and I eventually went to sleep thinking day three would be a little easier. Oh, how wrong I was. Stay tuned for day 3 and possibly day 4 tomorrow.



Thank you for your service to this great country!
The DR exercise looks like it went smoothly, and seemed to be well organized. What was the scenario for the rebuild process: a natural disaster with all equipment wiped away, or just a need to relocate? Did you completely rebuild the network infrastructure (Exchange, Blackberry, AD, etc.), or were bare-bones servers provided to which backups were simply restored? Also, was this a single-instance DR exercise relying solely on your DPC staff, or do you pay for a third-party company to provide assistance (such as SunGard)?
Was this exercise supposed to take over a month? How come there have been no recent posts? Can we start making up theories as to what happened on day 3?