Webfusion Donhost Network Leeds Data Centre Failure


Today around 11:30, Webfusion, Donhost and other providers under the same umbrella experienced a major service outage that took down all servers in the Leeds Data Centre. Thousands of websites and web services were unavailable for nearly 3 hours, including those of our websites hosted on Leeds Data Centre servers.

[Update 11/01/2011 23:10] Some websites, as reported by other Webfusion / Donhost clients, took much longer to come back online. To clarify: it took over 3 hours for the first servers to be seen back online, and many of them were still not fully functional.

Please refer to the comments below for the latest reports from affected users.

Can you guess the reason? I couldn’t. I would have expected some natural disaster or something… but not this… :)

According to Webfusion, contractors testing the fire systems in the super-secure data centre triggered the fire system by mistake. This started all the emergency procedures, shutting down the servers one by one in emergency mode to prevent data loss.

I hope they did not actually discharge the FM200 gas and the sprinklers. Well… It seems that super-secure data centres, built like bunkers and resistant to any external threats, are very vulnerable to internal threats and human beings.

“We have confidence in the system,” Galvin says. “We’ve had no outages of any significant size — and we could not say that before, not with our individual servers or with our earlier cluster. So we’re seeing a two-fold improvement — in reliability and in focus. We don’t have to worry about what the platform can do.” – source  Case Study: Webfusion places trust in reliable storage platform to deliver bullet-proof innovations in web hosting.

Confidence in the system didn’t help at all. Someone left the human factor out of their calculations. The system might be “bullet-proof” according to their own case study, but it is certainly not stupid-proof. I suppose today was someone’s last day at work. …It happens.


Original system status update from Webfusion:


System Outage

Created: 11 January 2011, 13:46

Last Updated: –

We would like to apologise for todays’ system outage, and to explain why this has occurred.

An external third party was carrying out routine maintenance in our data centre, and testing our systems for fire prevention. Unfortunately, due to human error, our fire prevention systems were in fact triggered.

As a result of this, and acting as the system should in the event of a real fire, all of our servers were sent in to a safe mode whereby they went offline.

Safety is our biggest concern, hence the system is configured to react in this way to avoid a major incident and permanent data loss.

We deeply regret any problems this may have caused you, and assure you we are doing our utmost to return to normal service levels as quickly as we possibly can.


If you have any updates relevant to this article, please share them here and I will update this post accordingly.

16 Responses to “Webfusion Donhost Network Leeds Data Centre Failure”

  • Steve:

    I know of at least 3 sites still down as a result of this. Saying they are over it in 3 hours is a flat out lie.

  • Tomasz:

    @Steve Same here. It took much longer to get DNS running properly on all websites.

  • Andy:

    I have a dedicated server still down at 1.20am Wednesday. That’s 14 hours since it went down.
    Apparently there are more than 6,000 CPUs involved.

  • Tomasz:

    @Andy Are you on Donhost or Webfusion?

  • Tomasz:

    @Andy Are you still experiencing problems? I’m trying to determine roughly how long it takes them to deal with all the consequences for end users. I also have confirmed reports from other friendly companies that basically reflect what you’ve said. Thanks

  • Chris:

    Same for us: 5 of 6 servers returned within 15 mins. The sixth has been AWOL since yesterday (11/Jan) lunchtime. Support are overwhelmed by the looks of things. I spoke to two engineers last night and they said they would look into our server asap.

  • Tomasz:

    @Chris We’ve got 3 servers with WF, and the last one came fully online the following night; however, everything seems to be very unstable and performing poorly. So in this case I can say that our servers are the lucky ones, unlike many others.

    Well… From my experience you need a lot of luck when you’re with Webfusion or Donhost.

    BTW Chris, did they manage to fix it for you?

  • Andy:

    Here we are at 10pm Wednesday night and my dedicated Webfusion server is still down – since 11am Monday. Several phone calls and online submissions followed by promised actions that have not materialised. The picture of the resolved situation suggested by the wording on the status page is certainly not true in my experience.

  • Andy, can you email me your server number so I can look into this for you?

    As a general note, when you pull the power from a server, it may or may not come back online cleanly. The cause can be hardware, related to the power spike when it is turned off and on again, or software, such as kernel updates and similar that were never live because the server had not rebooted in some time. There are also file system checks to consider. We have been bringing individual servers online through both our monitoring and escalations from customers. Last I heard, fewer than a handful were having issues, and our DC team are in contact with those customers.

    Given this, knowing the dedicated server details will allow me to escalate to the correct team.

    Thanks, Richard.

  • Tony:

    I have 162 sites, all were down for more than 40 hours. This cost my clients tens of thousands of pounds and made many very angry with me. Ironic that my primary domain is the ONLY site still down, this means I cannot FTP (alter), move or redirect ANY of the other sites.

    Is this a deliberate block? My Solicitors would like to know if anyone else experienced the same?

  • Carl:

    Twelve days after the initial problem I’m still getting intermittent “Access to the path “c:\windows\microsoft.net\framework\v1.1.4322\Temporary ASP.NET Files\root\32a31cef\9b0d7332″ is denied” messages on my site (hosted by WebFusion). The most recent update from Customer Support was “Our engineers are aware of this problem and will correct this for you as soon as possible, however I cannot give a timescale for completion I’m afraid.”

  • Tomasz:

    Why is the term “as soon as possible” used in official explanations? It simply means “Whenever we want to” or “Maybe some time in the future…”. Ask them “Then when will this be possible?” and they usually come up with one of two answers: “I don’t know” or “as soon as possible”. When I first heard this explanation from Webfusion I thought they were joking. And why do they never know when something is going to be fixed? This clearly indicates that, in the case of Webfusion, it is a matter of luck rather than a professional process to be completed.
    Have you ever tried to manage a project without deadlines? It would be fun to use ASAP instead of a date and time. Director: “Why did we miss the deadline?” Manager: “Sir, it was not possible yet, but it will be soon.”
