[Finished] Emergency maintenance on Wash-04

We are experiencing an issue on Wash-04. The server will be back shortly. We will post updates as more information becomes available.

Update 10:55AM – Wash-04 is back online. All Webbies are coming back up. We hit a Xen kernel bug and had to restart the server.


[Resolved] Miami-B Datacenter Connectivity Issues

We are experiencing a network issue in our Miami-DC. We are having intermittent network connectivity and packet loss.

We are working on it right now and will update this post as soon as more information is available.

UPDATE 6:15PM EST: We are still having issues with connectivity. We have all hands on deck working on the problem. More updates to follow.

UPDATE 6:30PM EST: We are still working on this issue. Please hang on tight. We are working as fast as we can to get this resolved.

UPDATE 7:00PM EST: The network is now up. All Webbies are also up, and did not go down. We are gathering more detailed information so we can update this post.

UPDATE 7:55PM EST: The initial report is that we experienced a massive inbound DoS attack, a packet flood to our network that seems to have been related to our primary uplink provider, Cogent. During this attack our core Cisco routers also locked up due to an unfixed Cisco module bug, which prevented our main uplink from failing over to our secondary, Level3. After we moved everything to Level3 we came back up, which leads us to believe that the packet flood was created directly by Cogent and caused a denial of service for us. We are now on Level3 while we wait for a report from Cogent, and we will update this blog post when that information becomes available.


[Resolved] Washington-03 Maintenance

As of 12:30AM EST we are investigating server lockups that have caused intermittent downtime. We are investigating whether we had a RAID controller issue, and we had to reboot the server. The server is currently up and operational, but we are continuing to investigate the kernel panics we are receiving. If the fix doesn’t solve the problem, we will swap the RAM on the server even though the server reports no RAM errors.

If the server continues to give us more issues, we will then schedule a migration of every single customer to a new server.

Note: As of 12:30 AM EST All Webbies are up and running. We will update the Status post as we perform any reboots or changes.

Update: 3PM EST – We will be scheduling a memory swap on this server shortly. We will post the schedule here.


Emergency Miami-B Server Reboots.

We have finished migrating all nodes in the Miami-B site onto the new power grid. Each node was turned off, had its circuits moved over, and was started back up. All Webbies should be up. Please get in touch with Support via the Webby Manager if your Webby is still down.

Sorry for this hassle, but we have to take this precaution in order to avoid further power issues.

Thanks,

Carlos.


Miami-b08 Emergency Maintenance

We are currently performing emergency maintenance on miami-b08. At this time we are working on moving everyone off miami-b08 to a new node. All Webbies are safe, and we are taking all the necessary steps to ensure no data loss.


[COMPLETED] Manager Maintenance

Manager is down for maintenance. We will post back with progress or a completion update.

Spoke too soon. Maintenance was completed early.

Please let us know if you are having any issues.


[Resolved] Datacenter Network Failure

We’re currently experiencing a network failure within our datacenter. We have technicians on site checking it out, so please stand by. We’ll be updating this blog post.

Our datacenter had a power surge. All Webbies are getting back online. Please open a ticket in case you’re still having problems.

UPDATE 5:07PM EST: Only Webbies on nodes miami-b06 and miami-b11 are still down. Webby Manager is still down as well. We have all hands on deck working to fix this.

UPDATE 5:17PM EST: miami-b11 is back up. Working on getting miami-b06 back up.

UPDATE 6:10PM EST: Webby Manager is running again. We are having problems with the host system on miami-b06, so please bear with us for a little longer.

UPDATE 8:20PM EST: We have just finished verifying all data on miami-b06 and all Webbies are now starting to come back online. Please open a ticket if you are still experiencing any issues.


[RESOLVED] Connectivity problems on 67.23.79.* IP range

[0:30AM] We’re having connectivity issues with Webbies in the 67.23.79.* range. We have all hands on deck working to fix the problem.

[0:45AM] It was a Level3 connectivity problem, which is now resolved.

All Webbies are up and running.


Level3 Connectivity

[07:30PM EST] There was some packet loss at Atlanta’s Level 3 router. A ticket was opened and the problem is now solved. All Webbies remained up during the network outage.


DC Power Supply problem

[1PM EST] Our DC had a UPS overload that caused a reboot of nodes miami-b01, b02, b03, b12, b13, b15 and b16. All Webbies on those nodes were rebooted. Approximate downtime was 7 minutes and all Webbies are back up.

Please open a ticket if you are still experiencing problems.

