[Resolved] Emergency Network Maintenance
Posted: June 2, 2011 Filed under: Uncategorized
Date: 06-02-2011
Start time: 11:20
Services Affected: Washington, DC – Public Network
Event Summary:
Webbynode engineers, along with Cisco TAC, have identified a software bug on FCR01.WDC01 that is causing forwarding issues for public connectivity. Engineers will be performing an EMERGENCY code change on Jun 2nd, 2011 at 10:00 PM EDT to resolve this issue. The expected downtime is 20 minutes, and the maintenance window is scheduled for up to 4 hours.
Start Time: 10:00pm EDT (6/2/2011)
End Time: 02:00am EDT (6/3/2011)
Expected Duration: 20 minutes
Customer Impact:
During this maintenance, customers will notice a complete loss of connectivity to their servers on the frontend network (public network). Backend network (private network) connectivity will NOT be impacted during this maintenance. While the upgrade duration is scheduled for 4 hours, we only expect around 20 minutes of downtime as the code is changed. Again, this will NOT impact the backend network (private network) for customer servers.
Best,
Webbynode
UPDATE 11:00PM EST: Connectivity has been restored, but the maintenance window is still open. Please open a ticket if you still have issues.
[Resolved] Wash-04 Network Outage
Posted: May 24, 2011 Filed under: Uncategorized
We are investigating an outage in our Wash DC datacenter that is currently affecting the Wash-04 node. All other Wash DC nodes are up and working fine.
We will update this post as we learn more about the network status of Wash-04.
UPDATE 4:55 PM: Connectivity has been restored after rebooting the core switch that Wash-04 was connected to, followed by a node reboot.
If anyone is still seeing issues please submit a ticket.
** We are seeing network connectivity issues from our customers in Brazil to all datacenters. We aren’t sure what the cause is yet, but hold tight; connectivity should be restored soon.
If we find out what is causing the connectivity problem we will let everyone know.
[Resolved] Network Outage Miami DC
Posted: May 21, 2011 Filed under: Uncategorized
We are currently investigating a partial network outage that is affecting some of the nodes in the Miami datacenter.
UPDATE 12:55 PM: We have brought the network back online for all nodes that had lost connectivity. All nodes remained online during the network outage.
If anyone has any issues from the loss of connectivity please submit a ticket.
[Finished] Emergency maintenance on Wash-04
Posted: May 10, 2011 Filed under: System Status
We are experiencing an issue on Wash-04; the server will be coming back shortly. Updates will come as more information is available.
Update 10:55AM – Wash-04 is back online. All Webbies are coming back up. We hit a Xen kernel bug and had to restart the server.
[Resolved] DoS Attack Miami-B
Posted: May 7, 2011 Filed under: Uncategorized
We are working on mitigating an incoming attack in the Miami-B datacenter. Please wait for more updates as information becomes available.
Update 2:30 PM: The attack has been mitigated and we’re back online. All Webbies are online. If you are having any issues, please send a ticket.
[Resolved] Miami-B Datacenter Connectivity Issues
Posted: May 5, 2011 Filed under: System Status
We are experiencing a network issue in our Miami-DC. We are having intermittent network connectivity and packet loss.
We are working on it right now and will update this post as soon as more information is available.
UPDATE 6:15PM EST: We are still having issues with connectivity. We have all hands on deck working on the problem. More updates to follow.
UPDATE 6:30PM EST: We are still working on this issue. Please hang on tight. We are working as fast as we can to get this resolved.
UPDATE 7:00PM EST: The network is now up. All Webbies are also up, and did not go down. We are gathering more detailed information so we can update this post.
UPDATE 7:55PM EST: The initial report is that we experienced a massive inbound DoS attack, which seems to have been related to our primary uplink provider, Cogent, and arrived as a massive packet flood to our network. During this attack our core Cisco routers also locked up due to an unfixed Cisco module bug, which prevented our main uplink from failing over to our secondary, Level3. After we moved everything to Level3 we came back up, which leads us to believe that the packet flood was created directly by Cogent and caused a denial of service for us. We are now on Level3 while we wait for a report from Cogent, and we will update this blog post when that information becomes available.
[Resolved] Washington-03 Maintenance
Posted: May 5, 2011 Filed under: System Status
As of 12:30AM EST we are investigating some server lockups that have caused intermittent uptime. We are investigating whether a RAID controller issue was involved, and we had to reboot the server. The server is now up and operational; however, we are investigating further to resolve the kernel panics we are seeing. If the fix doesn’t solve the problem, we will swap the RAM on the server, even though the server reports no RAM errors.
If the server continues to give us issues, we will schedule a migration of every customer to a new server.
Note: As of 12:30 AM EST all Webbies are up and running. We will update this status post as we perform any reboots or changes.
Update: 3PM EST – We will be scheduling a memory swap on this server shortly. We will post the schedule here.
[Resolved] Wash-03 Maintenance
Posted: May 4, 2011 Filed under: Uncategorized
We are performing some maintenance on Wash-03 at the moment. It should be back up shortly. More to follow.
## UPDATE ##
18:56 EST
All Webbies are now back online after a system reboot.
## UPDATE ##
19:08 EST
If anyone is experiencing any problems, please submit a support ticket.
[Resolved] Washington DC Network outages
Posted: May 2, 2011 Filed under: Uncategorized
We are currently working through some network issues in our Washington DC datacenter. We are experiencing intermittent connectivity and some packet loss.
More updates to follow as we find out what’s happening from our network engineers.
** UPDATE **
The datacenter is reporting a DoS attack. They are currently mitigating it, and the network should be stable again soon.
We received a large DoS attack against the Wash-01 datacenter that caused increased latency and packet loss for some customer servers beginning at 8:40AM CDT (13h40 UTC) 2011-MAY-02. The victim IPs were null routed and the issue was mitigated by 8:57AM CDT (13h57 UTC) 2011-MAY-02. All services have returned to normal at this time.
[Resolved] Wash-03 Maintenance
Posted: April 28, 2011 Filed under: Uncategorized
Hello,
We experienced a hardware issue with the RAID on Wash-03. The server is up and running now. However, we are investigating the matter closely and will update this post as more information becomes available.