Microsoft Blames Last Week’s Azure Outage on a Configuration Error - hollowayblighte76
A system configuration slip caused the outage that affected Windows Azure customers in western Europe last week, according to Microsoft.
As a resultant, the Microsoft state-supported befog application hosting and developing platform was unavailable for about 2 and a half hours on Thursday. Microsoft didn't say how many a customers were impacted.
At issue was a "condom valve" mechanism in the Azure network base configured to prevent cascading network failures. It does so aside capping the number of connections that network computer hardware devices take up.
"Prior to this incident, we added new capacity to the West Europe sub-part in response to multiplied demand. However, the limit in corresponding devices was not focused during the validation process to touch this modern capacity," wrote Mike Neil, Windows Chromatic imprecise manager, in a blog post.
A emergent rise in the contrived cluster's usage led to the "escape cock" threshold being exceeded, which generated a storm of network management alerts. "The increased management dealings in turn triggered bugs in both of the cluster's hardware devices, causing these to reach into 100% CPU utilization impacting data traffic," Neil wrote.
Squashing Bugs
At the time, Microsoft resolved the job by augmentative the stage-struck cluster's "safe valve" limits. To foreclose the situation from recurring, Microsoft is patching the identified bugs in the networking hardware devices, and it is as wel improving the meshwork monitoring systems, so that they can identify and address connectivity issues earlier they cause outages.
Forrester Research psychoanalyst James Staten said that PaaS (platform as a divine service) clouds such as Azure are very thickening and highly automated environments, and sometimes glitches crop up in production that crapper't be awaited in test environments. "This appears to be one of those cases," he said via email.
Over time as new features, greater use and another factors infix the equivalence, administrators have to consider steps to adjust and optimize the running system, and occasionally something testament break, He said.
"Should it be something clients should be implicated about? Not really. Information technology is an example of the kinds of things that can happen in a cloud environment. But far worse things are more grassroots in a typical enterprise data central," Staten said.
IT chiefs and developers planning to host applications in the cloud postulate to configure them and purpose them to be fault tolerant. "That is a of import shift in thinking most developers and enterprise operations teams need to sympathize when embarking on cloud deployments," he aforesaid.
"These types of outages are learning opportunities for some the cloud up admins and cloud customers. Rather than see these incidents as indictments of cloud, they should be seen as opportunities to better your use of the cloud," helium added.
Juan Carlos Perez covers enterprise communicating/collaboration suites, operational systems, browsers and world-wide engineering breaking news for The IDG News Service. Follow Juan on Twitter at @JuanCPerezIDG.
Source: https://www.pcworld.com/article/460418/microsoft_blames_last_weeks_azure_outage_on_a_configuration_error.html
Posted by: hollowayblighte76.blogspot.com
0 Response to "Microsoft Blames Last Week’s Azure Outage on a Configuration Error - hollowayblighte76"
Post a Comment