Facebook Outage Reminds Us Why Networks Need Cloud
October 5, 2021 - Julia Hogarty
Julia Hogarty, Product Marketing Manager, talks about the worldwide Facebook outage and the dependency on data centres to juggle network load.
Across Monday and Tuesday, depending on where you are in the world, Facebook and its platform suite (including Instagram, Messenger and WhatsApp) were hit by a global outage lasting almost six hours. In a statement, Facebook attributed the failure to ‘a configuration change to the backbone routers that coordinate network traffic between the company’s data centres’ which had a ‘cascading effect on the way our data centres communicate, bringing our services to a halt.’ To be clear, this meant all services run by Facebook were down for an unprecedented length of time. Some might have seen the idea of many of the world’s social media platforms being out of action as a welcome relief, but many did not.
Dependency on data centres to juggle network load has long been an issue for service providers. This is primarily due to the considerable cost of deploying and managing hardware in a data centre environment. Another big challenge arises when demand for data-rich services fluctuates and forces a spike in traffic: physical data centres have traditionally struggled to provide the scaling flexibility needed to absorb significant load anomalies. Geo-redundancy became important because it allows services to be restored quickly after an unplanned disruption by initiating a failover to a back-up data centre. Over time this capability became more intelligent through automation, but the risk of an outage is still very real, as Facebook will attest.
The move to the cloud has transformed how services are delivered. The cloudification of the network has heralded the dawn of self-managing networks, and the fear of service outages is becoming less acute than it once was. A network fully hosted in the cloud allows traffic loads to rise and fall freely while the underlying infrastructure adapts dynamically to demand. Should there be a spike in traffic, the network can allocate additional resources to withstand the period of higher demand on the service. Then, once the traffic subsides, the network can scale down again to save costs. This ability to instantiate network resources dynamically in real time is made possible by the cloud and its inherent flexibility. It allows for the high availability needed to deliver guaranteed continuity of service and remain competitive, while also saving on Capex outlay.
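To make the scale-up and scale-down behaviour concrete, here is a minimal Python sketch of a threshold-style scaling decision. It is an illustration only, not how any particular cloud platform or vendor product works: the function name `desired_instances`, the per-instance capacity, the target utilisation and the instance limits are all hypothetical values chosen for the example.

```python
import math


def desired_instances(load_rps, capacity_per_instance=100.0,
                      target_utilisation=0.7,
                      min_instances=2, max_instances=20):
    """Return how many instances are needed to serve `load_rps`.

    All parameters are illustrative assumptions: each instance is taken
    to handle `capacity_per_instance` requests per second, and we aim to
    keep instances at or below `target_utilisation` of that capacity.
    """
    if load_rps < 0:
        raise ValueError("load_rps must be non-negative")
    usable_capacity = capacity_per_instance * target_utilisation
    needed = math.ceil(load_rps / usable_capacity)
    # Never drop below the floor (kept for availability) and never
    # exceed the ceiling (kept to cap spend).
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    # Hypothetical traffic samples: quiet, spike, then quiet again.
    traffic = [120, 150, 900, 1330, 600, 130]
    for load in traffic:
        print(f"load={load:>5} rps -> instances={desired_instances(load)}")
```

Run against the sample traffic, the instance count climbs during the spike and falls back to the minimum once the load subsides, which is the cost-saving behaviour described above. The minimum of two instances stands in for a basic availability floor.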
As service providers look to roll out their 5G networks, automation is becoming central to the discussion. With 5G volumes, the proactive management of data across the network will be even more critical. The Network Data Analytics Function (NWDAF) is set to be key to realising true network automation, enabling the flexibility of the cloudified network to reach its full potential by automatically adapting the shape of the network in response to predictive load modelling. That is a long way from contending with a data centre falling over, and it is not that far away.
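The idea of shaping the network ahead of demand, rather than reacting to it, can be sketched in a few lines of Python. This is a deliberately naive illustration and does not represent the NWDAF interfaces or any real analytics model: the forecast simply extends the most recent trend by one step, and the names `forecast_next_load` and `instances_for`, along with the capacity figure, are hypothetical.

```python
import math


def forecast_next_load(history):
    """Naive one-step forecast: last sample plus the most recent change.

    A real analytics function would use far richer models; this linear
    extrapolation is only an assumption made for the example.
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    if len(history) == 1:
        return float(history[-1])
    trend = history[-1] - history[-2]
    # Load cannot be negative, so clamp the forecast at zero.
    return max(0.0, float(history[-1] + trend))


def instances_for(load, capacity_per_instance=100, min_instances=2):
    """Instances needed for `load`, using an assumed per-instance capacity."""
    return max(min_instances, math.ceil(load / capacity_per_instance))


if __name__ == "__main__":
    history = [200, 260, 340]  # hypothetical load samples, oldest first
    predicted = forecast_next_load(history)
    print(f"predicted load: {predicted}")
    print(f"provision ahead of time: {instances_for(predicted)} instances")
```

The point of the sketch is the ordering: capacity is provisioned from the predicted load before the traffic arrives, instead of after a threshold has already been breached.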